
refactor: move spamoor benchmark into testify suite#3107

Merged
chatton merged 11 commits into main from cian/bench-refactor on Mar 2, 2026

Conversation

@chatton
Contributor

@chatton chatton commented Feb 25, 2026

Overview

Introduces a benchmarking package within the e2e dir: a test suite to which we can add additional spamoor tests.

Summary by CodeRabbit

  • Tests

    • Added comprehensive end-to-end benchmark test suite for load testing and performance measurement with tracing integration.
    • Reorganized benchmark tests into dedicated test directory with enhanced metrics collection and output utilities.
  • Refactor

    • Refactored shared test utilities into public APIs for broader reuse across test suites.
    • Updated test infrastructure to support trace span extraction and validation.
  • Chores

    • Updated CI/CD workflow to execute benchmarks on relevant code changes with artifact management for benchmark data.

…mark

- Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
- Move spamoor smoke test into suite as TestSpamoorSmoke
- Split helpers into focused files: traces.go, output.go, metrics.go
- Introduce resultWriter for defer-based benchmark JSON output
- Export shared symbols from evm_test_common.go for cross-package use
- Restructure CI to fan-out benchmark jobs and fan-in publishing
- Run benchmarks on PRs only when benchmark-related files change
Resolve conflicts keeping the benchmark suite refactoring:
- benchmark.yml: keep path filters and suite-style test command
- evm_spamoor_smoke_test.go: keep deleted (moved to benchmark pkg)
- evm_test_common.go: keep exported types, drop writeTraceBenchmarkJSON
  (now in benchmark/output.go)
@coderabbitai
Contributor

coderabbitai bot commented Feb 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


This pull request reorganizes and refactors the Spamoor smoke test from a standalone file into a dedicated test/e2e/benchmark/ directory structure. It exports previously private test helper functions and types from evm_test_common.go, adds new benchmarking utilities for trace parsing and result output, updates the CI workflow with path-based triggers and adjusted artifact paths, and deletes the legacy test file while moving its functionality into the new modular structure.

Changes

Cohort / File(s) Summary
CI Workflow Updates
.github/workflows/benchmark.yml
Added path filters for PR triggers, updated artifact paths from test/e2e/ to test/e2e/benchmark/, adjusted Spamoor test invocation to include both suite and smoke variants, and modified working directories for benchmark execution.
New Benchmark Test Suite Infrastructure
test/e2e/benchmark/suite_test.go
Introduced SpamoorSuite test suite with Docker-based environment setup, including Jaeger tracing, reth node with OTLP, sequencer, and Spamoor node; added SetupTest() lifecycle and collectServiceTraces() helper for retrieving traces from services.
New Benchmark Test Implementation
test/e2e/benchmark/spamoor_smoke_test.go
Added TestSpamoorSmoke() test method that configures spammers (EOATX and GasBurnerTX), collects metrics and traces from ev-node-smoke and ev-reth, validates expected spans, and asserts on transaction outcomes.
New Benchmark Helper Utilities
test/e2e/benchmark/metrics.go, test/e2e/benchmark/output.go, test/e2e/benchmark/traces.go
Added polling and host readiness helpers, Prometheus counter aggregation, JSON benchmark result writing, Jaeger span extraction/conversion, and span name/duration validation utilities for e2e benchmarks.
Test Common API Exports
test/e2e/evm_test_common.go
Exported previously private functions (setupSequencerNode → SetupSequencerNode, setupCommonEVMEnv → SetupCommonEVMEnv, aggregateSpanStats → AggregateSpanStats, printTraceReport → PrintTraceReport) and types (traceSpan → TraceSpan, spanStats → SpanStats, with renamed fields); removed legacy benchmarkEntry and writeTraceBenchmarkJSON.
Test File API Updates
test/e2e/evm_contract_bench_test.go, test/e2e/evm_force_inclusion_e2e_test.go, test/e2e/evm_full_node_e2e_test.go, test/e2e/failover_e2e_test.go
Updated function calls to use exported versions of setup helpers (SetupCommonEVMEnv, SetupSequencerNode) and updated interface references to renamed TraceSpan type.
Removed Legacy Test
test/e2e/evm_spamoor_smoke_test.go
Deleted entire file (287 lines) containing original Spamoor smoke test, helper types (jaegerSpan), and utility functions; functionality migrated to new test/e2e/benchmark/ structure.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

T:testing, T:code-hygiene

Suggested reviewers

  • alpe
  • auricom
  • julienrbrt

Poem

🐰 Tests hop to a new home so neat,
With traces and benchmarks all set to compete,
In benchmark/ they gather and shine,
Public APIs exported, refactored so fine,
Spamoor smoke dances to a rhythm divine! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Docstring Coverage — ⚠️ Warning: docstring coverage is 78.26%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
Description check — ❓ Inconclusive: the description provides a brief overview but lacks detail about implementation, changes made, and specific components added. It references adding a benchmarking package and testify suite but does not fully explain the scope of modifications. Resolution: expand the description to clarify what components were added/refactored, why the testify suite improves the structure, and how this facilitates adding future tests.
✅ Passed checks (1 passed)
Title check — ✅ Passed: the title accurately captures the main change: moving the Spamoor benchmark test into a testify suite structure within a new benchmarking package.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@github-actions
Contributor

github-actions bot commented Feb 25, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed · Format: ⏩ skipped · Lint: ✅ passed · Breaking: ✅ passed · Updated (UTC): Mar 2, 2026, 10:21 AM

@claude
Contributor

claude bot commented Feb 25, 2026

Claude finished @chatton's task in 2m 46s


Code Review: refactor: move spamoor benchmark into testify suite

  • Read changed files
  • Review code quality & architecture
  • Review error handling & resilience
  • Review security
  • Review performance & resource efficiency
  • Review testing & reliability
  • Review documentation
  • Post final review

Overall this is a clean refactor that improves the benchmark structure meaningfully. The testify suite approach, isolated Docker environments per test, and proper cleanup are all good patterns. A few issues worth addressing before merge.


Medium: t.Context() used inside t.Cleanup — context already cancelled

suite_test.go:67 and suite_test.go:115:

t.Cleanup(func() { _ = jg.Remove(t.Context()) })   // line 67
t.Cleanup(func() { _ = spNode.Remove(t.Context()) }) // line 115

t.Context() is cancelled when the test function returns. Cleanup functions run after the test function returns, so the context passed to Remove is already done. Docker client calls receiving a cancelled context may fail immediately and silently (the _ discards the error). Use context.Background() with a short timeout instead:

t.Cleanup(func() {
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()
    _ = jg.Remove(ctx)
})



Minor: Doc comments still reference old unexported names

evm_test_common.go:306: // setupSequencerNode initializes... → should be // SetupSequencerNode initializes...

evm_test_common.go:532: // SetupCommonEVMEnv creates... is correct ✓ (this one was already fixed).

The comment block at line 306 starts with the old unexported name. Per Go conventions the first word of a doc comment must be the exported symbol name. This will cause golangci-lint (godot/revive) to flag it.



Minor: Mixed-service span aggregation loses per-service attribution

spamoor_smoke_test.go:90:

w.addSpans(append(evNodeSpans, evRethSpans...))

addSpans groups by span name via AggregateSpanStats. If both services happen to emit a span with the same name (e.g. execute_tx) their timings are merged into one bucket, making it impossible to tell which service is slower. Both services happen to have distinct span names right now, but this is fragile as more spans are added.

Consider calling addSpans separately per service, or label spans with their source. Alternatively, document the assumption that span names are unique across services.


Positive: Things done well

  • output.go:66-81: The flush() deferred method correctly uses t.Logf for soft errors instead of require.NoError, which would call runtime.Goexit from a deferred function after test completion. This is the right approach.

  • metrics.go:19: Correctly adopted require.Eventually (addressing the earlier CodeRabbit nitpick) — clean and cancellable.

  • Test isolation: Each SetupTest() call creates a fresh Docker client and network ID, preventing test interference. Good pattern.

  • Closure variable capture: spamoor_smoke_test.go:70-72 correctly shadows idToDelete to avoid the classic loop-variable closure bug.

  • Context in collectServiceTraces: Uses an explicit context.WithTimeout with 3-minute bound on a method that could block on network I/O — good practice.

  • .just/test.just: The grep -v /benchmark exclusion of benchmark tests from test-e2e is a good guard to avoid running expensive Docker benchmarks in the regular e2e suite.

  • CI workflow path filters: The block/internal/executing/** path addition ensures block executor benchmark changes still trigger the workflow.


FYI: Acknowledged issues (no action needed)

  • time.Sleep(60s) in the smoke test — acceptable given the confirmed need to let Spamoor generate traffic before querying Jaeger.
  • rethTag = "pr-140" with EV_RETH_TAG env override and TODO comment — clear enough for a transitional state.
  • OTLP endpoint configuration (OTEL_EXPORTER_OTLP_ENDPOINT with /v1/traces suffix) — author confirmed manual testing with both reth versions, so the non-spec-compliant-but-working config is acceptable for now.

defer w.flush()

e := s.setupEnv(config{
rethTag: "pr-140",
Contributor Author

this is the PR the traces were added in; we can change this to a proper tag in a follow-up.

@chatton chatton marked this pull request as ready for review February 25, 2026 12:30
pull_request:
branches:
- main
paths:
Contributor Author

@chatton chatton Feb 25, 2026

only run benchmark tests for PRs when relevant files are changed.

go test sets the working directory to the package under test, so the
env var should be relative to test/e2e/benchmark/, not test/e2e/.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
test/e2e/benchmark/spamoor_smoke_test.go (1)

20-23: ⚠️ Potential issue | 🟠 Major

Hardcoded rethTag: "pr-140" will become stale and cause opaque failures.

This tag refers to a specific container image built for a PR. Once that image is cleaned up or the tag is overwritten, the test will silently fail to start or pull the container with a confusing error. Expose this via an environment variable or CLI flag so it can be updated without a code change.

🔧 Suggested refactor
+rethTagFromEnv := os.Getenv("BENCH_RETH_TAG")
+if rethTagFromEnv == "" {
+    rethTagFromEnv = "pr-140"
+}
 e := s.setupEnv(config{
-    rethTag:     "pr-140",
+    rethTag:     rethTagFromEnv,
     serviceName: "ev-node-smoke",
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 20 - 23, Replace the
hardcoded rethTag value ("pr-140") in the test setup by reading it from
configuration (environment variable or test CLI flag) so tests don't break when
image tags change: update the call site where s.setupEnv(config{ rethTag:
"pr-140", ... }) is used in spamoor_smoke_test.go to instead pull the tag from
an env var (eg. os.Getenv("RETH_TAG")) or a provided flag, and adjust the config
struct and any test bootstrap that constructs config to accept and pass that
dynamic value (look for the config type, its rethTag field, and the setupEnv
function to make the change).
🧹 Nitpick comments (5)
test/e2e/benchmark/metrics.go (1)

15-34: Consider accepting a context.Context for cancellable polling.

requireHTTP drives a blocking polling loop with no way to interrupt it if the parent test is cancelled mid-flight (e.g., via t.Context() deadline or a FailNow from another goroutine). For long timeouts this can leave dangling goroutines/blocked tests.

🔧 Suggested refactor
-func requireHTTP(t testing.TB, url string, timeout time.Duration) {
+func requireHTTP(t testing.TB, ctx context.Context, url string, timeout time.Duration) {
 	t.Helper()
 	client := &http.Client{Timeout: 200 * time.Millisecond}
-	deadline := time.Now().Add(timeout)
+	ctx, cancel := context.WithTimeout(ctx, timeout)
+	defer cancel()
 	var lastErr error
-	for time.Now().Before(deadline) {
+	ticker := time.NewTicker(100 * time.Millisecond)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			t.Fatalf("daemon not ready at %s: %v", url, lastErr)
+			return
+		case <-ticker.C:
+		}
 		resp, err := client.Get(url)
 		if err == nil {
 			_ = resp.Body.Close()
 			if resp.StatusCode >= 200 && resp.StatusCode < 300 {
 				return
 			}
 			lastErr = fmt.Errorf("status %d", resp.StatusCode)
 		} else {
 			lastErr = err
 		}
-		time.Sleep(100 * time.Millisecond)
 	}
-	t.Fatalf("daemon not ready at %s: %v", url, lastErr)
 }

As per coding guidelines, "Use context.Context for cancellation in Go."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/metrics.go` around lines 15 - 34, The requireHTTP function
currently blocks and can't be cancelled; change its signature to accept a
context.Context (e.g., requireHTTP(ctx context.Context, t testing.TB, url
string, timeout time.Duration) or use t.Context() inside) and use that context
for cancellation inside the polling loop: create requests with
http.NewRequestWithContext or attach ctx to the client.Do call, replace
time.Sleep with a select on ctx.Done() and time.After(100*time.Millisecond),
check ctx.Err() and call t.Fatalf when the context is cancelled, and ensure you
still honor the timeout logic and preserve the lastErr handling in the loop.
test/e2e/benchmark/spamoor_smoke_test.go (1)

68-70: Fixed 60s time.Sleep is fragile for trace accumulation.

On fast machines traces may be ready much sooner; on slow CI runners traces may not be ready even after 60s. Consider polling the Jaeger API with a timeout instead of an unconditional sleep, making the test both faster in the happy path and more robust in the slow path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 68 - 70, Replace the
fixed time.Sleep(60 * time.Second) wait with a polling loop that queries the
Jaeger API until the expected traces/spans from spamoor appear or a context
deadline elapses; implement this by replacing the direct call to time.Sleep with
a function (e.g., waitForTraces(ctx, serviceName, expectedSpanCount)) that uses
a context with timeout, exponential/backoff retries, and HTTP calls to the
Jaeger query endpoint, returning early when traces are observed and failing the
test only if the timeout is reached—update the test to call waitForTraces
instead of time.Sleep and use the existing service identifier used in the test
to locate traces.
test/e2e/benchmark/traces.go (1)

23-47: Loop variable t shadows the conventional test-helper name.

Using t as a range variable in extractSpansFromTraces (which imports "testing") is not a bug but is slightly confusing, particularly for readers accustomed to t meaning testing.TB. A more descriptive name like traceEntry or raw would improve clarity.

🔧 Suggested refactor
-	for _, t := range traces {
-		traceMap, ok := t.(map[string]any)
+	for _, traceEntry := range traces {
+		traceMap, ok := traceEntry.(map[string]any)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/traces.go` around lines 23 - 47, In
extractSpansFromTraces, rename the range variable t to a clearer name (e.g.,
traceEntry or rawTrace) to avoid shadowing the conventional testing variable t;
update all uses of t in that function (the outer loop in extractSpansFromTraces
and its casts to traceMap) so the logic remains unchanged but the variable is
more descriptive and less confusing to readers familiar with testing.TB.
test/e2e/benchmark/output.go (1)

67-76: require.NoError inside a deferred flush will call runtime.Goexit on failure, which is safe but worth documenting.

When flush is called via defer after a test has already failed, require.NoError → FailNow → runtime.Goexit is invoked from within the deferred function, which Go handles correctly (deferred calls complete). No behavioral bug here, but the interaction of require with an os.WriteFile error being silently suppressed could be surprising. Consider logging the error path rather than asserting fatally so benchmark output failures don't obscure the real test failure.

🔧 Suggested refactor
 func (w *resultWriter) flush() {
 	outputPath := os.Getenv("BENCH_JSON_OUTPUT")
 	if outputPath == "" || len(w.entries) == 0 {
 		return
 	}
 
 	data, err := json.MarshalIndent(w.entries, "", "  ")
-	require.NoError(w.t, err, "failed to marshal benchmark JSON")
-	require.NoError(w.t, os.WriteFile(outputPath, data, 0644), "failed to write benchmark JSON to %s", outputPath)
+	if err != nil {
+		w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
+		return
+	}
+	if err := os.WriteFile(outputPath, data, 0644); err != nil {
+		w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
+		return
+	}
 	w.t.Logf("wrote %d benchmark entries to %s", len(w.entries), outputPath)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/benchmark/output.go` around lines 67 - 76, In resultWriter.flush,
avoid using require.NoError (which calls FailNow/Goexit when deferred) for the
json.MarshalIndent and os.WriteFile errors; instead check the error values from
json.MarshalIndent and os.WriteFile and log them non-fatally (e.g., use
w.t.Errorf or w.t.Logf with descriptive messages referencing outputPath and the
error) so flush reports failures without invoking runtime.Goexit from a deferred
context; update the calls that currently use require.NoError(w.t, ...) to
explicit if err != nil { w.t.Errorf("...: %v", err); return } or similar logging
behavior for both the marshal and write steps in resultWriter.flush.
test/e2e/evm_test_common.go (1)

532-534: Doc comment references the old unexported name.

Line 532 says setupCommonEVMEnv but the function is now exported as SetupCommonEVMEnv. Per Go convention, the doc comment should start with the exported identifier name.

Proposed fix
-// setupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
+// SetupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/evm_test_common.go` around lines 532 - 534, The doc comment
currently starts with the old unexported name; update the comment to begin with
the exported function name SetupCommonEVMEnv so it follows Go doc conventions
(e.g., change the leading sentence from "setupCommonEVMEnv creates..." to
"SetupCommonEVMEnv creates..."), leaving the rest of the description intact and
ensuring the comment now documents the exported SetupCommonEVMEnv function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/benchmark.yml:
- Around line 11-16: Update the PR path filter in the GitHub Actions workflow to
include the block executor benchmark sources so changes to that package trigger
the workflow; specifically add the path pattern 'block/internal/executing/**'
(or more broadly 'block/**' if preferred) to the existing paths array in
.github/workflows/benchmark.yml so the job that runs `go test
./block/internal/executing/...` will trigger on relevant PRs.

In `@test/e2e/benchmark/suite_test.go`:
- Around line 73-78: The OTLP env setup in the b.WithTag(...).WithEnv call is
wrong: change the value for "OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest
URL (remove the "/v1/traces" suffix) so the SDK can append per-signal paths,
keep "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).

---

Duplicate comments:
In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 20-23: Replace the hardcoded rethTag value ("pr-140") in the test
setup by reading it from configuration (environment variable or test CLI flag)
so tests don't break when image tags change: update the call site where
s.setupEnv(config{ rethTag: "pr-140", ... }) is used in spamoor_smoke_test.go to
instead pull the tag from an env var (eg. os.Getenv("RETH_TAG")) or a provided
flag, and adjust the config struct and any test bootstrap that constructs config
to accept and pass that dynamic value (look for the config type, its rethTag
field, and the setupEnv function to make the change).

---

Nitpick comments:
In `@test/e2e/benchmark/metrics.go`:
- Around line 15-34: The requireHTTP function currently blocks and can't be
cancelled; change its signature to accept a context.Context (e.g.,
requireHTTP(ctx context.Context, t testing.TB, url string, timeout
time.Duration) or use t.Context() inside) and use that context for cancellation
inside the polling loop: create requests with http.NewRequestWithContext or
attach ctx to the client.Do call, replace time.Sleep with a select on ctx.Done()
and time.After(100*time.Millisecond), check ctx.Err() and call t.Fatalf when the
context is cancelled, and ensure you still honor the timeout logic and preserve
the lastErr handling in the loop.

In `@test/e2e/benchmark/output.go`:
- Around line 67-76: In resultWriter.flush, avoid using require.NoError (which
calls FailNow/Goexit when deferred) for the json.MarshalIndent and os.WriteFile
errors; instead check the error values from json.MarshalIndent and os.WriteFile
and log them non-fatally (e.g., use w.t.Errorf or w.t.Logf with descriptive
messages referencing outputPath and the error) so flush reports failures without
invoking runtime.Goexit from a deferred context; update the calls that currently
use require.NoError(w.t, ...) to explicit if err != nil { w.t.Errorf("...: %v",
err); return } or similar logging behavior for both the marshal and write steps
in resultWriter.flush.

In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 68-70: Replace the fixed time.Sleep(60 * time.Second) wait with a
polling loop that queries the Jaeger API until the expected traces/spans from
spamoor appear or a context deadline elapses; implement this by replacing the
direct call to time.Sleep with a function (e.g., waitForTraces(ctx, serviceName,
expectedSpanCount)) that uses a context with timeout, exponential/backoff
retries, and HTTP calls to the Jaeger query endpoint, returning early when
traces are observed and failing the test only if the timeout is reached—update
the test to call waitForTraces instead of time.Sleep and use the existing
service identifier used in the test to locate traces.

In `@test/e2e/benchmark/traces.go`:
- Around line 23-47: In extractSpansFromTraces, rename the range variable t to a
clearer name (e.g., traceEntry or rawTrace) to avoid shadowing the conventional
testing variable t; update all uses of t in that function (the outer loop in
extractSpansFromTraces and its casts to traceMap) so the logic remains unchanged
but the variable is more descriptive and less confusing to readers familiar with
testing.TB.

In `@test/e2e/evm_test_common.go`:
- Around line 532-534: The doc comment currently starts with the old unexported
name; update the comment to begin with the exported function name
SetupCommonEVMEnv so it follows Go doc conventions (e.g., change the leading
sentence from "setupCommonEVMEnv creates..." to "SetupCommonEVMEnv creates..."),
leaving the rest of the description intact and ensuring the comment now
documents the exported SetupCommonEVMEnv function.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 212ac08 and cc56590.

📒 Files selected for processing (12)
  • .github/workflows/benchmark.yml
  • test/e2e/benchmark/metrics.go
  • test/e2e/benchmark/output.go
  • test/e2e/benchmark/spamoor_smoke_test.go
  • test/e2e/benchmark/suite_test.go
  • test/e2e/benchmark/traces.go
  • test/e2e/evm_contract_bench_test.go
  • test/e2e/evm_force_inclusion_e2e_test.go
  • test/e2e/evm_full_node_e2e_test.go
  • test/e2e/evm_spamoor_smoke_test.go
  • test/e2e/evm_test_common.go
  • test/e2e/failover_e2e_test.go
💤 Files with no reviewable changes (1)
  • test/e2e/evm_spamoor_smoke_test.go

go test treats all arguments after an unknown flag (--evm-binary) as
test binary args, so ./benchmark/ was never recognized as a package
pattern.
@chatton chatton marked this pull request as draft February 25, 2026 13:44
go test sets the cwd to the package directory (test/e2e/benchmark/),
so the binary path needs an extra parent traversal.
Contributor

@github-actions github-actions bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

Benchmark suite (Current: 8752fee, Previous: f73a124):
SpamoorSmoke - Executor.SetFinal (avg) — current 2078.176646706587 us, previous 1257 us, ratio 1.65

This comment was automatically generated by workflow using github-action-benchmark.

The benchmark package doesn't define the --binary flag that test-e2e
passes. It has its own CI workflow so it doesn't need to run here.
@codecov

codecov bot commented Feb 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.95%. Comparing base (f73a124) to head (8752fee).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3107      +/-   ##
==========================================
+ Coverage   60.90%   60.95%   +0.05%     
==========================================
  Files         113      113              
  Lines       11617    11617              
==========================================
+ Hits         7075     7081       +6     
+ Misses       3743     3737       -6     
  Partials      799      799              
Flag: combined — Coverage: 60.95% <ø> (+0.05% ⬆️)

Flags with carried forward coverage won't be shown. Click here to find out more.


@chatton chatton marked this pull request as ready for review February 26, 2026 09:27
Contributor

@alpe alpe left a comment

Nice work. 👍
utack

client := &http.Client{Timeout: 200 * time.Millisecond}
deadline := time.Now().Add(timeout)
var lastErr error
for time.Now().Before(deadline) {
Contributor

personal preference: require.Eventually()

)

// requireHTTP polls a URL until it returns a 2xx status code or the timeout expires.
func requireHTTP(t testing.TB, url string, timeout time.Duration) {
Contributor

nit: naming is hard. how about requireHostUp

chatton added 2 commits March 2, 2026 09:36
- make reth tag configurable via EV_RETH_TAG env var (default pr-140)
- fix OTLP config: remove duplicate env vars, use http/protobuf protocol
- use require.Eventually for host readiness polling
- rename requireHTTP to requireHostUp
- use non-fatal logging in resultWriter.flush deferred context
- fix stale doc comment (setupCommonEVMEnv -> SetupCommonEVMEnv)
- rename loop variable to avoid shadowing testing.TB convention
- add block/internal/executing/** to CI path trigger
- remove unused require import from output.go
# Conflicts:
#	scripts/test.mk
Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
test/e2e/evm_test_common.go (1)

306-318: Align the exported function comment with SetupSequencerNode.

Line 306 still starts with setupSequencerNode while the exported symbol at Line 318 is SetupSequencerNode. Please update the comment prefix to match the exported name for doc/lint consistency.

As per coding guidelines: **/*.go: Document exported types and functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/evm_test_common.go` around lines 306 - 318, The function doc comment
currently starts with the unexported name `setupSequencerNode` but the exported
function is `SetupSequencerNode`; update the comment's first line to begin with
`SetupSequencerNode` so the doc comment matches the exported symbol and
satisfies the Go linter rules for exported functions (refer to the comment block
above the `SetupSequencerNode` function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/e2e/evm_test_common.go`:
- Around line 306-318: The function doc comment currently starts with the
unexported name `setupSequencerNode` but the exported function is
`SetupSequencerNode`; update the comment's first line to begin with
`SetupSequencerNode` so the doc comment matches the exported symbol and
satisfies the Go linter rules for exported functions (refer to the comment block
above the `SetupSequencerNode` function).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85c9d2d and 1c3b560.

📒 Files selected for processing (8)
  • .github/workflows/benchmark.yml
  • test/e2e/benchmark/metrics.go
  • test/e2e/benchmark/output.go
  • test/e2e/benchmark/spamoor_smoke_test.go
  • test/e2e/benchmark/suite_test.go
  • test/e2e/benchmark/traces.go
  • test/e2e/evm_contract_bench_test.go
  • test/e2e/evm_test_common.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/e2e/benchmark/spamoor_smoke_test.go
  • test/e2e/benchmark/suite_test.go
  • test/e2e/benchmark/metrics.go

@chatton chatton added this pull request to the merge queue Mar 2, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 2, 2026
@chatton chatton added this pull request to the merge queue Mar 2, 2026
Merged via the queue into main with commit fe99227 Mar 2, 2026
50 of 51 checks passed
@chatton chatton deleted the cian/bench-refactor branch March 2, 2026 11:50