refactor: move spamoor benchmark into testify suite #3107
Conversation
…mark
- Create test/e2e/benchmark/ subpackage with SpamoorSuite (testify/suite)
- Move spamoor smoke test into suite as TestSpamoorSmoke
- Split helpers into focused files: traces.go, output.go, metrics.go
- Introduce resultWriter for defer-based benchmark JSON output
- Export shared symbols from evm_test_common.go for cross-package use
- Restructure CI to fan-out benchmark jobs and fan-in publishing
- Run benchmarks on PRs only when benchmark-related files change
Resolve conflicts keeping the benchmark suite refactoring:
- benchmark.yml: keep path filters and suite-style test command
- evm_spamoor_smoke_test.go: keep deleted (moved to benchmark pkg)
- evm_test_common.go: keep exported types, drop writeTraceBenchmarkJSON (now in benchmark/output.go)
Note: Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. Use the following commands to manage reviews:
📝 Walkthrough

This pull request reorganizes and refactors the Spamoor smoke test from a standalone file into a dedicated benchmark subpackage.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches

🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
Claude finished @chatton's task in 2m 46s — Code Review:
```go
defer w.flush()
```

```go
e := s.setupEnv(config{
	rethTag: "pr-140",
```
This is the PR the traces were added in; we can change this to a proper tag in a follow-up.
```yaml
  pull_request:
    branches:
      - main
    paths:
```
only run benchmark tests for PRs when relevant files are changed.
go test sets the working directory to the package under test, so the env var should be relative to test/e2e/benchmark/, not test/e2e/.
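A quick way to see the effect described above, using a throwaway directory tree that mirrors the repo layout (the paths here are illustrative, not the repo itself):

```shell
# Throwaway tree mirroring the repo layout (paths illustrative only).
mkdir -p /tmp/cwd-demo/test/e2e/benchmark

# From test/e2e/, two parent traversals reach the repo root:
cd /tmp/cwd-demo/test/e2e && cd ../.. && pwd    # /tmp/cwd-demo

# From test/e2e/benchmark/ (the cwd go test uses for that package),
# the same target needs one extra "..":
cd /tmp/cwd-demo/test/e2e/benchmark && cd ../../.. && pwd    # /tmp/cwd-demo
```

Any relative path in an env var therefore needs one more `..` once the test moves into the subpackage.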
Actionable comments posted: 2
♻️ Duplicate comments (1)
test/e2e/benchmark/spamoor_smoke_test.go (1)
20-23: ⚠️ Potential issue | 🟠 Major

Hardcoded `rethTag: "pr-140"` will become stale and cause opaque failures.

This tag refers to a specific container image built for a PR. Once that image is cleaned up or the tag is overwritten, the test will silently fail to start or pull the container with a confusing error. Expose this via an environment variable or CLI flag so it can be updated without a code change.
🔧 Suggested refactor
```diff
+rethTagFromEnv := os.Getenv("BENCH_RETH_TAG")
+if rethTagFromEnv == "" {
+	rethTagFromEnv = "pr-140"
+}
 e := s.setupEnv(config{
-	rethTag: "pr-140",
+	rethTag: rethTagFromEnv,
 	serviceName: "ev-node-smoke",
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 20 - 23, Replace the hardcoded rethTag value ("pr-140") in the test setup by reading it from configuration (environment variable or test CLI flag) so tests don't break when image tags change: update the call site where s.setupEnv(config{ rethTag: "pr-140", ... }) is used in spamoor_smoke_test.go to instead pull the tag from an env var (eg. os.Getenv("RETH_TAG")) or a provided flag, and adjust the config struct and any test bootstrap that constructs config to accept and pass that dynamic value (look for the config type, its rethTag field, and the setupEnv function to make the change).
🧹 Nitpick comments (5)
test/e2e/benchmark/metrics.go (1)
15-34: Consider accepting a `context.Context` for cancellable polling.

`requireHTTP` drives a blocking polling loop with no way to interrupt it if the parent test is cancelled mid-flight (e.g., via a `t.Context()` deadline or a `FailNow` from another goroutine). For long timeouts this can leave dangling goroutines/blocked tests.

🔧 Suggested refactor
```diff
-func requireHTTP(t testing.TB, url string, timeout time.Duration) {
+func requireHTTP(t testing.TB, ctx context.Context, url string, timeout time.Duration) {
 	t.Helper()
 	client := &http.Client{Timeout: 200 * time.Millisecond}
-	deadline := time.Now().Add(timeout)
+	ctx, cancel := context.WithTimeout(ctx, timeout)
+	defer cancel()
 	var lastErr error
-	for time.Now().Before(deadline) {
+	ticker := time.NewTicker(100 * time.Millisecond)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			t.Fatalf("daemon not ready at %s: %v", url, lastErr)
+			return
+		case <-ticker.C:
+		}
 		resp, err := client.Get(url)
 		if err == nil {
 			_ = resp.Body.Close()
 			if resp.StatusCode >= 200 && resp.StatusCode < 300 {
 				return
 			}
 			lastErr = fmt.Errorf("status %d", resp.StatusCode)
 		} else {
 			lastErr = err
 		}
-		time.Sleep(100 * time.Millisecond)
 	}
-	t.Fatalf("daemon not ready at %s: %v", url, lastErr)
 }
```

As per coding guidelines, "Use context.Context for cancellation in Go."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/metrics.go` around lines 15 - 34, The requireHTTP function currently blocks and can't be cancelled; change its signature to accept a context.Context (e.g., requireHTTP(ctx context.Context, t testing.TB, url string, timeout time.Duration) or use t.Context() inside) and use that context for cancellation inside the polling loop: create requests with http.NewRequestWithContext or attach ctx to the client.Do call, replace time.Sleep with a select on ctx.Done() and time.After(100*time.Millisecond), check ctx.Err() and call t.Fatalf when the context is cancelled, and ensure you still honor the timeout logic and preserve the lastErr handling in the loop.

test/e2e/benchmark/spamoor_smoke_test.go (1)
68-70: Fixed 60s `time.Sleep` is fragile for trace accumulation.

On fast machines traces may be ready much sooner; on slow CI runners traces may not be ready even after 60s. Consider polling the Jaeger API with a timeout instead of an unconditional sleep, making the test both faster in the happy path and more robust in the slow path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/spamoor_smoke_test.go` around lines 68 - 70, Replace the fixed time.Sleep(60 * time.Second) wait with a polling loop that queries the Jaeger API until the expected traces/spans from spamoor appear or a context deadline elapses; implement this by replacing the direct call to time.Sleep with a function (e.g., waitForTraces(ctx, serviceName, expectedSpanCount)) that uses a context with timeout, exponential/backoff retries, and HTTP calls to the Jaeger query endpoint, returning early when traces are observed and failing the test only if the timeout is reached—update the test to call waitForTraces instead of time.Sleep and use the existing service identifier used in the test to locate traces.

test/e2e/benchmark/traces.go (1)
23-47: Loop variable `t` shadows the conventional test-helper name.

Using `t` as a range variable in `extractSpansFromTraces` (which imports `"testing"`) is not a bug but is slightly confusing, particularly for readers accustomed to `t` meaning `testing.TB`. A more descriptive name like `traceEntry` or `raw` would improve clarity.

🔧 Suggested refactor
```diff
-	for _, t := range traces {
-		traceMap, ok := t.(map[string]any)
+	for _, traceEntry := range traces {
+		traceMap, ok := traceEntry.(map[string]any)
```
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/traces.go` around lines 23 - 47, In extractSpansFromTraces, rename the range variable t to a clearer name (e.g., traceEntry or rawTrace) to avoid shadowing the conventional testing variable t; update all uses of t in that function (the outer loop in extractSpansFromTraces and its casts to traceMap) so the logic remains unchanged but the variable is more descriptive and less confusing to readers familiar with testing.TB.

test/e2e/benchmark/output.go (1)
67-76: `require.NoError` inside a deferred `flush` will call `runtime.Goexit` on failure, which is safe but worth documenting.

When `flush` is called via `defer` after a test has already failed, `require.NoError` → `FailNow` → `runtime.Goexit` is invoked from within the deferred function, which Go handles correctly (deferred calls complete). No behavioral bug here, but the interaction with `require` vs `os.WriteFile` returning an error silently (by being suppressed) could be surprising. Consider logging the error path rather than asserting fatally so benchmark output failures don't obscure the real test failure.

🔧 Suggested refactor
```diff
 func (w *resultWriter) flush() {
 	outputPath := os.Getenv("BENCH_JSON_OUTPUT")
 	if outputPath == "" || len(w.entries) == 0 {
 		return
 	}
 	data, err := json.MarshalIndent(w.entries, "", "  ")
-	require.NoError(w.t, err, "failed to marshal benchmark JSON")
-	require.NoError(w.t, os.WriteFile(outputPath, data, 0644), "failed to write benchmark JSON to %s", outputPath)
+	if err != nil {
+		w.t.Logf("WARNING: failed to marshal benchmark JSON: %v", err)
+		return
+	}
+	if err := os.WriteFile(outputPath, data, 0644); err != nil {
+		w.t.Logf("WARNING: failed to write benchmark JSON to %s: %v", outputPath, err)
+		return
+	}
 	w.t.Logf("wrote %d benchmark entries to %s", len(w.entries), outputPath)
 }
```
Verify each finding against the current code and only fix it if needed. In `@test/e2e/benchmark/output.go` around lines 67 - 76, In resultWriter.flush, avoid using require.NoError (which calls FailNow/Goexit when deferred) for the json.MarshalIndent and os.WriteFile errors; instead check the error values from json.MarshalIndent and os.WriteFile and log them non-fatally (e.g., use w.t.Errorf or w.t.Logf with descriptive messages referencing outputPath and the error) so flush reports failures without invoking runtime.Goexit from a deferred context; update the calls that currently use require.NoError(w.t, ...) to explicit if err != nil { w.t.Errorf("...: %v", err); return } or similar logging behavior for both the marshal and write steps in resultWriter.flush.

test/e2e/evm_test_common.go (1)
532-534: Doc comment references the old unexported name.

Line 532 says `setupCommonEVMEnv` but the function is now exported as `SetupCommonEVMEnv`. Per Go convention, the doc comment should start with the exported identifier name.

Proposed fix
```diff
-// setupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
+// SetupCommonEVMEnv creates and initializes ev-reth instances, while also initializing the local ev-node instance
```
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_test_common.go` around lines 532 - 534, The doc comment currently starts with the old unexported name; update the comment to begin with the exported function name SetupCommonEVMEnv so it follows Go doc conventions (e.g., change the leading sentence from "setupCommonEVMEnv creates..." to "SetupCommonEVMEnv creates..."), leaving the rest of the description intact and ensuring the comment now documents the exported SetupCommonEVMEnv function.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/benchmark.yml:
- Around line 11-16: Update the PR path filter in the GitHub Actions workflow to
include the block executor benchmark sources so changes to that package trigger
the workflow; specifically add the path pattern 'block/internal/executing/**'
(or more broadly 'block/**' if preferred) to the existing paths array in
.github/workflows/benchmark.yml so the job that runs `go test
./block/internal/executing/...` will trigger on relevant PRs.
In `@test/e2e/benchmark/suite_test.go`:
- Around line 73-78: The OTLP env setup in the b.WithTag(...).WithEnv call is
wrong: change the value for "OTEL_EXPORTER_OTLP_ENDPOINT" to use the base ingest
URL (remove the "/v1/traces" suffix) so the SDK can append per-signal paths,
keep "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT" as-is (full URL), and update
"OTEL_EXPORTER_OTLP_PROTOCOL" and "OTEL_EXPORTER_OTLP_TRACES_PROTOCOL" from
"http" to the spec-correct "http/protobuf" (or another valid spec value like
"grpc" or "http/json" if intended).
---
Duplicate comments:
In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 20-23: Replace the hardcoded rethTag value ("pr-140") in the test
setup by reading it from configuration (environment variable or test CLI flag)
so tests don't break when image tags change: update the call site where
s.setupEnv(config{ rethTag: "pr-140", ... }) is used in spamoor_smoke_test.go to
instead pull the tag from an env var (eg. os.Getenv("RETH_TAG")) or a provided
flag, and adjust the config struct and any test bootstrap that constructs config
to accept and pass that dynamic value (look for the config type, its rethTag
field, and the setupEnv function to make the change).
---
Nitpick comments:
In `@test/e2e/benchmark/metrics.go`:
- Around line 15-34: The requireHTTP function currently blocks and can't be
cancelled; change its signature to accept a context.Context (e.g.,
requireHTTP(ctx context.Context, t testing.TB, url string, timeout
time.Duration) or use t.Context() inside) and use that context for cancellation
inside the polling loop: create requests with http.NewRequestWithContext or
attach ctx to the client.Do call, replace time.Sleep with a select on ctx.Done()
and time.After(100*time.Millisecond), check ctx.Err() and call t.Fatalf when the
context is cancelled, and ensure you still honor the timeout logic and preserve
the lastErr handling in the loop.
In `@test/e2e/benchmark/output.go`:
- Around line 67-76: In resultWriter.flush, avoid using require.NoError (which
calls FailNow/Goexit when deferred) for the json.MarshalIndent and os.WriteFile
errors; instead check the error values from json.MarshalIndent and os.WriteFile
and log them non-fatally (e.g., use w.t.Errorf or w.t.Logf with descriptive
messages referencing outputPath and the error) so flush reports failures without
invoking runtime.Goexit from a deferred context; update the calls that currently
use require.NoError(w.t, ...) to explicit if err != nil { w.t.Errorf("...: %v",
err); return } or similar logging behavior for both the marshal and write steps
in resultWriter.flush.
In `@test/e2e/benchmark/spamoor_smoke_test.go`:
- Around line 68-70: Replace the fixed time.Sleep(60 * time.Second) wait with a
polling loop that queries the Jaeger API until the expected traces/spans from
spamoor appear or a context deadline elapses; implement this by replacing the
direct call to time.Sleep with a function (e.g., waitForTraces(ctx, serviceName,
expectedSpanCount)) that uses a context with timeout, exponential/backoff
retries, and HTTP calls to the Jaeger query endpoint, returning early when
traces are observed and failing the test only if the timeout is reached—update
the test to call waitForTraces instead of time.Sleep and use the existing
service identifier used in the test to locate traces.
In `@test/e2e/benchmark/traces.go`:
- Around line 23-47: In extractSpansFromTraces, rename the range variable t to a
clearer name (e.g., traceEntry or rawTrace) to avoid shadowing the conventional
testing variable t; update all uses of t in that function (the outer loop in
extractSpansFromTraces and its casts to traceMap) so the logic remains unchanged
but the variable is more descriptive and less confusing to readers familiar with
testing.TB.
In `@test/e2e/evm_test_common.go`:
- Around line 532-534: The doc comment currently starts with the old unexported
name; update the comment to begin with the exported function name
SetupCommonEVMEnv so it follows Go doc conventions (e.g., change the leading
sentence from "setupCommonEVMEnv creates..." to "SetupCommonEVMEnv creates..."),
leaving the rest of the description intact and ensuring the comment now
documents the exported SetupCommonEVMEnv function.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- .github/workflows/benchmark.yml
- test/e2e/benchmark/metrics.go
- test/e2e/benchmark/output.go
- test/e2e/benchmark/spamoor_smoke_test.go
- test/e2e/benchmark/suite_test.go
- test/e2e/benchmark/traces.go
- test/e2e/evm_contract_bench_test.go
- test/e2e/evm_force_inclusion_e2e_test.go
- test/e2e/evm_full_node_e2e_test.go
- test/e2e/evm_spamoor_smoke_test.go
- test/e2e/evm_test_common.go
- test/e2e/failover_e2e_test.go
💤 Files with no reviewable changes (1)
- test/e2e/evm_spamoor_smoke_test.go
go test treats all arguments after an unknown flag (--evm-binary) as test binary args, so ./benchmark/ was never recognized as a package pattern.
go test sets the cwd to the package directory (test/e2e/benchmark/), so the binary path needs an extra parent traversal.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.
| Benchmark suite | Current: 8752fee | Previous: f73a124 | Ratio |
|---|---|---|---|
| SpamoorSmoke - Executor.SetFinal (avg) | 2078.176646706587 us | 1257 us | 1.65 |
This comment was automatically generated by workflow using github-action-benchmark.
The benchmark package doesn't define the --binary flag that test-e2e passes. It has its own CI workflow so it doesn't need to run here.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #3107      +/-   ##
==========================================
+ Coverage   60.90%   60.95%   +0.05%
==========================================
  Files         113      113
  Lines       11617    11617
==========================================
+ Hits         7075     7081       +6
+ Misses       3743     3737       -6
  Partials      799      799
```

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
test/e2e/benchmark/metrics.go
Outdated
```go
client := &http.Client{Timeout: 200 * time.Millisecond}
deadline := time.Now().Add(timeout)
var lastErr error
for time.Now().Before(deadline) {
```
personal preference: require.Eventually()
test/e2e/benchmark/metrics.go
Outdated
```go
)

// requireHTTP polls a URL until it returns a 2xx status code or the timeout expires.
func requireHTTP(t testing.TB, url string, timeout time.Duration) {
```
nit: naming is hard. how about requireHostUp
- make reth tag configurable via EV_RETH_TAG env var (default pr-140)
- fix OTLP config: remove duplicate env vars, use http/protobuf protocol
- use require.Eventually for host readiness polling
- rename requireHTTP to requireHostUp
- use non-fatal logging in resultWriter.flush deferred context
- fix stale doc comment (setupCommonEVMEnv -> SetupCommonEVMEnv)
- rename loop variable to avoid shadowing testing.TB convention
- add block/internal/executing/** to CI path trigger
- remove unused require import from output.go
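The OTLP fix listed above amounts to env settings along these lines. This is a sketch of the spec-compliant shape only; the host and port here are placeholders, not values taken from the repo:

```shell
# Base endpoint carries no per-signal path; the SDK appends /v1/traces itself.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://jaeger:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
# A per-signal override, when used, is the full URL including the path:
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://jaeger:4318/v1/traces"
export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL="http/protobuf"
```

Valid protocol values per the OTLP exporter spec are `grpc`, `http/protobuf`, and `http/json`; bare `http` is not one of them.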
# Conflicts:
#	scripts/test.mk
🧹 Nitpick comments (1)
test/e2e/evm_test_common.go (1)
306-318: Align the exported function comment with `SetupSequencerNode`.

Line 306 still starts with `setupSequencerNode` while the exported symbol at Line 318 is `SetupSequencerNode`. Please update the comment prefix to match the exported name for doc/lint consistency.

As per coding guidelines: `**/*.go: Document exported types and functions.`

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/evm_test_common.go` around lines 306 - 318, The function doc comment currently starts with the unexported name `setupSequencerNode` but the exported function is `SetupSequencerNode`; update the comment's first line to begin with `SetupSequencerNode` so the doc comment matches the exported symbol and satisfies the Go linter rules for exported functions (refer to the comment block above the `SetupSequencerNode` function).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e/evm_test_common.go`:
- Around line 306-318: The function doc comment currently starts with the
unexported name `setupSequencerNode` but the exported function is
`SetupSequencerNode`; update the comment's first line to begin with
`SetupSequencerNode` so the doc comment matches the exported symbol and
satisfies the Go linter rules for exported functions (refer to the comment block
above the `SetupSequencerNode` function).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .github/workflows/benchmark.yml
- test/e2e/benchmark/metrics.go
- test/e2e/benchmark/output.go
- test/e2e/benchmark/spamoor_smoke_test.go
- test/e2e/benchmark/suite_test.go
- test/e2e/benchmark/traces.go
- test/e2e/evm_contract_bench_test.go
- test/e2e/evm_test_common.go
🚧 Files skipped from review as they are similar to previous changes (3)
- test/e2e/benchmark/spamoor_smoke_test.go
- test/e2e/benchmark/suite_test.go
- test/e2e/benchmark/metrics.go
Overview
Introduces a benchmarking package within the e2e dir: a test suite we can add additional spamoor tests to.
Summary by CodeRabbit
Tests
Refactor
Chores