feat: auto-detect Engine API GetPayload version for Osaka fork by tac0turtle · Pull Request #3113 · evstack/ev-node

tac0turtle · 2026-02-27T13:18:38Z

GetPayload now automatically selects between engine_getPayloadV4 (Prague) and engine_getPayloadV5 (Osaka) by caching the last successful version and retrying with the alternative on "Unsupported fork" errors (code -38005).

This handles Prague chains, Osaka-at-genesis chains, and time-based Prague-to-Osaka upgrades with zero configuration. At most one extra RPC call occurs at the fork transition point.

Overview

Summary by CodeRabbit

New Features
- Automatic fallback between payload API versions with caching to prefer the successful version.
- Exposes the currently used payload method for improved tracing and diagnostics.
- Enhanced error messages that indicate which payload method and payload ID failed, improving observability.
Tests
- Added comprehensive tests validating version selection, fallback behavior, caching, fork upgrade scenarios, and error propagation.

GetPayload now automatically selects between engine_getPayloadV4 (Prague) and engine_getPayloadV5 (Osaka) by caching the last successful version and retrying with the alternative on "Unsupported fork" errors (code -38005). This handles Prague chains, Osaka-at-genesis chains, and time-based Prague-to-Osaka upgrades with zero configuration. At most one extra RPC call occurs at the fork transition point. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude · 2026-02-27T13:18:52Z

Claude finished @tac0turtle's task in 2m 3s —— View job

Code Review: Auto-detect Engine API GetPayload version for Osaka fork

Gather context and understand the request
Read changed files
Review code quality & architecture
Review error handling & resilience
Review security considerations
Review performance & resource efficiency
Review testing & reliability
Post final review feedback

Summary

This is a well-scoped, clean implementation. The fallback-and-cache pattern is correct, the atomic flag provides safe lock-free coordination, and the test coverage is thorough. A few observations below.

Architecture & Code Quality

engine_rpc_client.go

The implementation is minimal and correct. The atomic.Bool for useV5 is the right primitive here — lock-free, zero-cost on the hot path once the fork is resolved.

One subtle point: the expression e.useV5.Store(altMethod == getPayloadV5Method) correctly derives the stored value from the local variable altMethod (not from a re-read of useV5). This avoids a potential TOCTOU issue under concurrent calls. Good.

Concurrent fork transition safety: Two simultaneous calls during the transition (both reading useV5=false) will each try V4, each fall back to V5, and both store useV5=true. Result: correct, at most N×2 RPCs for N concurrent callers at the exact fork boundary. Acceptable trade-off given fork transitions are rare.

One-way transition assumption: useV5 only transitions false → true, never in reverse. This is valid for time-based Ethereum forks which don't downgrade. The code relies on this implicit assumption — a brief comment near the useV5 declaration would make this explicit for future readers.

engine_rpc_tracing.go

The payloadMethodGetter interface + type assertion pattern is clean and avoids adding an optional method to the EngineRPCClient interface. One edge case: if tracedEngineRPCClient itself were ever wrapped by another tracedEngineRPCClient (double-wrapping), the inner assertion t.inner.(payloadMethodGetter) would fail (since tracedEngineRPCClient doesn't implement GetPayloadMethod()), and the span would silently omit the method attribute. This is unlikely in the current wiring (execution.go:255) but worth noting.

Error Handling

The error messages are well-formed and include method name and payload ID for diagnosability:

engine_getPayloadV4 payload <id>: <err> for non-fork errors
engine_getPayloadV5 fallback after engine_getPayloadV4 unsupported fork, payload <id>: <err> for failed fallbacks

One subtle observability gap: when the fallback succeeds (V4→V5 transition call), the tracing span records method=engine_getPayloadV5 — the winning method — without indicating that V4 was attempted first. An operator investigating a slightly slower call won't see in the span that two RPCs were made. Consider adding a "fallback": true attribute Fix this → on the span when altMethod was used.

Testing

The tests in engine_rpc_client_test.go are comprehensive and cover the critical scenarios:

Prague (V4-only) chain with caching
Osaka-at-genesis (V4→V5 fallback + caching)
Time-based Prague→Osaka upgrade with three-phase assertion (V4 works, transition, V5 cached)
Non-fork error propagation
isUnsupportedForkErr table-driven test including wrapped error

One gap: there is no direct test asserting GetPayloadMethod() returns the correct value before/after the transition. While this is tested implicitly through the call sequence assertions, an explicit test would guard against GetPayloadMethod() drifting from the actual used method. Fix this →

Minor: the fakeEngineServer has an internal mu that holds the lock for the entire HTTP handler, while test closures also lock their own mu. The nesting order is always fakeEngineServer.mu → test.mu, so no deadlock risk, but it's worth being aware of when adding new tests.

Performance

atomic.Bool load/store is O(1) and non-blocking ✓
Constants avoid string allocation on every call ✓
At most 1 extra RPC per node lifetime (at fork boundary) ✓
No heap allocations on the success path beyond what CallContext already does ✓

Minor Nits

engine_rpc_client.go:51-56 — The method/altMethod swap could include a brief comment explaining this is symmetric: either direction (V4→V5 or V5→V4) uses the same fallback path, covering the theoretical edge case of Osaka-to-Prague downgrade (e.g., reorg recovery). Not strictly necessary but improves understanding.
engine_rpc_tracing.go:83-85 — The method attribute is set even on error. For the case where both methods fail, useV5 is not updated, so the span will show the pre-failure preference. This is technically correct but might be non-obvious to readers. A comment clarifying this behavior would help.
NewPayload and ForkchoiceUpdated remain hardcoded to V4 and V3 respectively. If Osaka introduces new versions of these methods, they'll need similar treatment. Not in scope for this PR, but the pattern established here makes it straightforward to extend.

Overall Assessment

The implementation is correct, thread-safe, and well-tested. The approach of using an atomic.Bool with retry-on-unsupported-fork is idiomatic and keeps the complexity minimal. The issues noted above are minor and don't block merging.

github-actions · 2026-02-27T13:18:54Z

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Mar 2, 2026, 11:30 AM

coderabbitai · 2026-02-27T13:19:03Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 954b703 and 4f7a617.

📒 Files selected for processing (1)

execution/evm/engine_rpc_tracing.go

🚧 Files skipped from review as they are similar to previous changes (1)

execution/evm/engine_rpc_tracing.go

📝 Walkthrough

Walkthrough

Adds negotiated Engine API selection to the Engine RPC client: it prefers a cached method (V4 or V5), calls that method, falls back on Engine "Unsupported fork" (-38005) errors to the alternate method, caches the successful method, and exposes the resolved method via GetPayloadMethod for tracing.

Changes

Cohort / File(s)	Summary
Engine RPC Client Implementation `execution/evm/engine_rpc_client.go`	Adds `engineErrUnsupportedFork` (-38005), `getPayloadV4Method`/`getPayloadV5Method` constants, `isUnsupportedForkErr` helper, a `useV5` `atomic.Bool` flag, fallback-and-caching logic in `GetPayload`, and `GetPayloadMethod()` to report the resolved method.
Engine RPC Client Tests `execution/evm/engine_rpc_client_test.go`	New tests with a fake JSON-RPC server covering V4/V5 calls, unsupported-fork fallback, caching behavior across upgrades, error propagation, and `isUnsupportedForkErr` cases.
Tracing Integration `execution/evm/engine_rpc_tracing.go`	Adds `payloadMethodGetter` interface (GetPayloadMethod() string) and updates tracing to read the actual resolved method from the client instead of using a hardcoded method name.

Sequence Diagram

sequenceDiagram
    participant Client
    participant EngineRPCClient
    participant Engine_V4 as Engine API V4
    participant Engine_V5 as Engine API V5

    Client->>EngineRPCClient: GetPayload(id)
    alt useV5 == false (or unset)
        EngineRPCClient->>Engine_V4: engine_getPayloadV4(id)
        Engine_V4-->>EngineRPCClient: error (-38005) "Unsupported fork"
        EngineRPCClient->>Engine_V5: engine_getPayloadV5(id)
        Engine_V5-->>EngineRPCClient: Payload
        EngineRPCClient->>EngineRPCClient: set useV5 = true
    else useV5 == true
        EngineRPCClient->>Engine_V5: engine_getPayloadV5(id)
        Engine_V5-->>EngineRPCClient: Payload
    end
    EngineRPCClient-->>Client: Payload or wrapped error (includes method + id)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I hopped between V4 and V5 today,

When forks said "no," I found another way.
I cached the route and stayed upbeat —
Now payloads land right on my little feet. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: introducing automatic version detection for the Engine API GetPayload method to handle Osaka fork transitions.
Description check	✅ Passed	The description provides a clear overview of the feature and how it works, but lacks link to any related issue (e.g., 'Closes #').

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/auto-detect-osaka-getpayload-version

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-02-27T13:21:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.95%. Comparing base (f73a124) to head (4f7a617).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3113      +/-   ##
==========================================
+ Coverage   60.90%   60.95%   +0.05%     
==========================================
  Files         113      113              
  Lines       11617    11617              
==========================================
+ Hits         7075     7081       +6     
+ Misses       3743     3738       -5     
+ Partials      799      798       -1

Flag	Coverage Δ
combined	`60.95% <ø> (+0.05%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

execution/evm/engine_rpc_client.go (1)

52-64: Wrap RPC errors with call context to improve diagnosability.

The current returns lose method-level context when failures bubble up. Wrapping here will make operational debugging much easier.

♻️ Proposed patch

 import (
 	"context"
 	"errors"
+	"fmt"
 	"sync/atomic"
@@
 	err := e.client.CallContext(ctx, &result, method, payloadID)
 	if err == nil {
 		return &result, nil
 	}
 
 	if !isUnsupportedForkErr(err) {
-		return nil, err
+		return nil, fmt.Errorf("%s failed for payload %x: %w", method, payloadID, err)
 	}
@@
 	err = e.client.CallContext(ctx, &result, altMethod, payloadID)
 	if err != nil {
-		return nil, err
+		return nil, fmt.Errorf("%s failed after %s unsupported fork for payload %x: %w", altMethod, method, payloadID, err)
 	}

As per coding guidelines "Wrap errors with context using fmt.Errorf".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client.go` around lines 52 - 64, The RPC CallContext
error returns currently lose method context; when returning errors from the
calls to e.client.CallContext (both the primary call using method and the
fallback using altMethod) wrap the original error using fmt.Errorf to include
the method name and payloadID (e.g. fmt.Errorf("engine rpc %s payload %v: %w",
method, payloadID, err)), and do the same for the fallback call (include
altMethod in the message); update returns in the block around
isUnsupportedForkErr(err) and the subsequent fallback so errors preserve call
context for diagnosability.

execution/evm/engine_rpc_client_test.go (1)

96-215: Consider table-driving the GetPayload scenario tests.

These three scenario tests share setup/teardown and method-tracking logic; table-driven structure would reduce duplication and make new fork-transition cases easier to add.

As per coding guidelines "Use table-driven tests where appropriate".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client_test.go` around lines 96 - 215, The three
tests TestGetPayload_PragueChain_UsesV4, TestGetPayload_OsakaChain_FallsBackToV5
and TestGetPayload_ForkUpgrade_SwitchesV4ToV5 duplicate setup/teardown and
call-tracking; refactor them into a single table-driven test that iterates over
cases (name, server behavior closure for fakeEngineServer, expected call
sequences before/after cache or upgrade, and any state transitions like toggling
osakaActive), reuse NewEngineRPCClient and client.GetPayload(ctx,
engine.PayloadID{}) in a loop, and assert calledMethods for each subcase; keep
unique identifiers (fakeEngineServer, NewEngineRPCClient, GetPayload,
calledMethods, osakaActive) so each case can provide the server callback and
expected sequences, reducing duplication and making future fork scenarios easier
to add.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@execution/evm/engine_rpc_client_test.go`:
- Around line 172-215: The test TestGetPayload_ForkUpgrade_SwitchesV4ToV5
currently only asserts success but not that the client cached the switch to V5;
instrument the fakeEngineServer handler with counters (e.g., v4Calls, v5Calls)
or a boolean flag to record how many times engine_getPayloadV4 and
engine_getPayloadV5 were invoked, call client.GetPayload as already written,
then after the first post-upgrade call assert that v4Calls == 1 and v5Calls >= 1
(first fallback used V5), and after the subsequent call assert v4Calls remains 1
and v5Calls increased (or v4Calls did not increase) to prove the client cached
and directly used V5; reference the test function
TestGetPayload_ForkUpgrade_SwitchesV4ToV5, the fakeEngineServer handler, and
client.GetPayload when adding these assertions.

---

Nitpick comments:
In `@execution/evm/engine_rpc_client_test.go`:
- Around line 96-215: The three tests TestGetPayload_PragueChain_UsesV4,
TestGetPayload_OsakaChain_FallsBackToV5 and
TestGetPayload_ForkUpgrade_SwitchesV4ToV5 duplicate setup/teardown and
call-tracking; refactor them into a single table-driven test that iterates over
cases (name, server behavior closure for fakeEngineServer, expected call
sequences before/after cache or upgrade, and any state transitions like toggling
osakaActive), reuse NewEngineRPCClient and client.GetPayload(ctx,
engine.PayloadID{}) in a loop, and assert calledMethods for each subcase; keep
unique identifiers (fakeEngineServer, NewEngineRPCClient, GetPayload,
calledMethods, osakaActive) so each case can provide the server callback and
expected sequences, reducing duplication and making future fork scenarios easier
to add.

In `@execution/evm/engine_rpc_client.go`:
- Around line 52-64: The RPC CallContext error returns currently lose method
context; when returning errors from the calls to e.client.CallContext (both the
primary call using method and the fallback using altMethod) wrap the original
error using fmt.Errorf to include the method name and payloadID (e.g.
fmt.Errorf("engine rpc %s payload %v: %w", method, payloadID, err)), and do the
same for the fallback call (include altMethod in the message); update returns in
the block around isUnsupportedForkErr(err) and the subsequent fallback so errors
preserve call context for diagnosability.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c449847 and ce75825.

📒 Files selected for processing (2)

execution/evm/engine_rpc_client.go
execution/evm/engine_rpc_client_test.go

execution/evm/engine_rpc_client_test.go

github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Spamoor Trace Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.50.

Benchmark suite	Current: `954b703`	Previous: `c449847`	Ratio
`SpamoorSmoke - BlockExecutor.CreateBlock (avg)`	`4.144171779141105` us	`2.4564564564564564` us	`1.69`

This comment was automatically generated by workflow using github-action-benchmark.

chatton

Looks mostly good, I think we need to update it to make the actual method that is configured to be called exposed in some way so that the tracing wrapper reports the correct method call in the span.

execution/evm/engine_rpc_client.go

coderabbitai

🧹 Nitpick comments (1)

execution/evm/engine_rpc_client.go (1)
50-77: Minor observability gap on failed fallback attempts.

When the fallback method also fails (line 71), GetPayloadMethod() will return the primary method name (since useV5 isn't updated on failure), but the trace won't reflect that both methods were attempted. This could be confusing when debugging failed fork transitions.

Consider: Is the detailed error message (which includes both method names) sufficient for debugging, or should the traced method also indicate a fallback was attempted?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@execution/evm/engine_rpc_client.go` around lines 50 - 77, GetPayload
currently tries a primary RPC method then falls back to an alternate but does
not record that a fallback was attempted when the fallback also fails; update
engineRPCClient.GetPayload to record the fallback attempt in observability
(e.g., attach a span tag/log field) before returning the final error so traces
reflect both getPayloadV4Method and getPayloadV5Method were tried; reference
useV5, getPayloadV4Method/getPayloadV5Method and the error branch where the
altMethod CallContext fails to add the trace/log annotation indicating a
fallback was attempted (do not change useV5 on failure).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@execution/evm/engine_rpc_client.go`:
- Around line 50-77: GetPayload currently tries a primary RPC method then falls
back to an alternate but does not record that a fallback was attempted when the
fallback also fails; update engineRPCClient.GetPayload to record the fallback
attempt in observability (e.g., attach a span tag/log field) before returning
the final error so traces reflect both getPayloadV4Method and getPayloadV5Method
were tried; reference useV5, getPayloadV4Method/getPayloadV5Method and the error
branch where the altMethod CallContext fails to add the trace/log annotation
indicating a fallback was attempted (do not change useV5 on failure).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce75825 and 954b703.

📒 Files selected for processing (3)

execution/evm/engine_rpc_client.go
execution/evm/engine_rpc_client_test.go
execution/evm/engine_rpc_tracing.go

🚧 Files skipped from review as they are similar to previous changes (1)

execution/evm/engine_rpc_client_test.go

chatton

LGTM, one nit on naming 👍

execution/evm/engine_rpc_tracing.go

github-actions bot assigned tac0turtle Feb 27, 2026

coderabbitai bot reviewed Feb 27, 2026

View reviewed changes

execution/evm/engine_rpc_client_test.go Show resolved Hide resolved

github-actions bot reviewed Feb 27, 2026

View reviewed changes

tac0turtle requested review from chatton February 27, 2026 13:43

chatton reviewed Mar 2, 2026

View reviewed changes

execution/evm/engine_rpc_client.go Outdated Show resolved Hide resolved

execution/evm/engine_rpc_client.go Outdated Show resolved Hide resolved

fix comments

954b703

coderabbitai bot reviewed Mar 2, 2026

View reviewed changes

chatton approved these changes Mar 2, 2026

View reviewed changes

execution/evm/engine_rpc_tracing.go Outdated Show resolved Hide resolved

tac0turtle added 2 commits March 2, 2026 12:29

rename

06c44b5

Merge branch 'main' into feat/auto-detect-osaka-getpayload-version

4f7a617

tac0turtle enabled auto-merge March 2, 2026 11:29

tac0turtle added this pull request to the merge queue Mar 2, 2026

Merged via the queue into main with commit 805f927 Mar 2, 2026
33 checks passed

tac0turtle deleted the feat/auto-detect-osaka-getpayload-version branch March 2, 2026 12:03

Conversation

tac0turtle commented Feb 27, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Summary by CodeRabbit

Uh oh!

claude bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review: Auto-detect Engine API GetPayload version for Osaka fork

Summary

Architecture & Code Quality

Error Handling

Testing

Performance

Minor Nits

Overall Assessment

Uh oh!

github-actions bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

codecov bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

⚠️ Performance Alert ⚠️

Uh oh!

chatton left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

chatton left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tac0turtle commented Feb 27, 2026 •

edited by coderabbitai bot

Loading

claude bot commented Feb 27, 2026 •

edited

Loading

github-actions bot commented Feb 27, 2026 •

edited

Loading

coderabbitai bot commented Feb 27, 2026 •

edited

Loading

codecov bot commented Feb 27, 2026 •

edited

Loading

github-actions bot left a comment •

edited

Loading