Skip to content

(feat): tracing support for state-sync transactions#2172

Open
manav2401 wants to merge 28 commits into
developfrom
state-sync-tracing
Open

(feat): tracing support for state-sync transactions#2172
manav2401 wants to merge 28 commits into
developfrom
state-sync-tracing

Conversation

@manav2401
Copy link
Copy Markdown
Member

@manav2401 manav2401 commented Mar 31, 2026

Description

Extension of #2062 from @milando12.

This PR

  • Refactors almost all tracing methods as state-sync transactions weren't handled correctly.
    • This involves major simplification of mostly all tracing methods (e.g. debug_traceBlockByNumber, debug_traceBlockByHash, debug_traceChain, etc).
    • It mimics the actual production environment for tracing state-sync events which wasn't the case earlier (and hence used to give incorrect results).
  • Adds live tracing support for state-sync transactions. This means if someone has live tracing enabled while running a node, the tracing will also cover the state-sync transactions which contains the bridge events.
  • Adds new custom hooks for tracing state-sync transactions which wraps the inner tracer's hooks. More explanation on this below.
  • Adds e2e tests to actually test a state-sync transaction trace via state receiver contract instead of using mock data.

The reason why we add custom hooks is because state-sync transactions can contain multiple bridge events and all of them are executed as separate EVM calls which are packed into 1 transaction. This breaks assumptions for certain tracer e.g. in callTracer, it should only have 1 root call frame per transaction. This would be invalid for state-sync transactions with multiple state-sync events (because it would have multiple independent root call frames). Hence, referencing the erigon implementation, this PR adds wrapper hooks which crafts a single synthetic root call frame and all bridge event calls are nested sub-calls inside it.

Changes

  • Bugfix (non-breaking change that solves an issue)
  • Hotfix (change that solves an urgent issue, and requires immediate attention)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)
  • Changes only for a subset of nodes

Breaking changes

Please complete this section if any breaking changes have been made, otherwise delete it

Nodes audience

In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)

Checklist

  • I have added at least 2 reviewer or the whole pos-v1 team
  • I have added sufficient documentation in code
  • I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
  • Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
  • Includes RPC methods changes, and the Notion documentation has been updated

Cross repository changes

  • This PR requires changes to heimdall
    • In case link the PR here:
  • This PR requires changes to matic-cli
    • In case link the PR here:

Testing

  • I have added unit tests
  • I have added tests to CI
  • I have tested this code manually on local environment
  • I have tested this code manually on remote devnet using express-cli
  • I have tested this code manually on amoy
  • I have created new e2e tests into express-cli

Manual tests

Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it

Additional comments

Please post additional comments in this section if you have them, otherwise delete it

Add live tracing and refactor tracing methods for state-sync transactions.

Co-authored-by: Milan Došlić <milan.doslic@tenderly.co>
Copilot AI review requested due to automatic review settings March 31, 2026 10:26
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@manav2401
Copy link
Copy Markdown
Member Author

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Mar 31, 2026

Claude finished @manav2401's task in 3m 51s —— View job


Code Review: State-Sync Tracing Support

  • Gather context and read changed files
  • Review core tracing changes (eth/tracers/api.go)
  • Review state-sync processor changes (consensus/bor/statefull/processor.go)
  • Review transaction type changes (core/types/transaction.go)
  • Review state processor changes (core/state_processor.go, core/parallel_state_processor.go)
  • Review test changes
  • Review remaining files
  • Post final review

Summary

Good PR overall. The refactoring consolidates state-sync tracing into the standard tracing codepaths (removing the separate api_bor.go and TraceBorBlock endpoint), adds live tracing support via vmConfig propagation through the consensus engine, and uses real StateReceiver contract bytecode in e2e tests. The net deletion of ~700 lines is a positive sign.

Below are the findings, organized by severity.


Issues

1. [HIGH] OnTxEnd called even when no state-sync receipt was added

File: core/state_processor.go:179-181 and core/parallel_state_processor.go:451-453

if hooks := evm.Config.Tracer; hooks != nil && hooks.OnTxEnd != nil {
    hooks.OnTxEnd(receipts[len(receipts)-1], nil)
}

OnTxEnd is called unconditionally after the Madhugiri check, regardless of whether appliedNewStateSyncReceipt is true. If no new state-sync receipt was added (i.e., appliedNewStateSyncReceipt == false), this will still call OnTxEnd with the last regular transaction's receipt, creating a mismatched OnTxStart/OnTxEnd pair. The OnTxStart at line 157 fires only when a state-sync tx is present, but OnTxEnd fires whenever IsMadhugiri is true.

Suggested fix: Guard OnTxEnd with the same condition:

if appliedNewStateSyncReceipt {
    allLogs = append(allLogs, receipts[len(receipts)-1].Logs...)
    if hooks := evm.Config.Tracer; hooks != nil && hooks.OnTxEnd != nil {
        hooks.OnTxEnd(receipts[len(receipts)-1], nil)
    }
}

Similarly, OnTxStart at line 156-159 should also be guarded — it fires if the last tx in the block is a state-sync tx, but that doesn't guarantee Finalize will actually produce a receipt for it. Consider moving both hooks to be conditional on receipt production.

Fix this →


2. [MEDIUM] Duplicate import in consensus/bor/heimdall/span/spanner.go

File: consensus/bor/heimdall/span/spanner.go:13-14

"github.com/ethereum/go-ethereum/consensus/bor/abi"
borabi "github.com/ethereum/go-ethereum/consensus/bor/abi"

The same package is imported twice — once unnamed and once as borabi. Only borabi is used (for borabi.SystemTxGas and borabi.ValidatorSet()). The unused abi import will cause a compile error unless the linter/compiler is lenient about it (it shouldn't be).

Note: Looking more carefully, the abi import was already there before the PR (for the abi.ValidatorSet() call), and the diff adds borabi as an alias. The fix should be to remove the unnamed import and keep only the aliased one, or just use abi.SystemTxGas without the alias.

Fix this →


3. [MEDIUM] traceChain doesn't handle state-sync transactions

File: eth/tracers/api.go:290-313

In traceChain, the inner loop iterates over task.block.Transactions() and calls traceTx for each tx. For state-sync transactions, traceTx calls ApplyStateSyncEvents which sets up a tx context via vmenv.SetTxContext(core.NewEVMTxContext(message)). However, the traceChain worker calls core.TransactionToMessage(tx, signer, ...) for every transaction including state-sync ones. TransactionToMessage uses the signer to recover the sender — for state-sync transactions (which have no signature), this may produce unexpected results or errors that are silently discarded by the _ assignment.

While traceTx internally handles state-sync via the StateSyncTxType branch, the message created by traceChain's outer loop at line 297 isn't meaningful for state-sync txs. This isn't a correctness bug per se (the message is passed through but traceTx overrides the behavior), but it's worth noting.

More importantly: traceBlockParallel (line 715) correctly skips state-sync txs in the parallel phase and traces them sequentially at the end (lines 807-824). But traceChain (line 290) processes all txs including state-sync in the parallel workers without this special handling. This means state-sync txs in traceChain are traced in the worker thread using a copied statedb, which may not have the correct accumulated state from prior transactions (since state-sync must be the last tx and depends on all prior state changes).


4. [MEDIUM] ApplyStateSyncEvents uses time.Now() for sync time

File: consensus/bor/statefull/processor.go:181

now := time.Now().Unix()

Using time.Now() as the syncTime parameter for commitState during tracing means:

  • Different tracing runs of the same block will produce different calldata (different timestamp).
  • This is non-deterministic and means trace results may not be reproducible.

For tracing (replay) purposes, consider using the block header's timestamp instead, which would be available from the EVM's block context. This would make traces deterministic and reproducible.

Fix this →


5. [LOW] GetSystemMessage uses params.MaxTxGas instead of block gas limit

File: consensus/bor/statefull/processor.go:77

Gas: params.MaxTxGas, // should be more than enough for state-sync related syscalls

MaxTxGas is used as the gas limit for the system message in ApplyStateSyncEvents. This is fine for production state-sync (where gas usage is bounded), but the comment acknowledges it's "should be more than enough." The original CommitState path via GenesisContractsClient also uses this value, so at least it's consistent.


6. [LOW] api_bor.go deletion removes TraceBorBlock and DebugBlock type

File: eth/tracers/api_bor.go (deleted), internal/cli/server/service.go:232

The TraceBorBlock RPC method, BlockTraceResult, TxTraceResult, and TraceBlockRequest types are removed. The DebugBlock gRPC handler is updated to use TraceBlockByNumber instead. This is a breaking change for any external callers of the debug_traceBorBlock RPC method. Make sure this is documented and communicated.


7. [LOW] Config reordering in DefaultConfig() changes TOML output order

File: internal/cli/server/config.go:810-1029

The fields in DefaultConfig() have been reordered and new defaults explicitly added (Ethstats, VMTrace, VMTraceJsonConfig, SyncMode, etc.). While functionally equivalent, this changes the output of make docs (the default_config.toml and example_config.toml diffs confirm this). No functional concern, but worth noting in case any tooling depends on field ordering.


Positive Observations

  1. Good refactoring: Removing the separate api_bor.go and consolidating state-sync handling into the standard tracing methods is the right approach. It reduces code duplication and ensures all tracing codepaths handle state-sync consistently.

  2. ABI extraction: Moving ABI definitions from consensus/bor/contract/ to consensus/bor/abi/ is a clean separation that allows the statefull package to use them without circular dependencies.

  3. Bounds check added: The new bounds check in eth/state_accessor.go:268-269 is a good defensive improvement.

  4. Strong test coverage: The e2e tests use real StateReceiver contract bytecode and verify actual EVM execution traces (LOG0 opcode counts, intermediate roots). This is much better than the previous mock-based approach.

  5. Live tracing support: Propagating vmConfig through Bor.SetVMConfigCommitState/CommitSpanstatefull.ApplyMessage is a clean way to enable live tracing for system transactions without major architectural changes.


Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Bor-specific tracing support so state-sync (bridge) transactions are included in VM/live traces, and refactors tracer codepaths to treat state-sync txs as first-class transactions (post-Madhugiri). Also updates consensus “system tx” execution to propagate VM tracer config and adds end-to-end tests that replay state-sync events via the StateReceiver contract.

Changes:

  • Extend tracing APIs to replay and trace post-Madhugiri StateSyncTx execution, including correct log indexing/cumulative context plumbing.
  • Propagate vmtrace VM config into Bor consensus system calls (span/state commits) so live tracing covers bridge processing.
  • Add E2E state-sync tracing tests using actual StateReceiver bytecode and update docs/config defaults for VM tracing.

Reviewed changes

Copilot reviewed 36 out of 37 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/bor/mocks/IHeimdallClient.go Regenerated mocks (mockgen metadata/any updates).
tests/bor/helper.go Update mock expectations for CommitSpan signature change.
tests/bor/bor_test.go Update mock expectations for CommitSpan signature change.
params/protocol_params.go Introduce params.BorSystemAddress for Bor system transactions.
internal/cli/server/service.go Switch DebugBlock tracing implementation to tracer API’s TraceBlockByNumber.
internal/cli/server/flags.go Clarify vmtrace flag usage text.
internal/cli/server/config.go Extend/reorder defaults (vmtrace/jsonconfig, rpc limits, sync/gc defaults, etc.).
eth/tracers/internal/tracetest/supply_test.go Update expected supply trace fields (issuance reward).
eth/tracers/dir.go Add Bor-specific context fields for state-sync tracing (cumulative gas/log index).
eth/tracers/api.go Major tracer refactor: state-sync tx replay/tracing, new traceTx return signature, removal of BorTraceEnabled paths.
eth/tracers/api_test.go Remove IO tracing dump test and related helpers.
eth/tracers/api_statesync_test.go Add E2E tests for tracing state-sync txs via StateReceiver + target contract.
eth/tracers/api_bor.go Remove Bor-only tracing API (TraceBorBlock) implementation.
eth/state_accessor.go Add bounds check for txIndex in stateAtTransaction.
eth/filters/IIterator.go Regenerated mocks (mockgen metadata/structure).
eth/filters/IDatabase.go Regenerated mocks + parameter naming cleanup/any updates.
eth/filters/IBatch.go Regenerated mocks + parameter naming cleanup/any updates.
eth/filters/IBackend.go Regenerated mocks + interface additions reflected in mock.
eth/ethconfig/config.go Use centralized Bor ABI package for validator set ABI.
eth/backend.go Pass VM tracer config into Bor engine for system tx tracing (SetVMConfig).
docs/cli/server.md Document updated vmtrace / vmtrace.jsonconfig defaults.
docs/cli/example_config.toml Add VM tracing example config entries.
docs/cli/default_config.toml Default vmtrace.jsonconfig to {} instead of empty string.
core/types/transaction.go Add Transaction.GetStateSyncData() helper for StateSyncTx.
core/state_processor.go Fire tracer hooks around state-sync tx processing and pass hooked state into Finalize.
core/parallel_state_processor.go Fire tracer hooks around state-sync tx processing in parallel processor.
consensus/bor/statefull/processor.go Add replay helper ApplyStateSyncEvents and plumb VM config into system calls.
consensus/bor/span.go Extend Spanner.CommitSpan signature to accept vm.Config.
consensus/bor/span_mock.go Regenerated mock for updated Spanner interface.
consensus/bor/heimdall/span/spanner.go Pass VM config through to system call execution; use centralized SystemTxGas constant.
consensus/bor/genesis.go Extend GenesisContract CommitState signature to accept vm.Config.
consensus/bor/genesis_contract_mock.go Regenerated mock for updated GenesisContract interface.
consensus/bor/contract/client.go Move ABI + SystemTxGas constants to consensus/bor/abi.
consensus/bor/bor.go Store VM config in Bor engine and use it for span/state system transactions.
consensus/bor/bor_test.go Update fakes/mocks for new CommitSpan / CommitState signatures.
consensus/bor/abi/common.go New centralized Bor ABI + SystemTxGas definitions.
cmd/keeper/go.sum Dependency checksum updates.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/cli/server/service.go Outdated
Comment thread consensus/bor/statefull/processor.go
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api_statesync_test.go Outdated
@claude
Copy link
Copy Markdown

claude Bot commented Mar 31, 2026

Code Review

Found 3 issues:

1. Bug: Duplicate import causes compilation failure - consensus/bor/heimdall/span/spanner.go:13-14

The same package consensus/bor/abi is imported twice (once as abi, once as borabi). Go does not allow this. Fix: Remove the unaliased import and use borabi consistently.

2. Bug: OnTxEnd called without guard - panics on empty blocks - core/state_processor.go:176-181, core/parallel_state_processor.go:448-453

OnTxEnd is inside the IsMadhugiri block but NOT guarded by appliedNewStateSyncReceipt. On empty blocks, receipts[len(receipts)-1] panics with index out of range. On blocks without state-sync tx, OnTxEnd fires without matching OnTxStart. Fix: Move OnTxEnd inside the appliedNewStateSyncReceipt guard.

3. Bug: time.Now() produces non-deterministic traces - consensus/bor/statefull/processor.go:181

ApplyStateSyncEvents uses time.Now().Unix() as syncTime, but production path uses deterministic event.Time from Heimdall. Makes trace results non-reproducible. Root cause: StateSyncData lacks a Time field. Fix: Add Time field to StateSyncData and use it instead of time.Now().

Copilot AI review requested due to automatic review settings April 2, 2026 10:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 37 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread eth/tracers/api.go
Comment thread params/protocol_params.go
Comment thread eth/tracers/api_statesync_test.go Outdated
Comment thread consensus/bor/statefull/processor.go Outdated
Comment thread eth/tracers/api.go
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 31.80124% with 549 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.23%. Comparing base (f0ed88c) to head (107249a).

Files with missing lines Patch % Lines
eth/filters/IBackend.go 3.03% 256 Missing ⚠️
eth/tracers/api.go 60.00% 87 Missing and 11 partials ⚠️
eth/filters/IDatabase.go 0.00% 61 Missing ⚠️
consensus/bor/statefull/processor.go 0.00% 55 Missing ⚠️
eth/filters/IBatch.go 0.00% 16 Missing ⚠️
core/state_processor.go 40.00% 11 Missing and 1 partial ⚠️
tests/bor/mocks/IHeimdallClient.go 0.00% 10 Missing ⚠️
consensus/bor/abi/common.go 0.00% 9 Missing ⚠️
eth/backend.go 0.00% 9 Missing ⚠️
core/types/transaction.go 0.00% 5 Missing ⚠️
... and 5 more

❌ Your patch check has failed because the patch coverage (31.80%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2172      +/-   ##
===========================================
- Coverage    52.29%   52.23%   -0.07%     
===========================================
  Files          885      886       +1     
  Lines       155727   155726       -1     
===========================================
- Hits         81441    81339     -102     
- Misses       69055    69172     +117     
+ Partials      5231     5215      -16     
Files with missing lines Coverage Δ
core/parallel_state_processor.go 22.15% <100.00%> (+0.24%) ⬆️
core/types/bor_receipt.go 0.00% <ø> (ø)
eth/ethconfig/config.go 78.94% <100.00%> (ø)
eth/filters/IIterator.go 0.00% <ø> (ø)
eth/tracers/dir.go 76.47% <ø> (+23.52%) ⬆️
eth/tracers/state_sync_tracing_hooks.go 100.00% <100.00%> (ø)
internal/cli/server/config.go 63.93% <100.00%> (+0.05%) ⬆️
internal/cli/server/flags.go 100.00% <100.00%> (ø)
params/protocol_params.go 84.21% <ø> (ø)
tests/bor/helper.go 74.66% <100.00%> (+0.14%) ⬆️
... and 15 more

... and 19 files with indirect coverage changes

Files with missing lines Coverage Δ
core/parallel_state_processor.go 22.15% <100.00%> (+0.24%) ⬆️
core/types/bor_receipt.go 0.00% <ø> (ø)
eth/ethconfig/config.go 78.94% <100.00%> (ø)
eth/filters/IIterator.go 0.00% <ø> (ø)
eth/tracers/dir.go 76.47% <ø> (+23.52%) ⬆️
eth/tracers/state_sync_tracing_hooks.go 100.00% <100.00%> (ø)
internal/cli/server/config.go 63.93% <100.00%> (+0.05%) ⬆️
internal/cli/server/flags.go 100.00% <100.00%> (ø)
params/protocol_params.go 84.21% <ø> (ø)
tests/bor/helper.go 74.66% <100.00%> (+0.14%) ⬆️
... and 15 more

... and 19 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment thread core/state_processor.go
Comment thread eth/tracers/api.go Outdated
Comment thread eth/tracers/api.go
Copilot AI review requested due to automatic review settings April 2, 2026 19:54
@manav2401
Copy link
Copy Markdown
Member Author

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 7, 2026

Bug [eth/tracers/api.go:904] — Same missing nil guard on .Bor as above. api.backend.ChainConfig().Bor.StateReceiverContract is accessed without checking if Bor is nil in standardTraceBlockToFile, which is also RPC-exposed. Fix: same pattern — 'if borConfig := api.backend.ChainConfig().Bor; borConfig != nil { stateReceiverAddress = borConfig.StateReceiverContract }'.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 7, 2026

Bug [core/state_processor.go:159] — Unbalanced OnTxStart/OnTxEnd tracer lifecycle. OnTxStart fires unconditionally (line 157-159) when the last tx is StateSyncTxType, but OnTxEnd is only called when appliedNewStateSyncReceipt is true (line 176-180). If Finalize does not produce a receipt (stateSyncData is empty), the tx/receipt count mismatch at line 171 causes an error return without ever calling OnTxEnd. This leaves tracers like callTracer in an inconsistent state. Same issue exists in parallel_state_processor.go lines 425-453. Fix: move OnTxStart after Finalize succeeds, or call OnTxEnd before the error return at line 172.

Comment thread eth/tracers/state_sync_txn_tracer.go Outdated
Comment thread eth/tracers/api.go
@github-actions
Copy link
Copy Markdown

This PR is stale because it has been open 21 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions Bot added the Stale label Apr 29, 2026
@manav2401 manav2401 marked this pull request as draft May 8, 2026 13:09
@manav2401
Copy link
Copy Markdown
Member Author

@claude review this PR

@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @manav2401's task in 4m 8s —— View job


Code Review: State-Sync Tracing Support

  • Gather context and read changed files
  • Review core tracing changes (eth/tracers/api.go)
  • Review state-sync processor (consensus/bor/statefull/processor.go)
  • Review state processor hooks (core/state_processor.go, core/parallel_state_processor.go)
  • Review state-sync tracing hooks (eth/tracers/state_sync_tracing_hooks.go)
  • Review remaining changes
  • Post final review

I reviewed the full PR (commit 131fd7f on top of 02f998a merge). This is a significant refactoring that consolidates state-sync tracing into standard codepaths, replaces the old stateSyncTxnTracer with a cleaner stateSyncHooks wrapper, and adds solid e2e tests using real StateReceiver bytecode. The net deletion of ~450 lines is a positive sign. Many issues from prior reviews have been addressed.

Below are the remaining findings.


Issues

1. [HIGH] statedb.Logs() map iteration makes log extraction non-deterministic

File: eth/tracers/api.go:1250-1254

allLogs := tracingStateDB.Logs()
sort.SliceStable(allLogs, func(i, j int) bool {
    return allLogs[i].Index < allLogs[j].Index
})
stateSyncLogs := allLogs[txctx.LogIndex:]

statedb.Logs() iterates over s.logs which is a map[common.Hash][]*types.Log (core/state/statedb.go:132). Go map iteration is non-deterministic — the order of log groups from different transactions is random on each call. The sort.SliceStable sorts by log.Index, but Index values restart from 0 for each transaction. Two logs from different transactions can have the same Index value (e.g., both are index 0 within their respective txs). SliceStable preserves relative order of equal elements, but when the input order is random (from map iteration), the output for equal-index elements is also random.

The subsequent allLogs[txctx.LogIndex:] slice relies on state-sync logs being at the end, which is not guaranteed.

Suggested fix: Use tracingStateDB.GetLogs(tx.Hash(), blockNumber, blockHash) which does a direct map lookup by tx hash and returns only that transaction's logs deterministically, eliminating the positional indexing assumption.

Fix this →


2. [HIGH] ApplyBorMessage always returns nil error — ApplyStateSyncEvents error check is dead code

File: consensus/bor/statefull/processor.go:214-217

result, err := ApplyBorMessage(vmenv, msg)
if err != nil {  // DEAD CODE — ApplyBorMessage always returns nil error
    return nil, fmt.Errorf("state-sync event %d execution failed: %w", event.ID, err)
}

ApplyBorMessage (lines 139-162) stores EVM errors (reverts, OOG) in result.Err but always returns nil as the Go error. The error guard at line 215 is unreachable. If a bridge event's commitState call reverts during tracing, the failure is silently swallowed — totalGasUsed still accumulates gas from the failed call, and the function returns Err: nil.

I understand the PR author's response that in production, state-sync events can silently fail and the network continues. However, for tracing accuracy, consider at minimum logging EVM failures or surfacing them in the trace output so users can identify which events failed during replay.


3. [MEDIUM] traceChain doesn't handle state-sync txs correctly

File: eth/tracers/api.go:290-315

Unlike traceBlockParallel (which correctly skips state-sync txs in the parallel phase and traces them sequentially at lines 808-825), the traceChain worker loop at line 297 iterates over task.block.Transactions() and processes all txs — including state-sync — in parallel worker threads using a copied statedb. The state-sync tx gets traced with a potentially stale state copy that may lack the effects of all prior transactions in the block. This is because the state copy happens at line 431 before the worker starts replaying transactions sequentially, and state-sync (always last) depends on all prior state changes being applied.

Note: The test TestTraceChain_WithStateSyncTx passes because the normal tx and state-sync tx in the test don't have conflicting state dependencies, but this is fragile.


4. [MEDIUM] Missing Bor != nil guard in IntermediateRoots and standardTraceBlockToFile

Files: eth/tracers/api.go:605, eth/tracers/api.go:904

if tx.Type() == types.StateSyncTxType {
    stateReceiverAddress := api.backend.ChainConfig().Bor.StateReceiverContract  // panics if Bor == nil

Two callsites access api.backend.ChainConfig().Bor.StateReceiverContract without checking Bor != nil. While traceTx at line 1201 correctly guards with api.backend.ChainConfig().Bor != nil, these paths don't. The author explained that finding a StateSyncTxType implies Bor is running, which is true for production canonical blocks. However, IntermediateRoots explicitly handles bad blocks (line 618 comment: "This should never happen while tracing canon blocks, only when tracing bad blocks"), where invariants may not hold.

Suggested fix: Add defensive Bor != nil check to match the existing pattern in traceTx:

if tx.Type() == types.StateSyncTxType && api.backend.ChainConfig().Bor != nil {

5. [MEDIUM] Dead fields Path and IOFlag in TraceConfig

File: eth/tracers/api.go:168-169

TraceConfig.Path and TraceConfig.IOFlag remain declared in the struct, but all IO tracing code that read them has been removed in this PR. Clients passing {"IOFlag": true, "Path": "/some/dir"} will have those fields silently deserialized and ignored. These should be removed or deprecated.


6. [LOW] ABI parse errors silently discarded

File: consensus/bor/abi/common.go:16-17

var (
    vABI, _ = abi.JSON(strings.NewReader(validatorsetABI))
    sABI, _ = abi.JSON(strings.NewReader(stateReceiverABI))
)

Parse errors are discarded. If the ABI strings are ever malformed (e.g., truncated during a refactor), the binary starts normally but all Pack/Unpack calls fail at runtime with opaque errors. Consider a mustParseABI helper that panics on init, following go-ethereum conventions. This was flagged in a prior review but has low practical risk since the ABI strings are compile-time constants.


7. [LOW] stateAtTransaction doesn't replay state-sync txs during index walk

File: eth/state_accessor.go:274-289

When TraceTransaction is called for a state-sync tx (always the last one in a block), stateAtTransaction replays all prior transactions using core.ApplyMessage. For intermediate StateSyncTxType transactions (which theoretically shouldn't exist — state-sync is always last), the replay would call ApplyMessage which may produce incorrect results. This is a non-issue in practice but worth noting for defensive completeness.


8. [LOW] Pre-existing resource leak in stateAtTransaction replay loop

File: eth/state_accessor.go:284-285

When core.ApplyMessage fails during the replay loop, the function returns an error without calling release(), leaking the trie reference acquired from stateAtBlock at line 255. This is a pre-existing issue (not introduced by this PR) but worth fixing since the PR already modified this function.


Positive Observations

  1. New stateSyncHooks wrapper is well-designed: The WrapStateSyncHooks approach (state_sync_tracing_hooks.go) is cleaner than the previous stateSyncTxnTracer. The depth-shifting strategy is sound — synthetic root at depth 0, all real EVM calls shifted to depth+1. The active boolean gated by OnTxStart/OnTxEnd is a clean way to selectively engage wrapping only for state-sync txs.

  2. Proper OnTxStart/OnTxEnd lifecycle in state_processor.go: The defer pattern for OnTxEnd at line 170-174 is a significant improvement over earlier iterations. It ensures OnTxEnd fires on all exit paths (including error returns from Finalize), addressing the unbalanced hook concern from prior reviews. The stateSyncEndErr variable correctly propagates the error to the deferred OnTxEnd call.

  3. Strong e2e test coverage: The tests use real StateReceiver contract bytecode, validate callTracer output structure (synthetic root frame, sub-calls, LOG0 opcode emissions), and test multiple tracers (callTracer, structLogger, IntermediateRoots). The validateStateSyncCallTrace helper is thorough.

  4. Clean ABI extraction: Moving ABI definitions to consensus/bor/abi/ is a good separation that avoids circular dependencies between statefull and contract packages.

  5. Live tracing integration: Propagating vmConfig through Bor.SetVMConfigCommitState/CommitSpanstatefull.ApplyMessage enables live tracing for system transactions without major architectural changes. The early wrapping in eth/backend.go:316-322 ensures the hooks are already state-sync-aware before they reach the consensus engine.

  6. Removal of api_bor.go: Consolidating the separate TraceBorBlock into the standard tracing codepath is the right approach. It reduces code duplication and ensures consistency.

  7. Deterministic time fix: Using vmenv.Context.Time instead of time.Now() for syncTime in ApplyStateSyncEvents (line 184) makes traces reproducible.


Comment thread core/state_processor.go
Comment thread eth/tracers/state_sync_tracing_hooks.go
Comment thread eth/tracers/api.go
@manav2401
Copy link
Copy Markdown
Member Author

@claude regarding your review findings.

  1. The current logic is correct. Log index doesn't overlap in any transaction of a single block. It keeps increasing with each transaction's log. E.g. if a block has 3 txs each having 2 logs - the log index would be 0 to 5. So I need to sort them as you said because it's a map but the last ones will always belong to state sync and there will be no overlap.
  2. ApplyStateSyncEvents should return nil irrespective of whether the actual system call pass or not. I'll probably add some logging to debug issues in tracer but ideally the revert is tracked in the trace and the call can still pass with a revert. I have added a warn log printing the error.
  3. The finding is incorrect - Parallelism in traceChain is across blocks, not within transactions of each block. This means there's no need of handling state db differently. The state sync will run at end of block and traceTx anyways handle it.
  4. Done, added nil guards.
  5. Done, removed unused variables.
  6. Done, added panic incase of errors loading ABI
  7. It's not needed - The stateAtTransaction function returns state which is before the tx at txIndex executes. This means if someone passes the txIndex of last tx, irrespective of that tx, it will never execute but will be returned before execution. Hence, we don't need special handling of state-sync.
  8. Done, fixed.

Review the PR again.

@claude
Copy link
Copy Markdown

claude Bot commented May 14, 2026

Claude finished @manav2401's task in 4m 58s —— View job


Code Review: State-Sync Tracing Support (Re-review)

  • Gather context and verify fixes from author's responses
  • Review core tracing changes (eth/tracers/api.go)
  • Review state-sync processor (consensus/bor/statefull/processor.go)
  • Review state processor hooks (core/state_processor.go, core/parallel_state_processor.go)
  • Review state-sync tracing hooks (eth/tracers/state_sync_tracing_hooks.go)
  • Review ABI package (consensus/bor/abi/common.go)
  • Review remaining changes and tests
  • Post final review

I reviewed the full PR at commit 8ca9043 (HEAD of state-sync-tracing). All fixes from the author's responses have been verified and the major issues from prior reviews are addressed. The code is in good shape.

Acknowledgment of Author's Responses

  1. Log index sorting — Author is correct. statedb.logSize monotonically increases across all transactions in a block (log.Index = s.logSize; s.logSize++ in statedb.go:558-560). Log indices are globally unique within a block, so sort.SliceStable by Index produces a deterministic ordering regardless of map iteration order. Finding withdrawn.

  2. ApplyStateSyncEvents error handling — Reasonable. The warn log (processor.go:219) is the right approach — production semantics allow silent reverts, and the trace itself will show the revert via OnExit(reverted=true).

  3. traceChain parallelism — Author is correct. I re-read the code: parallelism in traceChain is at the block level (each worker gets a full blockTraceTask), and within each worker, transactions are processed sequentially (for i, tx := range task.block.Transactions()). traceTx handles state-sync internally. Finding withdrawn.

  4. Nil guards — Verified fixed at api.go:601 and api.go:901,938.

  5. Dead fields removed — Verified: Path and IOFlag are gone from TraceConfig.

  6. ABI panic — Verified: mustParseABI helper added at abi/common.go:33-38.

  7. stateAtTransaction — Author is correct. The function returns state before the tx at txIndex executes, so it never needs to execute the state-sync tx itself.

  8. Resource leak fix — Verified: release() called before error return at state_accessor.go:285.

Remaining Findings

1. [MEDIUM] Live tracing: statedb.SetTxContext not called for state-sync tx

File: core/state_processor.go:166-168

In the live-tracing path, OnTxStart fires for the state-sync tx at line 168, and then Finalize processes it. However, statedb.SetTxContext(stateSyncTx.Hash(), len(txs)-1) is never called. The regular tx loop (line 126) calls statedb.SetTxContext for each tx, but continues past state-sync txs at line 117-118.

When Bor.FinalizeCommitStatesApplyMessagevmenv.Call emits logs, statedb.AddLog() stamps them with s.thash/s.txIndex from the last regular tx. The live tracer's OnLog callback sees these wrong values. The post-hoc fix at bor.go:1274 corrects TxIndex on the receipt logs but (a) TxHash is never corrected on the log objects, and (b) OnLog already fired with the wrong metadata.

The replay path (api.go:1230) correctly calls statedb.SetTxContext(txctx.TxHash, txctx.TxIndex).

Suggested fix: Add statedb.SetTxContext(txs[len(txs)-1].Hash(), len(txs)-1) inside the if hooks != nil && hasStateSyncTx block, before OnTxStart. This is a one-liner and aligns the live path with the replay path.

Fix this →


2. [LOW] BorSystemAddress duplicates types.SystemAddress

File: params/protocol_params.go:224

BorSystemAddress = common.HexToAddress("0xffffFFFfFFffffffffffffffFfFFFfffFFFfFFfE")

This is the same value as core/types/bor_receipt.go's SystemAddress. Having two sources of truth for the same address creates drift risk. Consider using one or the other consistently.


Positive Observations

  1. WrapStateSyncHooks uses struct copy — The wrapped := *inner pattern at state_sync_tracing_hooks.go:45 correctly preserves all inner hooks (block lifecycle, system call, close, etc.) while overriding only depth-bearing hooks. The test TestWrapStateSyncHooks_PassthroughForwardsAllInnerHooks validates this end-to-end. This addresses the earlier finding about dropped hooks.

  2. Proper OnTxStart/OnTxEnd lifecycle — The defer pattern at state_processor.go:170-174 ensures OnTxEnd fires on all exit paths including error returns from Finalize. The stateSyncEndErr variable correctly propagates errors to the deferred call.

  3. mustParseABI pattern — Clean init-time validation that follows go-ethereum conventions. ABI parse failures now surface immediately at startup.

  4. Strong test coverage — The wrapper unit tests (state_sync_tracing_hooks_test.go) cover multi-event, regular tx passthrough, error paths, reset between txs, nested calls, state-change hooks, nil inner hooks, and all passthrough hooks. The e2e tests use real StateReceiver bytecode and validate actual callTracer output structure. The live tracer test (TestStateSyncTracing_LiveTracerDoesNotPanic) validates the full production wiring.

  5. Clean refactoring — Removing api_bor.go, consolidating tracing into standard paths, and extracting consensus/bor/abi/ are all good improvements that reduce code duplication.

  6. Deterministic time — Using vmenv.Context.Time instead of time.Now() at processor.go:184 makes traces reproducible.


Overall this PR is in good shape. The one remaining actionable issue is the missing statedb.SetTxContext in the live-tracing path (finding #1), which is a straightforward one-line fix.


@manav2401
Copy link
Copy Markdown
Member Author

@claude regarding your findings.

  1. This will be fixed in a separate PR later - confirmed that it doesn't affect consensus.
  2. It's already deleted - please use the latest commit for review.

@claude
Copy link
Copy Markdown

claude Bot commented May 14, 2026

Claude finished @manav2401's task in 48s —— View job


Acknowledging Review Responses


Thanks for the responses, @manav2401.

Finding #1 (live tracing statedb.SetTxContext not called for state-sync tx): Acknowledged — will be addressed in a separate PR. Confirmed it doesn't affect consensus since it only impacts live tracer metadata accuracy.

Finding #2 (BorSystemAddress duplicates types.SystemAddress): Verified on commit 8ca9043 — the old types.SystemAddress in core/types/bor_receipt.go was removed by this PR and consolidated into params.BorSystemAddress. The duplication is resolved. Finding withdrawn.

No further action needed on this PR from my side.


View job

@sonarqubecloud
Copy link
Copy Markdown

@manav2401 manav2401 marked this pull request as ready for review May 14, 2026 14:47
Copilot AI review requested due to automatic review settings May 14, 2026 14:47
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 40 changed files in this pull request and generated 2 comments.

Files not reviewed (6)
  • consensus/bor/genesis_contract_mock.go: Language not supported
  • consensus/bor/span_mock.go: Language not supported
  • eth/filters/IBackend.go: Language not supported
  • eth/filters/IBatch.go: Language not supported
  • eth/filters/IDatabase.go: Language not supported
  • eth/filters/IIterator.go: Language not supported

Comment thread core/state_processor.go
Comment thread eth/tracers/api.go
@github-actions github-actions Bot removed the Stale label May 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants