Skip to content

MF3-L15: fix(nitronode): fix reorg double-spend via confirmation gate#832

Open
nksazonov wants to merge 23 commits into
fix/audit-findings-finalx3from
fix/reorgs
Open

MF3-L15: fix(nitronode): fix reorg double-spend via confirmation gate#832
nksazonov wants to merge 23 commits into
fix/audit-findings-finalx3from
fix/reorgs

Conversation

@nksazonov

@nksazonov nksazonov commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Implements the reorg double-spend fix specified in nitronode/reorg-fix-spec.md (the canonical implementation spec for this PR — start there for design rationale, the residual-finality trade-off, and the per-component contracts).

The Nitronode previously credited a user's off-chain balance the instant it observed a deposit event on-chain. A subsequent reorg would remove the on-chain deposit while the off-chain credit persisted, allowing an attacker to drain that credit to a receiver before the node could detect the reorg. This PR introduces a per-chain confirmation window, a startup reconciliation walk over stored block hashes, and a reactor-level idempotency pre-check that together close the window and make replay safe.

Spec at a glance

Section Component
§2 Solution overview, §2.1 Residual risk Why a delay, what it does and doesn't guarantee
§3 Configuration New confirmation_delay_secs per chain in blockchains.yaml
§4.1–4.3 Window behavior Queue, cancel on Removed:true, replace-on-re-add
§4.4 Startup reconciliation block_hash column + common-ancestor walk via eth_getBlockByHash
§6.1–6.3 Component placement, queue keying, two-goroutine design ConfirmationGate design
§6.4 API exposure confirmation_delay_secs on node.v1.GetConfig
§6.5 Post-gate reorg detection recentlyForwarded map, WARN signal
§6.6 Reactor pre-check IsContractEventProcessed skips already-committed events
§6.7 Block timestamp cache arrivedAt from block time so historical replay matures immediately

Changes

On-chain event pipeline (pkg/blockchain/evm/)

  • ConfirmationGate (new): in-memory buffer between the Listener and reactor. Holds events for confirmation_delay_secs keyed by block timestamp; cancels on Removed:true arrivals; logs WARN on post-gate reorgs.
  • Listener: now accepts two handlers — eventHandler (live, through gate) and historicalEventHandler (Phase 1 replay, direct to reactor). Historical events bypass the gate per spec §4.4 step 5, avoiding spurious HeaderByHash calls during startup. Removed:true logs are forwarded to the live handler (the gate) instead of being silently dropped, so the gate can cancel pending entries.
  • ChannelHubReactor: pre-check via new IsContractEventProcessed(txHash, logIndex, blockchainID) (§6.6) — converts what was a constraint-violation rollback on the reconciliation replay path into a clean INFO exit.
  • findCommonAncestor (new in reconciler.go): startup walk over stored block_hash values, calling eth_getBlockByHash until it finds a still-canonical block; treats pre-migration empty hashes as canonical fallback.

Storage (nitronode/store/database/, pkg/core/)

  • New migration 20260608000000_add_block_hash_to_contract_events.sql — adds CHAR(66) NOT NULL DEFAULT '' column.
  • core.BlockchainEvent gains BlockHash; reactors populate it on every store.
  • New store methods: IsContractEventProcessed, GetLatestContractEventBlockHashAndNumber, GetPreviousDistinctBlockHash.

Config & API surface

  • core.Blockchain.ConfirmationDelaySecs (uint32) flows from blockchains.yaml → memory store → node.v1.GetConfigBlockchainInfoV1.
  • Go SDK (sdk/go) and TS SDK (sdk/ts) updated to expose the field; SDK tests updated.

Wiring (nitronode/main.go)

  • ChannelHub path: Listener → ConfirmationGate → reactor for live events; Listener → reactor directly for Phase 1.
  • LockingContract path: unchanged in behavior (no gate); same handler passed for both phases for compile parity.

Tests

  • confirmation_gate_test.go — 13 tests (T1–T13) covering delay=0 transparent path, normal queue/forward, pre-gate cancel, out-of-order re-delivery, post-gate reorg WARN, unknown removal, block-timestamp bypass, partial delay, fetcher error fallback, cache dedup, shutdown, eviction, ordering.
  • reconciler_test.go — 7 tests covering canonical, single reorg, walk-to-genesis, pre-migration latest row, pre-migration mid-walk, RPC error, no-stored-events.
  • channel_hub_reactor_test.goTestChannelHubReactor_HandleEvent_AlreadyProcessed and TestChannelHubReactor_HandleEvent_PreCheckError cover the §6.6 pre-check.
  • listener_test.goTestListener_RemovedLog_ForwardedToHandler verifies Removed-log routing; TestListener_PhaseHandlerRouting verifies Phase 1 / Phase 2 handler split.

Notes for reviewers

  • The spec text and code were reconciled in commit 44efbe45 — anywhere they currently appear to diverge, the spec is the intended source of truth and the code should match it (or both should be updated together).
  • §6.6 documents a known limitation: the idempotency pre-check uses block-level log_index which is not stable across reorgs that change the tx's position within a block. Business-logic idempotency in the existing handlers covers this case; tx-relative log_index is filed as a follow-up task in the spec.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added configurable event confirmation delay per blockchain to control event processing timing
    • Enhanced blockchain reorganization protection through block hash tracking in stored events
  • Documentation

    • Added detailed reorg-fix specification document
  • Tests

    • Added comprehensive test coverage for event confirmation and blockchain reorganization scenarios

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a7597e19-9e25-4270-8e51-928dcd059edf

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR implements a comprehensive reorg defense mechanism for EVM event processing. It introduces a confirmation gate that delays event delivery by a configurable per-chain duration while simultaneously implementing reconciliation logic to detect and recover from chain reorganizations. The system persists block hashes, walks historical blocks to find canonical ancestors, and provides duplicate detection to prevent re-processing events during recovery.

Changes

Reorg Defense with Confirmation Gating

Layer / File(s) Summary
Database schema and persistence
nitronode/config/migrations/postgres/20260608000000_add_block_hash_to_contract_events.sql, nitronode/store/database/contract_event.go, nitronode/store/database/interface.go
Migration adds block_hash CHAR(66) column to contract_events; ContractEvent model persists BlockHash; DatabaseStore gains IsContractEventProcessed (event identity check), GetLatestContractEventBlockHashAndNumber (latest block), and GetPreviousDistinctBlockHash (predecessor block) query methods.
Reconciliation and canonicality checking
pkg/blockchain/evm/interface.go, pkg/blockchain/evm/reconciler.go, pkg/blockchain/evm/reconciler_test.go
EVMClient extended with HeaderByHash for canonical verification; ContractEventGetter gains GetLatestContractEventBlockHashAndNumber and GetPreviousDistinctBlockHash; findCommonAncestor implemented to walk stored block hashes backward via RPC, validating each against chain headers, skipping pre-migration rows, and returning safe replay start point; seven test cases cover no-stored-events, canonical latest, single/deep reorg, pre-migration edge cases, and RPC errors.
Core data model updates
pkg/core/event.go, pkg/core/types.go, pkg/rpc/types.go, nitronode/store/memory/blockchain_config.go, nitronode/store/memory/memory_store.go, sdk/go/utils.go, sdk/go/utils_test.go, sdk/ts/src/core/types.ts, sdk/ts/src/rpc/types.ts, sdk/ts/src/utils.ts
BlockchainEvent adds BlockHash field; Blockchain and BlockchainInfoV1 add ConfirmationDelaySecs (uint32, JSON-keyed); BlockchainConfig loads ConfirmationDelaySecs from YAML; NewMemoryStoreV1 wires it into core.Blockchain; RPC, Go SDK, and TypeScript SDK transformers propagate the field with 0 fallback for immediate processing.
Confirmation gate with delayed event forwarding
pkg/blockchain/evm/confirmation_gate.go, pkg/blockchain/evm/confirmation_gate_test.go
ConfirmationGate queues non-removed logs with per-block timestamps (using cached fetcher, time.Now() fallback on error); forwards matured entries when arrival + delay ≤ now; cancels queued entries on pre-gate Removed:true; records forwarded entries in TTL map for post-gate reorg detection; evicts maps when age exceeds recentMultiplier × delay; transparent mode when delay=0 (immediate forward/drop); background poller polls at configurable interval; 13 tests verify immediate/delayed forwarding, cancellation, out-of-order replacement, timestamp bypass/partial delay/caching, shutdown, post-forward removal, and ordering.
EVM listener refactoring for historical/live separation
pkg/blockchain/evm/listener.go, pkg/blockchain/evm/listener_test.go
Listener refactored to accept separate historicalEventHandler and live handler; client type changed from bind.ContractBackend to EVMClient; startup replaces GetLatestContractEventBlockNumber with findCommonAncestor call; removed logs now forwarded to handler without advancing lastBlock or invoking IsContractEventPresent; historical phase uses dedicated handler; bind import removed; tests updated to pass phase-specific handlers and mock GetLatestContractEventBlockHashAndNumber/HeaderByHash; two new tests validate phase routing and removed-log behavior.
Reactor duplicate detection and event storage
pkg/blockchain/evm/channel_hub_reactor.go, pkg/blockchain/evm/channel_hub_reactor_test.go
ChannelHubReactorStore gains IsContractEventProcessed method; ChannelHubReactor constructor accepts non-transactional store dependency; HandleEvent pre-checks event identity and returns early on duplicate with success callback (logs warning on check error but continues); StoreContractEvent now includes BlockHash; test helper newReactor sets default mock for pre-check; two new tests verify pre-check error resilience and successful duplicate skipping.
Test infrastructure updates
pkg/blockchain/evm/mock_test.go
MockEVMClient adds HeaderByHash mock; MockContractEventGetter adds GetLatestContractEventBlockHashAndNumber and GetPreviousDistinctBlockHash mocks, both delegating to testify mock framework.
Integration and event flow wiring
nitronode/main.go
Package-level blockTimestampFetchTimeout constant (10s) added; per-blockchain initialization creates blockTimestampFetcher (context-timeout header lookup), instantiates ConfirmationGate with per-chain delay, starts gate background poller; ChannelHubReactor wired with DbStore; listener uses historicalEventHandler for phase 1 and ConfirmationGate.HandleEvent for live events, comments clarify historical-bypass behavior.
Specification and documentation
nitronode/reorg-fix-spec.md
Comprehensive 374-line specification detailing reorg risk model, confirmation-window approach with residual-risk guidance, per-chain configuration (confirmation_delay_sec), event queueing/removal/out-of-order semantics, startup reconciliation via block_hash and common-ancestor walk, implementation wiring between listener/gate/reactor, two-goroutine design (pusher/poller), reorg detection via recentlyForwarded TTL map, reactor pre-checks, block timestamp caching strategy, and memory-bounded eviction.
Formatting and minor updates
nitronode/api/node_v1/utils.go
Minor indentation reformatting in mapBlockchainV1 struct literal (no logic change).

Sequence Diagram

sequenceDiagram
  participant Listener
  participant ConfirmationGate
  participant Poller
  participant ChannelHubReactor
  participant Store

  Note over Listener,Store: Event Arrival
  Listener->>ConfirmationGate: HandleEvent (live log)
  ConfirmationGate->>ConfirmationGate: Fetch block timestamp (cached)
  ConfirmationGate->>ConfirmationGate: Queue entry: (txHash, logIndex, blockHash, arrivalTime)
  Note over ConfirmationGate: Waits for confirmation delay

  Note over Listener,Store: Removed Log (pre-gate reorg)
  Listener->>ConfirmationGate: HandleEvent (removed=true)
  ConfirmationGate->>ConfirmationGate: Find & remove queued entry by (txHash, logIndex, blockHash)
  Note over ConfirmationGate: Entry canceled, not forwarded

  Note over Listener,Store: Confirmation Delay Elapsed
  Poller->>ConfirmationGate: Poll (background goroutine)
  ConfirmationGate->>ConfirmationGate: Check if arrivalTime + delay ≤ now
  ConfirmationGate->>ChannelHubReactor: Forward to handler
  ChannelHubReactor->>Store: IsContractEventProcessed check
  Store-->>ChannelHubReactor: false (not duplicate)
  ChannelHubReactor->>Store: StoreContractEvent with BlockHash
  ConfirmationGate->>ConfirmationGate: Record in recentlyForwarded TTL map

  Note over Listener,Store: Removed Log (post-gate reorg)
  Listener->>ConfirmationGate: HandleEvent (removed=true, after forward)
  ConfirmationGate->>ConfirmationGate: Check recentlyForwarded map
  ConfirmationGate->>ConfirmationGate: Log post-gate reorg warning
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • layer-3/nitrolite#569: Extensive updates to EVM listener and test scaffolding directly supporting the new listener interface and event-handling changes.
  • layer-3/nitrolite#658: Modifies EVM listener and contract-event getter to change historical/live processing and reorg handling semantics.

Suggested reviewers

  • philanton
  • dimast-x

Poem

🐰 A gate stands tall where logs do queue,
Each block's timestamp a passing clue,
When reorgs strike and chains split wide,
The ancestor walk brings truth inside,
No more lost events, the ledger's true! 🌳

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 59.18% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(nitronode): fix reorg double-spend via confirmation gate' accurately summarizes the main change: implementing a confirmation gate mechanism to prevent reorg-induced double-spend attacks. It clearly conveys the primary objective without being vague or overly broad.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/reorgs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@nksazonov nksazonov changed the title fix(nitronode): close reorg double-spend window via confirmation gate MF3-L15: fix(nitronode): fix reorg double-spend via confirmation gate Jun 8, 2026

@ihsraham ihsraham left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting the confirmation-gate path together. The uninterrupted live path is moving in the right direction: Removed logs now reach the gate, pending logs can be cancelled before processing, and the tests cover the basic pre-gate cancellation cases.

I would not close L-15 on this version yet. A few paths still either bypass the confirmation window or weaken the old listener retry behavior, so a reorg/restart or transient reactor failure can still leave the node in the wrong state. I left the closure blockers inline.

ChannelHubAddress string `yaml:"channel_hub_address"`
// ChannelHubSigValidators maps validator IDs to the addresses of signature validators for the ChannelHub contract on this blockchain
ChannelHubSigValidators map[uint8]string `yaml:"channel_hub_sig_validators"`
// ConfirmationDelaySecs is the number of seconds to wait before processing an event.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default leaves the mitigation off unless every deployed config is updated. ConfirmationDelaySecs stays at 0 when the YAML key is omitted, and 0 makes the gate transparent. The checked-in prod/sandbox/stress blockchains.yaml files do not set confirmation_delay_secs, so the deployed node would still process live deposits immediately.

For L-15 closure, please make the confirmation delay non-zero in shipped configs or enforce a safe default/validation for enabled EVM chains.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I will set those for sure before the merge

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rechecked current HEAD (3512884). This still looks open in the diff: the checked-in prod-v1, sandbox-v1, and stress-v1 blockchains.yaml files still omit confirmation_delay_secs, and omission still decodes as 0, so main.go takes the no-gate path.

A pre-merge config change is fine operationally, but for the audit closure I would want this represented in the PR itself or enforced through validation/defaults. Also, nitronode/docs/reorg-fix.md currently documents confirmation_delay_sec singular, while the YAML tag/API field are confirmation_delay_secs plural. Following the doc as written would also leave the field unset.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open on current head (1bc0d565). The field-name/docs mismatch is fixed now, but the shipped chart configs still do not set confirmation_delay_secs.

Since the loader leaves the omitted field as 0 and main.go skips NewConfirmationGate when the delay is zero, the prod/sandbox/stress deploy path still processes live deposits immediately. The deploy workflow also passes those exact files via --set-file=config.blockchains=.../blockchains.yaml, so I would keep this open until the PR either sets non-zero values in those configs or enforces a safe default/validation for ChannelHub-enabled chains.

Comment thread nitronode/main.go Outdated
Comment thread pkg/blockchain/evm/reconciler.go Outdated
// when the handler or the event-presence check fails.
func (l *Listener) listenEvents(ctx context.Context) error {
lastBlock, err := l.eventGetter.GetLatestContractEventBlockNumber(l.contractAddress.String(), l.blockchainID)
lastBlock, err := findCommonAncestor(ctx, l.client, l.eventGetter, l.contractAddress.String(), l.blockchainID, l.logger)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new return value is later treated the same as an empty store, but findCommonAncestor also uses 0 when every stored block was reorged out. In that full-depth reorg case the listener skips historical replay entirely, leaving stale event-derived DB state in place instead of rebuilding from a canonical range.

Please distinguish the empty-store case from the all-stored-blocks-reorged case, or replay from the configured start/genesis boundary when reconciliation returns this because no stored block is still canonical.

@nksazonov nksazonov Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged in 2ead611: added a code comment at the prevHash == "" branch and a §4.4 note explaining the empty-store / full-depth-reorg conflation is intentional; full-depth reorg to genesis is treated as a chain-level incident outside the gate's scope

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reading this after the latest commits, I do not think the code comment quite matches the operational case. This branch is reached whenever every stored event block is no longer canonical. That is not necessarily a chain-level reorg to genesis: a fresh node, or a chain with only one processed ChannelHub event, can hit it on a shallow reorg of that one event block.

In that case findCommonAncestor returns 0, and listenEvents treats lastBlock == 0 as “skip historical logs fetching”, so canonical replacement logs that already exist between the start boundary and the subscription tip are missed. For L-15 closure I would still prefer distinguishing empty store from “no stored event block is canonical” and replaying from a configured start/genesis boundary for the latter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4709879: findCommonAncestor now splits the all-walk-exhausted branch — empty store (latestNum == 0) still skips replay; non-empty store with all blocks reorged (prevNum == 0 && prevHash == "" && latestNum > 0) returns the original latestNum as the resume anchor, so the listener re-fetches canonical replacements via eth_getLogs from the orphaned height (the orphan hash is discarded). Empty-store and all-reorged are no longer conflated.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The split helps, but I think there is still one replay boundary left here. In the all-stored-blocks-reorged branch we now return latestNum; if the canonical replacement block is already the current tip at that same height, listenEvents calls reconcileBlockRange(currentBlock, lastBlock) with equal values and reconcileBlockRange never enters for currentBlock > startBlock.

That means the replacement log at the resume height is not fetched. Could we make the replay range include the resume height for this case, or return a start height that guarantees the orphaned height is queried?

Comment thread pkg/blockchain/evm/confirmation_gate.go Outdated
@nksazonov

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
pkg/blockchain/evm/mock_test.go (1)

123-129: ⚡ Quick win

Add doc comments to newly added exported mock methods.

These exported methods should include Go doc comments to comply with repository Go guidelines.

Suggested patch
+// HeaderByHash mocks EVMClient.HeaderByHash.
 func (m *MockEVMClient) HeaderByHash(ctx context.Context, hash common.Hash) (*types.Header, error) {
 	args := m.Called(ctx, hash)
 	if args.Get(0) == nil {
 		return nil, args.Error(1)
 	}
 	return args.Get(0).(*types.Header), args.Error(1)
 }

+// GetLatestContractEventBlockHashAndNumber mocks ContractEventGetter.GetLatestContractEventBlockHashAndNumber.
 func (m *MockContractEventGetter) GetLatestContractEventBlockHashAndNumber(contractAddress string, blockchainID uint64) (uint64, string, error) {
 	args := m.Called(contractAddress, blockchainID)
 	return args.Get(0).(uint64), args.String(1), args.Error(2)
 }

+// GetPreviousDistinctBlockHash mocks ContractEventGetter.GetPreviousDistinctBlockHash.
 func (m *MockContractEventGetter) GetPreviousDistinctBlockHash(contractAddress string, blockchainID uint64, belowBlockNumber uint64) (uint64, string, error) {
 	args := m.Called(contractAddress, blockchainID, belowBlockNumber)
 	return args.Get(0).(uint64), args.String(1), args.Error(2)
 }

As per coding guidelines, “Run gofmt and write doc comments on all exported Go functions.”

Also applies to: 146-154

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/blockchain/evm/mock_test.go` around lines 123 - 129, Add Go doc comments
for the newly exported mock methods so they comply with repository guidelines:
document MockEVMClient.HeaderByHash (and the other exported mock methods
referenced around lines 146-154) with a brief sentence starting with the method
name, describing its purpose and behavior in tests (e.g., that it returns a
mocked *types.Header or an error via the testify mock). Ensure each comment is
placed immediately above the corresponding method and follows Go doc comment
style ("HeaderByHash ...").

Source: Coding guidelines

nitronode/reorg-fix-spec.md (1)

319-323: 💤 Low value

Add language identifier to fenced code block.

The fenced code block lacks a language identifier. While this is pseudocode rather than a specific programming language, adding an identifier (e.g., text or pseudocode) improves rendering and satisfies the markdownlint rule.

✨ Suggested fix
-```
+```text
 tx_log_index = l.Index - min(l.Index for all logs of l.TxHash in this block)

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @nitronode/reorg-fix-spec.md around lines 319 - 323, The fenced code block
containing the pseudocode "tx_log_index = l.Index - min(l.Index for all logs of
l.TxHash in this block)" needs a language identifier to satisfy markdownlint and
improve rendering; update the block so the opening fence includes a language tag
such as "text" or "pseudocode" (e.g., ```text) while leaving the content and
symbols like tx_log_index, l.Index and l.TxHash unchanged.


</details>

<!-- cr-comment:v1:076974948adca715997203cf -->

_Source: Linters/SAST tools_

</blockquote></details>
<details>
<summary>pkg/blockchain/evm/interface.go (1)</summary><blockquote>

`47-48`: _⚡ Quick win_

**Clarify the canonicality verification limitation.**

The comment states that `HeaderByHash` is used "to verify block canonicality," but `HeaderByHash` alone does not prove a block is in the canonical chain—it only proves the block exists in the node's database. Depending on the backend, a reorged block may still be retrievable by hash if the node retains side-chain headers temporarily. The reconciler implementation must compare the retrieved header's hash with the canonical header at that block number (via `HeaderByNumber`) to confirm canonicality.

Consider updating the comment to reflect that `HeaderByHash` is used as part of canonicality verification, not as the sole check.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/blockchain/evm/interface.go` around lines 47 - 48, Update the comment on
HeaderByHash to clarify it is only one part of canonicality checks: state that
HeaderByHash retrieves a header by hash (verifies the block exists in the node
DB) but does not alone prove inclusion in the canonical chain, and that callers
(e.g., the reconciler) must compare the retrieved header against the canonical
header at that block number using HeaderByNumber to confirm canonicality;
reference HeaderByHash and HeaderByNumber by name in the comment.
```

</details>

<!-- cr-comment:v1:3b71ed1b23f7e01b1ace884d -->

</blockquote></details>
<details>
<summary>pkg/blockchain/evm/reconciler_test.go (1)</summary><blockquote>

`17-24`: _🏗️ Heavy lift_

**Consider adding a test case for side-chain header detection.**

Once the reconciler is updated to compare `HeaderByHash` results with `HeaderByNumber` (see the critical issue flagged in `reconciler.go`), add a test case where:
- `HeaderByHash(storedHash)` returns a non-nil header (side-chain block still in database)
- `HeaderByNumber(blockNum)` returns a different canonical hash
- Expected behavior: the reconciler walks backward instead of accepting the side-chain block

This will ensure the canonicality verification logic is fully covered.

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/blockchain/evm/reconciler_test.go` around lines 17 - 24, Add a unit test
in reconciler_test.go that verifies side-chain header detection by stubbing the
blockchain client so HeaderByHash(storedHash) returns a valid header but
HeaderByNumber(blockNum) returns a different canonical hash; call the
reconciler's reconcile method (Reconciler.Reconcile or the method under test in
reconciler.go) with the stored header and assert that the reconciler does not
accept the side-chain block and instead walks backward/reattempts canonical
verification (i.e., triggers the same backward-walking code path used when
HeaderByNumber disagrees with HeaderByHash). Use the existing test constants
(testContract, testBlockchainID) and new mocks for HeaderByHash and
HeaderByNumber to ensure the canonicality branch is exercised.
```

</details>

<!-- cr-comment:v1:a98befbcb6b90fb3c13f98f5 -->

</blockquote></details>
<details>
<summary>pkg/blockchain/evm/listener_test.go (1)</summary><blockquote>

`350-353`: _⚡ Quick win_

**Use deterministic signaling instead of fixed sleeps in these new tests.**

These `time.Sleep(...)` cancels are timing-sensitive and can become flaky in slower CI workers. Prefer channel/WaitGroup-driven cancellation once expected events are observed.







Also applies to: 401-405

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/blockchain/evm/listener_test.go` around lines 350 - 353, Replace the
timing-sensitive goroutine that calls time.Sleep(...) then cancel() with
deterministic signaling: have the test wait for the expected event(s) using a
channel or sync.WaitGroup (e.g., an observedEventCh or wg) and call cancel()
only after that signal is received; update both occurrences around the goroutine
that currently sleeps and calls cancel() (and the similar block at the 401-405
region) so tests drive cancellation explicitly when the observed condition
completes rather than relying on fixed sleeps.
```

</details>

<!-- cr-comment:v1:0cc26647a8308c263f4d29e7 -->

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @nitronode/reorg-fix-spec.md:

  • Line 242: The statement claiming "Zero RPC calls in the gate" is incorrect
    because the gate uses blockTimestampFetcher which issues eth_getBlockByHash RPC
    calls (one per unique block, with caching); update the text to remove the
    absolute "Zero RPC calls" claim and instead state that the gate does not perform
    chain queries per request but does fetch block timestamps via
    blockTimestampFetcher using eth_getBlockByHash with caching to minimize RPCs.
    Ensure you reference the gate and blockTimestampFetcher behavior and mention the
    caching semantics.

In @nitronode/store/memory/blockchain_config.go:

  • Around line 46-49: verifyBlockchainsConfig currently accepts any uint32 for
    ConfirmationDelaySecs which can pause processing; add bounds checking in
    verifyBlockchainsConfig to enforce a valid range (0..780 seconds) by returning
    an error if blockchainConfig.ConfirmationDelaySecs is greater than the maximum
    meaningful value (use a named constant like MaxConfirmationDelaySecs = 780).
    Ensure you update any error messages to include the field name
    (ConfirmationDelaySecs) and the invalid value so callers can correct configs.

Nitpick comments:
In @nitronode/reorg-fix-spec.md:

  • Around line 319-323: The fenced code block containing the pseudocode
    "tx_log_index = l.Index - min(l.Index for all logs of l.TxHash in this block)"
    needs a language identifier to satisfy markdownlint and improve rendering;
    update the block so the opening fence includes a language tag such as "text" or
    "pseudocode" (e.g., ```text) while leaving the content and symbols like
    tx_log_index, l.Index and l.TxHash unchanged.

In @pkg/blockchain/evm/interface.go:

  • Around line 47-48: Update the comment on HeaderByHash to clarify it is only
    one part of canonicality checks: state that HeaderByHash retrieves a header by
    hash (verifies the block exists in the node DB) but does not alone prove
    inclusion in the canonical chain, and that callers (e.g., the reconciler) must
    compare the retrieved header against the canonical header at that block number
    using HeaderByNumber to confirm canonicality; reference HeaderByHash and
    HeaderByNumber by name in the comment.

In @pkg/blockchain/evm/listener_test.go:

  • Around line 350-353: Replace the timing-sensitive goroutine that calls
    time.Sleep(...) then cancel() with deterministic signaling: have the test wait
    for the expected event(s) using a channel or sync.WaitGroup (e.g., an
    observedEventCh or wg) and call cancel() only after that signal is received;
    update both occurrences around the goroutine that currently sleeps and calls
    cancel() (and the similar block at the 401-405 region) so tests drive
    cancellation explicitly when the observed condition completes rather than
    relying on fixed sleeps.

In @pkg/blockchain/evm/mock_test.go:

  • Around line 123-129: Add Go doc comments for the newly exported mock methods
    so they comply with repository guidelines: document MockEVMClient.HeaderByHash
    (and the other exported mock methods referenced around lines 146-154) with a
    brief sentence starting with the method name, describing its purpose and
    behavior in tests (e.g., that it returns a mocked *types.Header or an error via
    the testify mock). Ensure each comment is placed immediately above the
    corresponding method and follows Go doc comment style ("HeaderByHash ...").

In @pkg/blockchain/evm/reconciler_test.go:

  • Around line 17-24: Add a unit test in reconciler_test.go that verifies
    side-chain header detection by stubbing the blockchain client so
    HeaderByHash(storedHash) returns a valid header but HeaderByNumber(blockNum)
    returns a different canonical hash; call the reconciler's reconcile method
    (Reconciler.Reconcile or the method under test in reconciler.go) with the stored
    header and assert that the reconciler does not accept the side-chain block and
    instead walks backward/reattempts canonical verification (i.e., triggers the
    same backward-walking code path used when HeaderByNumber disagrees with
    HeaderByHash). Use the existing test constants (testContract, testBlockchainID)
    and new mocks for HeaderByHash and HeaderByNumber to ensure the canonicality
    branch is exercised.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `cb3e38c7-cee9-4e41-abb7-c1eb13c2e53f`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 8eea5664c4380dff4bd14c67f25eefad737cae67 and 256463668d96c054e7a8301d7c1a9b4dfd6440b0.

</details>

<details>
<summary>⛔ Files ignored due to path filters (1)</summary>

* `sdk/ts/test/unit/__snapshots__/public-api-drift.test.ts.snap` is excluded by `!**/*.snap`

</details>

<details>
<summary>📒 Files selected for processing (26)</summary>

* `nitronode/api/node_v1/utils.go`
* `nitronode/config/migrations/postgres/20260608000000_add_block_hash_to_contract_events.sql`
* `nitronode/main.go`
* `nitronode/reorg-fix-spec.md`
* `nitronode/store/database/contract_event.go`
* `nitronode/store/database/interface.go`
* `nitronode/store/memory/blockchain_config.go`
* `nitronode/store/memory/memory_store.go`
* `pkg/blockchain/evm/channel_hub_reactor.go`
* `pkg/blockchain/evm/channel_hub_reactor_test.go`
* `pkg/blockchain/evm/confirmation_gate.go`
* `pkg/blockchain/evm/confirmation_gate_test.go`
* `pkg/blockchain/evm/interface.go`
* `pkg/blockchain/evm/listener.go`
* `pkg/blockchain/evm/listener_test.go`
* `pkg/blockchain/evm/mock_test.go`
* `pkg/blockchain/evm/reconciler.go`
* `pkg/blockchain/evm/reconciler_test.go`
* `pkg/core/event.go`
* `pkg/core/types.go`
* `pkg/rpc/types.go`
* `sdk/go/utils.go`
* `sdk/go/utils_test.go`
* `sdk/ts/src/core/types.ts`
* `sdk/ts/src/rpc/types.ts`
* `sdk/ts/src/utils.ts`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread nitronode/reorg-fix-spec.md Outdated
Comment thread nitronode/store/memory/blockchain_config.go
Comment thread pkg/blockchain/evm/reconciler.go Outdated
Comment thread pkg/blockchain/evm/reconciler.go
Comment thread pkg/blockchain/evm/listener.go Outdated
Comment thread nitronode/store/database/contract_event.go
Comment thread nitronode/main.go Outdated
Comment thread nitronode/docs/reorg-fix.md
Comment thread pkg/blockchain/evm/channel_hub_reactor.go Outdated
Comment thread pkg/blockchain/evm/channel_hub_reactor.go Outdated
Comment thread pkg/blockchain/evm/channel_hub_reactor.go Outdated
Comment thread pkg/blockchain/evm/confirmation_gate.go
nksazonov added 11 commits June 10, 2026 16:38
Reworks the gate test suite for the new ConfirmationGate API
(no fetcher arg, error-returning constructor, BlockTimestamp-driven
arrivedAt) and adds coverage for the new behaviors: tombstone-skip,
FIFO-eviction with early-delete tolerance, single-timer reschedule,
kick-during-pending-timer, shutdown-with-non-empty-queue.

Adds listener_test cases for the new ensureBlockTimestamp helper
(populated path, fetch path, single-entry cache hit, fetch error)
and updates existing listener tests that previously fed events with
BlockTimestamp == 0.

Updates reorg-fix-spec.md sections 4.1, 4.3, 6.1, 6.2, 6.3, 6.5,
6.6, 6.7 to describe the timer+kick design, tombstone-map queue,
forwardedSet/forwardedQueue + FIFO eviction with value-check, and
the listener-owned ensureBlockTimestamp fallback (replacing the
gate's removed block-timestamp fetcher/cache).

@ihsraham ihsraham left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed current head. The age-based historical routing, canonical HeaderByNumber check, and fatal error propagation look addressed. I am not approving yet because the checked-in config still leaves the gate disabled, and I left one more no-gate Removed:true path note inline.

Comment thread pkg/blockchain/evm/listener.go
Comment thread pkg/blockchain/evm/confirmation_gate.go Outdated

@ihsraham ihsraham left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes for L-15 closure. The main implementation is much closer now: no-gate Removed:true logs no longer reach the reactor, recent historical events route through the gate, canonicality uses HeaderByNumber, and handler failures after dequeue now restart instead of being swallowed. I resolved/reacted on the threads that look addressed.

I would still keep L-15 open on three points: shipped configs still decode to confirmation_delay_secs: 0, pending gate entries can survive a reconnect if the removal signal is missed, and the all-stored-blocks-reorged replay path still misses the same-height replacement boundary. I replied on the existing config/full-depth threads and left one new inline comment for the reconnect case.

l.logger.Debug("skipping already present current event", "blockchainID", l.blockchainID, "blockNumber", eventLog.BlockNumber, "logIndex", eventLog.Index)
continue
if !eventLog.Removed {
*lastBlock = eventLog.BlockNumber

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still leaves a reconnect gap for pending gate entries. The listener advances lastBlock before the gate has committed the event, and HandleEvent only queues it. If the subscription drops after an orphaned live log is queued but before the matching Removed:true arrives, the next loop replays only canonical logs and never sends a cancellation for that pending entry. The old pending entry can then mature in the gate and reach the reactor even though its block is no longer canonical.

Could we keep the resume cursor tied to committed events, clear/reconcile pending gate entries on reconnect, or have replay explicitly cancel queued entries that no longer appear canonically before they can mature?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants