Indexer resilience & reorg recovery #262
Open
mikkyvans0-source wants to merge 3 commits into
… challenge storage
…lenge verification and JWT issuance
…nsactional atomicity, and event idempotency support
The ticket named IndexerService.ts, EventProcessor.ts, and a "Prisma schema," but this project uses TypeORM and the real, canonical DB-backed indexer is BlockchainIndexerService. I mapped the work onto it. All three objectives are met, and the suite is green: 63 passed across 8 blockchain/indexer test suites.
Closes #226
Objectives delivered
Block sequential queue: new SequentialQueue. Both processEvent and reorg rollback are routed through it, so blocks persist strictly in order and a rollback can never interleave with a newer block's transaction. A failing task does not stall the queue; later tasks still run.
Transactional block persistence: the checkpoint is now saved via queryRunner.manager inside the same transaction as the event and balance mutations, so they commit or roll back atomically. This also fixed a real bug: replayFromBlock deleted orphaned ProcessedEvent rows but never reversed the corresponding TokenBalance changes, which is exactly the ticket's "stale balances / failed rollback." Reorg recovery now reverses balances, deletes orphaned events (>= startBlock), and rewinds the checkpoint, all in one transaction. A nullable payload column (entity + migration) stores the decoded event so the exact mutation can be inverted.
Intelligent RPC backoff: new withRpcBackoff applies exponential backoff with full jitter. It retries on HTTP 429 (detected across ethers/web3/fetch error shapes) and on transient 5xx/network codes, and it does not retry deterministic errors (reverts, 400s). It is wired into the real RPC call sites: the indexer's getLogs and the rewards listener's queryFilter.
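For reviewers who want the backoff policy in one place, here is a minimal sketch of exponential backoff with full jitter. It is not the code in this PR: withBackoffSketch, isRetryable, fullJitterDelayMs, BackoffOptions, the error fields it probes, and the list of transient codes are all hypothetical names and assumptions chosen for illustration.

```typescript
// Illustrative sketch only; names, defaults, and error shapes are assumptions,
// not the PR's actual withRpcBackoff implementation.
interface BackoffOptions {
  maxRetries: number;                      // retries allowed after the first attempt
  baseDelayMs: number;                     // delay ceiling for the first retry
  maxDelayMs: number;                      // hard cap on the delay ceiling
  random?: () => number;                   // injectable for deterministic tests
  sleep?: (ms: number) => Promise<void>;   // injectable for deterministic tests
}

// Non-exhaustive, assumed set of transient Node.js network error codes.
const TRANSIENT_CODES = ["ETIMEDOUT", "ECONNRESET", "ECONNREFUSED", "EAI_AGAIN"];

// Probes a few common places where client libraries put an HTTP status.
function httpStatusOf(err: unknown): number | undefined {
  const e = err as any;
  const status = e?.status ?? e?.statusCode ?? e?.response?.status;
  return typeof status === "number" ? status : undefined;
}

// Retry on 429 and 5xx, and on transient network codes; never on other 4xx
// or on errors with no recognizable status or code (for example, reverts).
function isRetryable(err: unknown): boolean {
  const status = httpStatusOf(err);
  if (status !== undefined) {
    return status === 429 || (status >= 500 && status <= 599);
  }
  const code = (err as any)?.code;
  return typeof code === "string" && TRANSIENT_CODES.indexOf(code) !== -1;
}

// Full jitter: a uniform delay in [0, min(cap, base * 2^attempt)).
function fullJitterDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number,
): number {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

async function withBackoffSketch<T>(
  call: () => Promise<T>,
  opts: BackoffOptions,
): Promise<T> {
  const random = opts.random ?? Math.random;
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on deterministic errors or once the retry budget is spent.
      if (!isRetryable(err) || attempt >= opts.maxRetries) throw err;
      await sleep(fullJitterDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs, random));
    }
  }
}
```

A call site would wrap the RPC call in a closure, for example `withBackoffSketch(() => provider.getLogs(filter), { maxRetries: 5, baseDelayMs: 250, maxDelayMs: 8000 })`, where the option values are placeholders. Injecting random and sleep keeps timing assertions deterministic in unit tests.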
Acceptance criteria — proven by tests
10-block reorg recovery: integration spec against a real in-memory SQLite DB indexes 10 blocks, reorgs from block 6, asserts balances revert (Alice 900→950, Bob 100→50) and checkpoint rewinds to 5, then re-indexes cleanly with no double-counting.
Idempotency via duplicate logs: same spec processes a duplicate event 3× → applied once, one row.
Backoff handles 429: rpc-backoff spec covers retry-then-succeed, exponential timing, give-up-after-max, and no-retry-on-client-error.
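The reorg and idempotency criteria above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the integration spec or the TypeORM code: LedgerSketch and TransferEvent are hypothetical, the starting balances (Alice 1000, Bob 0) and the transfer of 10 per block are one data set I picked because it reproduces the figures quoted above, and the real implementation performs these steps inside a single database transaction.

```typescript
// Hypothetical in-memory model of apply-once processing and reorg rollback.
interface TransferEvent {
  id: string;          // idempotency key, e.g. `${txHash}:${logIndex}` (assumed)
  blockNumber: number;
  from: string;
  to: string;
  amount: number;
}

class LedgerSketch {
  readonly balances = new Map<string, number>();
  // Stands in for ProcessedEvent rows with their stored payload.
  readonly processed = new Map<string, TransferEvent>();
  checkpoint = 0;

  constructor(initial: Record<string, number>) {
    for (const account of Object.keys(initial)) {
      this.balances.set(account, initial[account]);
    }
  }

  private credit(account: string, delta: number): void {
    this.balances.set(account, (this.balances.get(account) ?? 0) + delta);
  }

  // Applies an event at most once; a duplicate log returns false and changes nothing.
  apply(event: TransferEvent): boolean {
    if (this.processed.has(event.id)) return false;
    this.credit(event.from, -event.amount);
    this.credit(event.to, event.amount);
    this.processed.set(event.id, event);
    this.checkpoint = Math.max(this.checkpoint, event.blockNumber);
    return true;
  }

  // Inverts every stored event at or after startBlock, deletes it,
  // and rewinds the checkpoint to startBlock - 1.
  rollbackFrom(startBlock: number): void {
    const orphaned = Array.from(this.processed.values())
      .filter((e) => e.blockNumber >= startBlock)
      .sort((a, b) => b.blockNumber - a.blockNumber);
    for (const e of orphaned) {
      this.credit(e.from, e.amount);
      this.credit(e.to, -e.amount);
      this.processed.delete(e.id);
    }
    this.checkpoint = Math.min(this.checkpoint, startBlock - 1);
  }
}

// Usage: index 10 blocks, then reorg from block 6.
const ledger = new LedgerSketch({ alice: 1000, bob: 0 });
for (let block = 1; block <= 10; block++) {
  ledger.apply({ id: `tx${block}:0`, blockNumber: block, from: "alice", to: "bob", amount: 10 });
}
// Here alice = 900 and bob = 100.
ledger.rollbackFrom(6);
// Now alice = 950, bob = 50, checkpoint = 5.
```

Storing the decoded event alongside the processed-event record is what makes the inversion exact; without it, a rollback can delete the event rows but has nothing to compute the balance reversal from.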
Two things you should know
A test conflict resolved by the ticket. Two existing specs encoded opposite contracts for the checkpoint: blockchain-indexer.spec.ts wanted post-commit save (passing), while blockchain-indexer.service.spec.ts:88 wanted it inside the transaction (failing). Objective #2 settles it in favor of transactional — I updated the contradicting test accordingly, and rewrote the replay specs to the new transactional + balance-reversal behavior (preserving the >= startBlock regression guard).
Pre-existing broken build (not mine). A full tsc/nest build fails on pre-existing syntax errors in unrelated files (claims/evidence.service.ts, some audit/claims specs); the repo gets away with it because ts-jest compiles per-file. While there, I fixed one such error in the reorg subsystem — reorg-detector.service.ts had a missing brace that put return inside the while loop. I left the unrelated claims/audit errors alone as out of scope; flag me if you'd like those cleaned up too. I did not touch package.json.