
tbc: store tx byte location in tx index, bump DB to v6 #1052

Open
marcopeereboom wants to merge 1 commit into main from marco/tx-offset

Conversation

@marcopeereboom

Contributor

Summary

Store tx byte location (TxLoc: offset + length within raw block) in the tx index 't' entry value, which was previously nil. This enables O(1) tx lookup by jumping directly to the tx's bytes in the raw block — no scanning, no SHA256 hashing, no full block deserialization.
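As a rough illustration of what storing a location in the index value could look like, here is a minimal round-trip sketch. The 8-byte big-endian layout, the `txLocSize` constant, and the `encodeTxLoc`/`decodeTxLoc` helpers are assumptions made for this example and are not taken from the PR; the local `TxLoc` type only mirrors the `TxStart` and `TxLen` fields of btcd's `wire.TxLoc` so the snippet has no external dependencies.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// TxLoc mirrors the two fields of btcd's wire.TxLoc (offset and length of a
// tx inside the serialized block). Redeclared here to stay self-contained.
type TxLoc struct {
	TxStart int
	TxLen   int
}

// txLocSize is a hypothetical on-disk size: 4-byte offset + 4-byte length.
// The PR does not state the real encoding.
const txLocSize = 8

// encodeTxLoc packs a location into a fixed-size value. uint32 is enough
// for this sketch because Bitcoin blocks are far smaller than 4 GiB.
func encodeTxLoc(loc TxLoc) []byte {
	var buf [txLocSize]byte
	binary.BigEndian.PutUint32(buf[0:4], uint32(loc.TxStart))
	binary.BigEndian.PutUint32(buf[4:8], uint32(loc.TxLen))
	return buf[:]
}

// decodeTxLoc unpacks a value. An empty value is treated as a legacy entry
// and yields the zero TxLoc without an error.
func decodeTxLoc(v []byte) (TxLoc, error) {
	if len(v) == 0 {
		return TxLoc{}, nil
	}
	if len(v) != txLocSize {
		return TxLoc{}, errors.New("invalid tx location length")
	}
	return TxLoc{
		TxStart: int(binary.BigEndian.Uint32(v[0:4])),
		TxLen:   int(binary.BigEndian.Uint32(v[4:8])),
	}, nil
}

func main() {
	loc, err := decodeTxLoc(encodeTxLoc(TxLoc{TxStart: 81, TxLen: 204}))
	fmt.Println(loc, err)

	legacy, err := decodeTxLoc(nil) // legacy nil value
	fmt.Println(legacy, err)
}
```

A fixed-size value keeps decoding branch-free apart from the legacy check; a varint layout would save a few bytes per entry at the cost of a slightly more involved decoder.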

Depends on #1051 (lazy block reader).

Problem

Every tx lookup via BlockHashByTxId requires a subsequent full block deserialization to find the tx. The 't' entry value was nil — wasted space that could carry the byte offset. CPU profile shows 60% in SHA256 hashing from FindTx scanning every tx in the block to find a match.
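To make the cost described above concrete, the following sketch shows the general shape of a scan-and-hash lookup. It is not the project's `FindTx`; `txID` and `findTxByScan` are hypothetical names, and the "transactions" are arbitrary byte strings rather than real serialized Bitcoin transactions. The point is only that the number of double SHA-256 computations grows with the position of the target tx in the block.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// txID computes a double SHA-256 over the given bytes, the same hash
// construction Bitcoin uses for transaction IDs.
func txID(raw []byte) [32]byte {
	first := sha256.Sum256(raw)
	return sha256.Sum256(first[:])
}

// findTxByScan hashes each candidate in order until one matches want.
// It returns the index of the match (or -1) and how many hashes it computed.
func findTxByScan(txs [][]byte, want [32]byte) (index, hashed int) {
	for i, raw := range txs {
		hashed++
		if txID(raw) == want {
			return i, hashed
		}
	}
	return -1, hashed
}

func main() {
	// Build a stand-in "block" of 1000 fake transactions.
	txs := make([][]byte, 1000)
	for i := range txs {
		txs[i] = []byte(fmt.Sprintf("fake-tx-%d", i))
	}

	// Looking up the last tx forces a hash of every tx before it.
	idx, hashed := findTxByScan(txs, txID(txs[len(txs)-1]))
	fmt.Println("index:", idx, "hashes computed:", hashed)
}
```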

Solution

  • BlockHashByTxId signature changed to return (*chainhash.Hash, wire.TxLoc, error) — all callers updated
  • processTxs now calls block.TxLoc() and stores offset+length via NewTxMappingWithLoc
  • BlockTxUpdate uses stack-allocated reusable buffers instead of slicing loop variables (addresses the potential data integrity issue documented in #1050, "tbc: tx index intermittently loses entries during IBD")
  • All consumers wired to use TxLoc when available, with legacy fallback for nil values:
    • TxById (RPC) — deserializes only the target tx
    • txOutFromOutPoint (UTXO unwind) — deserializes only the target tx
    • handleBlockHashByTxIdRequest (RPC) — hash only, ignores TxLoc
    • hemictl — hash only, ignores TxLoc
  • DB version 5 → 6, upgrade wipes tx index for rebuild with TxLoc values
  • Errors from block.TxLoc() are logged at Errorf level, and the indexer falls back to nil values (legacy format)
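The consumer-side pattern in the list above (use TxLoc when present, fall back for legacy nil values) might look roughly like the sketch below. The `txBytes` helper, its return shape, and the `errOutOfRange` error are hypothetical and not taken from the PR; the raw block is a plain byte string standing in for a serialized block, and the local `TxLoc` type mirrors the fields of btcd's `wire.TxLoc`.

```go
package main

import (
	"errors"
	"fmt"
)

// TxLoc mirrors the TxStart/TxLen fields of btcd's wire.TxLoc.
type TxLoc struct{ TxStart, TxLen int }

var errOutOfRange = errors.New("tx location outside raw block")

// txBytes returns the sub-slice of rawBlock described by loc.
// ok == false with a nil error means the entry is a legacy one (zero
// TxLoc), so the caller should fall back to the full-block path.
func txBytes(rawBlock []byte, loc TxLoc) (raw []byte, ok bool, err error) {
	if loc.TxLen == 0 {
		return nil, false, nil // legacy nil-value entry
	}
	// Reject locations that do not fit inside the block. The comparison is
	// written so that TxStart+TxLen cannot overflow.
	if loc.TxStart < 0 || loc.TxLen < 0 || loc.TxStart > len(rawBlock)-loc.TxLen {
		return nil, false, errOutOfRange
	}
	return rawBlock[loc.TxStart : loc.TxStart+loc.TxLen], true, nil
}

func main() {
	rawBlock := []byte("HEADER|tx-one|tx-two")

	raw, ok, err := txBytes(rawBlock, TxLoc{TxStart: 14, TxLen: 6})
	fmt.Printf("%q %v %v\n", raw, ok, err)

	_, ok, err = txBytes(rawBlock, TxLoc{}) // legacy entry: caller falls back
	fmt.Println(ok, err)
}
```

Bounds-checking the stored location before slicing seems worthwhile in a sketch like this, since an index entry that is stale or corrupt would otherwise cause a panic rather than a recoverable error.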

Testing

  • TestDbUpgradeV6 — seeds v5 DB, runs upgrade, verifies index wiped and version bumped
  • TestTxLocRoundTrip — stores TxLoc, reads back, verifies offset+length match
  • TestTxLocOffsetCorrectness — stores raw block, uses offset to extract tx bytes, deserializes, verifies txid and output values match
  • TestTxLocLegacyNilValue — nil-value entry returns zero TxLoc gracefully

Impact

With TxLoc, each cache miss costs: 1 LevelDB read (tx index) + 1 LevelDB/cache read (raw block) + parse ~200 bytes of one tx. No full block deserialization. No SHA256 scanning. Eliminates the need for parallel lookup strategies.

Related

Files changed

  • database/tbcd/database.go: BlockHashByTxId returns TxLoc, NewTxMappingWithLoc
  • database/tbcd/level/level.go: implementation, stack buffers, DB v6
  • database/tbcd/level/level_test.go: 4 new tests
  • database/tbcd/level/upgrade.go: v6() wipes tx index
  • service/tbc/txindex.go: stores TxLoc
  • service/tbc/tbc.go: TxById uses TxLoc
  • service/tbc/utxoindex.go: txOutFromOutPoint uses TxLoc
  • service/tbc/rpc.go: caller updated
  • service/tbc/cpfp_test.go: stub updated
  • service/tbc/tbc_test.go: version expectations updated
  • cmd/hemictl/hemictl.go: caller updated

@marcopeereboom marcopeereboom requested a review from a team as a code owner May 29, 2026 07:10
This was referenced May 29, 2026
Comment thread database/tbcd/level/upgrade.go
@marcopeereboom marcopeereboom force-pushed the marco/tx-offset branch 2 times, most recently from 1a04a83 to a97fdbd Compare June 2, 2026 07:26
@marcopeereboom

Contributor Author

Don't merge yet!

@AL-CT AL-CT left a comment

Contributor


Blocking

@marcopeereboom marcopeereboom force-pushed the marco/lazy-block branch 2 times, most recently from 1dba573 to ffd4579 Compare June 10, 2026 14:35
Base automatically changed from marco/lazy-block to main June 10, 2026 16:39
@marcopeereboom marcopeereboom force-pushed the marco/tx-offset branch 2 times, most recently from 720c095 to 4d15662 Compare June 10, 2026 16:54
@github-actions github-actions Bot added the area: hemictl, area: tbc, area: docs, and changelog: done labels Jun 10, 2026
@codecov

codecov Bot commented Jun 10, 2026


Comment thread database/tbcd/level/level.go
Comment thread database/tbcd/level/level.go Outdated
Store TxLoc (offset + length within raw block) in the t entry value
instead of nil. This allows callers to jump directly to a tx's bytes
in the raw block without scanning — O(1) instead of O(txs_in_block).

BlockHashByTxId now returns (*chainhash.Hash, wire.TxLoc, error).
All callers updated. No separate method needed — callers that only
need the hash use bh, _, err := BlockHashByTxId(...).

processTxs calls block.TxLoc() and stores the location via
NewTxMappingWithLoc. Errors from TxLoc() are logged at Errorf
and the indexer falls back to nil values (legacy format).

BlockTxUpdate uses stack-allocated reusable buffers instead of
slicing loop variables. The previous code sliced the range variable
and passed the slice to leveldb.Batch.Put. appendRec copies
immediately, but the interaction between range variable reuse,
map deletion, and GC is not guaranteed safe. Stack buffers are
zero-alloc and independent per iteration.

DB version 5 -> 6. Upgrade path wipes the transactions index for
rebuild with TxLoc values. The index is fully derived from block data.

Ref: #1050
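The BlockTxUpdate paragraph of the commit message describes copying keys into reusable buffers rather than slicing the range variable. A minimal sketch of that pattern follows. The `txKey` type, `keySize`, the `batch` stand-in, and `fillBatch` are all hypothetical; the stand-in's `Put` copies its argument immediately, which is the behavior the commit message attributes to `appendRec`. Whether `buf` actually stays on the stack is decided by the compiler's escape analysis, not by this code.

```go
package main

import "fmt"

// keySize and txKey are hypothetical; real index keys are longer.
const keySize = 4

type txKey [keySize]byte

// batch is a stand-in for a write batch whose Put copies the key bytes
// into its own storage before returning.
type batch struct{ data []byte }

func (b *batch) Put(k []byte) { b.data = append(b.data, k...) }

// fillBatch copies each map key into one reusable buffer and hands the
// batch a slice of that buffer, never a slice of the range variable.
func fillBatch(b *batch, keys map[txKey]struct{}) {
	var buf [keySize]byte // single buffer reused across iterations
	for k := range keys {
		copy(buf[:], k[:]) // snapshot the loop variable's contents
		b.Put(buf[:])      // safe to reuse buf because Put copies
	}
}

func main() {
	keys := map[txKey]struct{}{
		txKey{1, 1, 1, 1}: {},
		txKey{2, 2, 2, 2}: {},
		txKey{3, 3, 3, 3}: {},
	}
	var b batch
	fillBatch(&b, keys)
	fmt.Println("records written:", len(b.data)/keySize)
}
```

Reusing one buffer is only safe because the stand-in `Put` copies. If a batch implementation retained the slice it was given, each iteration would need its own buffer.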

Labels

area: docs (This is a change to documentation)
area: hemictl (This is a change to hemictl)
area: tbc (This is a change to TBC (Tiny Bitcoin))
changelog: done (This pull request includes an appropriate update to CHANGELOG.md.)


4 participants