Skip to content

xdr,xdrgen,ingest: extend xdr views (decoded discriminants, typed opaque, count validation, Fields) and rebuild the view extractors on them#5951

Merged
tamirms merged 2 commits into
stellar:mainfrom
tamirms:views-minimal
Jun 16, 2026
Merged

Conversation

@tamirms

@tamirms tamirms commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

Builds on #5949 (the zero-copy XDR view extractors for full-history ingestion). It adds four small, additive features to the experimental xdr views API and rewrites the ingest view extractors to use them, cutting allocations sharply, speeding up full-ledger materialization, and making the extractor code easier to read. The public extractor surface is unchanged.

What's in it

xdr views API (xdr, xdrgen) — four additive codegen features; Iter()/All()/At()/Raw() are unchanged:

  • Decoded discriminants — a union's discriminant accessor (e.g. V()) now returns the decoded value directly (the Go enum, or int32/bool) instead of a leaf view, dropping the two-step V().Value(). Enum discriminants are validated against the known case set; int discriminants are not, so default arms stay reachable.
  • Typed fixed-opaque Value() — returns the typed array (e.g. Hash, [N]byte) by value instead of []byte.
  • Array count validationCount()/All()/Iter() validate the wire element count against the remaining buffer up front (per-type minimum element size), closing an allocation-amplification hole where a tiny buffer declaring a huge count could drive a large allocation. The check is confined to these allocation/iteration entry points; the internal size/validate walks keep the cheap count.
  • Fields() — a per-struct method that locates every field in a single walk and returns a bundle of trimmed sub-views, enabling fused "advance + locate" iteration over an array of structs.

Extractor rewrite (ingest, network) — public signatures unchanged (ExtractTxHashes, ExtractLedgerEvents, LedgerTransactionViewByHash/Range, network.TransactionViewHasher):

  • Each TxProcessing element is projected to {Result, TxApplyProcessing} via one Fields() walk, advancing by reslicing past the located node — so locating fields and advancing the iterator are a single walk, and the meta's raw bytes are a free trimmed slice instead of a re-walk.
  • Envelope-by-hash pairing hashes first and extracts an envelope's type/soroban/raw details only after it matches a wanted hash, so a by-hash lookup or small page no longer detail-reads the rest of the TxSet.
  • The V0/V1/V2 dispatch is resolved once into version-agnostic handles, decoded discriminants replace the two-step reads, the envelope tree walk's bool-threaded callbacks become plain iter.Seq2 loops, and the event walkers use Must/Try instead of per-accessor error ladders.

Performance

Real pubnet ledger; benchstat, n=16, base and PR runs interleaved so machine drift cancels (verified: the unchanged full_decode control paths show no significant difference, p≥0.27). Allocations are exact.

extractor allocs/op (base → PR) time
ExtractTxHashes 267 → 17 (−94%) no significant change
LedgerTransactionViewByHash (late) 279 → 45 (−84%) −10% (p=0.000)
LedgerTransactionViewByHash (mid) 148 → 32 (−78%) −9% (p=0.000)
ExtractLedgerEvents 386 → 136 (−65%) −6% (p=0.000)
LedgerTransactionViewRange (full ledger) 1010 → 776 (−23%) −22% (p=0.000, ≈1.29×)
LedgerTransactionViewRange (page of 10) 39 → 37 (−5%) −5% (p=0.001)
LedgerTransactionViewByHash (early) 34 → 32 (−6%) no significant change

The headline is the allocation reductions (exact) and the −22% on full-ledger materialization. ExtractTxHashes and small point-lookups are allocation-lighter but the same speed — those paths are dominated by XDR sizing walks, not allocation.

Compatibility

  • Public ingest/network extractor API: unchanged.
  • Experimental xdr views API: breaking. Discriminant accessors and fixed-opaque Value() change return types. The views are marked experimental and have no external consumers yet, so this is intentional and isolated.
  • Behavioral nuance: envelope validation narrows from "every envelope in the TxSet" to "only the envelopes actually paired by hash." For trusted LCM input (the only source) these are equivalent.

Testing

  • Real-ledger equivalence tests (view extractors vs. the parsed reader) across LCM V0/V1/V2 — unchanged and passing.
  • New TestView_RandXDR_Fields (randxdr differential confirming Fields() trims each located field to its exact wire extent) and TestArrayCountBufferBound (count-validation regression guard).
  • xdrgen golden updated; full xdr/ingest/xdrgen suites pass.

Notes for reviewers

The diff is large but ~13k lines are the regenerated xdr/xdr_views_generated.go (Fields() is generated for every struct); the hand-written change is ~1.1k lines. The substance is in xdrgen/ (emitters) and ingest/.

@tamirms tamirms force-pushed the views-minimal branch 2 times, most recently from 9f6539c to 1d4f020 Compare June 15, 2026 14:37
@tamirms tamirms marked this pull request as ready for review June 15, 2026 14:57
Copilot AI review requested due to automatic review settings June 15, 2026 14:57

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Extends the experimental XDR “views” API with decoded union discriminants, typed fixed-opaque Value(), array count buffer validation, and per-struct Fields() bundles; then rewrites ingestion/network extractors to use these features for fewer allocations and faster ledger materialization.

Changes:

  • Codegen: emit decoded union discriminant accessors, fixed-opaque typed Value(), array count buffer bounds checks, and struct Fields() bundle locate helpers.
  • Ingest/network: refactor extractors and envelope/enumeration walks to use Fields(), decoded discriminants, and fewer detail reads.
  • Tests/docs: update docs and add regressions for Fields() trimming and array count validation.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
xdrgen/views_api.md Documents Fields(), decoded discriminants, typed fixed-opaque Value(), and count validation behavior.
xdrgen/view_union.go Generates decoded discriminant accessor (V() returns enum/int32/bool) and MustV().
xdrgen/view_support.go.txt Adds arrayViewCountChecked helper for buffer-bound count validation.
xdrgen/view_struct.go Generates per-struct Fields() bundle type + locate helper.
xdrgen/view_plan.go Adds plan metadata for discriminant emission, array element min-size, and fixed-opaque value Go type.
xdrgen/view_minsize.go Implements minimum wire-size computation to support O(1) array count validation.
xdrgen/view_emitters.go Threads array min-size + opaque value type into emitters; adjusts concrete-type emission to return errors.
xdrgen/type_resolver.go Rejects opaque[0] and prevents uint32 overflow when computing fixed array size.
xdrgen/testdata/mini_views.golden Updates golden output for new helpers, typed opaque Value(), and Fields() generation.
xdrgen/gen_views.go Propagates emitter errors during view generation.
xdr/xdr_view_test.go Updates tests for decoded discriminant accessor return type.
xdr/view_randxdr_test.go Updates discriminant test and adds Fields() differential trimming test.
xdr/view_count_validation_test.go Adds regression test for count-vs-buffer validation in Count().
xdr/transaction_result_view.go Adapts to decoded enum discriminant accessor (MustCode() now returns enum directly).
xdr/ledger_close_meta_view.go Uses decoded discriminants and swaps fixed-opaque Value() usage to Raw() for zero-copy ledger hashes.
network/transaction_view.go Updates envelope type read for decoded discriminant accessor.
ingest/transaction_view.go Refactors TxProcessing iteration and envelope matching to hash-first + Fields() projection.
ingest/transaction_events_view_test.go Adapts to TxProcessing parts projection (meta view already located/trimmed).
ingest/transaction_events_view.go Simplifies meta event extraction using decoded discriminant + Must-under-Try traversal and new iterator shapes.
ingest/ledger_close_meta_view_nav.go Replaces interface-based TxProcessing views with a concrete {Result, TxApplyProcessing} projection and Must-under-Try envelope walks.
ingest/extract.go Updates extractors to consume TxProcessing parts projection.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xdr/view_randxdr_test.go
Comment thread xdrgen/view_support.go.txt
Comment thread xdrgen/view_emitters.go

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1d4f020aee

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread xdrgen/view_plan.go Outdated
tamirms and others added 2 commits June 15, 2026 17:14
…elds() for xdr views

Extends the shipped v1 xdr views API (stellar#5949) with four additive features,
without removing Iter()/All()/At()/Raw() or adding a Scan() cursor:

- Decoded union discriminants: the discriminant accessor (e.g. V()) now
  returns the decoded enum/int/bool directly instead of a leaf view, so
  callers drop the two-step V().Value(). Enum discriminants validate against
  the known case set; int discriminants do not (default arms stay reachable).

- Typed fixed-opaque Value(): returns the typed array (e.g. Hash, [N]byte)
  by value instead of []byte.

- Array count validation (OOM guard): Count()/All()/Iter() validate the wire
  element count against the remaining buffer up front, using a per-type
  minimum element wire size. size()/valid() keep the cheap unvalidated count
  because their per-element walk is already buffer-bounded, so the check is
  confined to the allocation/iteration entry points. Covered by
  view_count_validation_test.go.

- Fields(): a per-struct method that locates every field in a single walk
  and returns a bundle of trimmed sub-views, enabling fused advance+locate
  iteration (advance by len(bundle.View)) for single-walk materialization
  without a cursor. A randxdr differential (TestView_RandXDR_Fields) checks
  the located fields are trimmed to their exact wire extent.

The xdr-package view helpers are adjusted for the typed-opaque and decoded-
discriminant changes (LedgerCloseMetaView.LedgerHash/PreviousLedgerHash use
Raw() for zero-copy bytes; TransactionResultView.Successful uses the decoded
code), and views_api.md documents all four features.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Consumes the new xdr views features; public extractor signatures unchanged
(ExtractTxHashes, ExtractLedgerEvents, LedgerTransactionViewByHash/Range,
network.TransactionViewHasher).

- Project each TxProcessing element to a concrete {Result, TxApplyProcessing}
  by locating its fields in a single Fields() walk and advancing to the next
  element by reslicing past the located node (txProcessingPartsMeta /
  ...V1). The first element comes from the array's own At(0), so no array
  wire-offset is hardcoded in ingest. metaRaw is now a free trimmed slice
  (MetaRaw()) instead of a re-walk, recovering the single-walk materialization
  win without a cursor.

- Use decoded discriminants throughout, dropping the two-step V().Value().

- Collapse the envelope tree walk's bool-threaded yield functions into two
  iter.Seq2 functions, and the event walkers' per-accessor error ladders into
  Must/Try blocks.

- Pair TxSet envelopes by hash first, extracting an envelope's type/soroban/raw
  details (resolveEnvelope) only after it matches a wanted hash. On a by-hash
  lookup or small page the rest of the TxSet is only hashed, not detail-read
  (~9-10% faster on mid/late by-hash; full-ledger unchanged since every
  envelope is wanted). This narrows envelope validation from "every envelope in
  the TxSet" to "only paired envelopes" — for trusted LCM input these are
  equivalent.

Behavior-preserving for valid input: the real-ledger equivalence tests are
unchanged and pass. Materialization paths are ~1.25x faster and allocations
drop sharply (e.g. ExtractTxHashes 267 to 17 allocs/ledger); point-lookup
latency is unchanged or improved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@tamirms

tamirms commented Jun 15, 2026

Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🚀

Reviewed commit: 7f4eed4b03

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@tamirms tamirms requested a review from a team June 15, 2026 15:33

@chowbao chowbao left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Might be good to have someone else on platform review too but I didn't see anything needing changes in my review pass

@tamirms tamirms merged commit 4b4bbfe into stellar:main Jun 16, 2026
16 of 19 checks passed
@tamirms tamirms deleted the views-minimal branch June 16, 2026 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants