engine: deterministic VDF slot-table decode + arrayed lookup-only descriptor rebind #605
Review: deterministic VDF slot-table decode + arrayed lookup-only rebind
The slot-table determinization (
[P1] Name-key walkers not updated to skip stale entries: record→name resolution misaligns on edited files
[P3] Unbounded
2c2081a to dd96f12
…ptors
Two related VDF-reader corrections that close the last structural gaps the heuristics were papering over.
Slot table (#470, #549). The slot-table under-count on edited fixtures was not a find_slot_table bug: parse_name_table_extended stopped at the first stale/deleted name-table entry (a deleted name's non-printable binary payload) instead of skipping it by its declared length, capping the slot count. Vensim's reader skips such entries -- docs/design/vdf.md already specified the skip, only the Rust code stopped. With the name table complete, the slot table is fully determined by the section header, so the backward-scan heuristic is removed: slot_table_from_header reads the start from section 1's field1 1-based word pointer and the count from block1[7], and cross-checks the 0x00430000 terminator and the name-table boundary. Verified on all 138 run-file and 6 dataset VDFs in the corpus. The block1[7] invariant is tightened from +/-2 to exact, with no fixture exemptions.
Lookup-only descriptors (#597). A bare graphical function is a TABLE indexed by an explicit input, not a time series, so Vensim saves no series for it -- only a descriptor record whose f[11] is a section-6 lookup-record index. The reader now DROPS standalone lookup-only descriptors (like overlapping ones) instead of reconstructing a series at their f[11]-as-OT-start stock ghost block; their values, where they matter, are carried by the consumer variables that call them, emitted as ordinary owners under their own names. This shrinks the C-LEARN EXPECTED_VDF_RESIDUAL from 21 to 13: six lookup-only bases drop out of the comparison, and the nine that remain are lookup tables the model-free reader cannot safely distinguish from a real owner (the rs_hfc* family forward-links to a wider 2-D consumer; one scalar forward-links to Time/0), so it still emits a ghost column the comparator flags.
The deeper bug is in the engine: lowering a bare lookup to gf(Time) synthesises a phantom series (a table is not generally a function of time; using Time as the index is a unit error). Fixing that removes the remaining nine from the matched set entirely.
dd96f12 to 8c7bdb9
Review: deterministic VDF slot-table decode + drop lookup-only descriptors
I reviewed the engine changes (
Verified
Non-blocking (P3): stale module doc
Overall correctness: correct
Existing tests and behavior are preserved; the change is free of blocking issues. The doc drift above is documentation-only and does not affect correctness.
Code review: PR #605 (deterministic VDF slot-table decode + drop lookup-only descriptors)
I reviewed the slot-table determinism change, the name-table stale-entry skip, and the descriptor rebind→drop conversion for correctness and memory safety.
Verified sound
Finding
[P3] Stale module doc:
Overall correctness verdict
Correct. The patch is memory-safe, logically consistent, and the behavioral changes (deterministic header-driven slot decode; dropping lookup-only descriptors) are intentional and verified by the updated unit/integration tests and
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #605 +/- ##
==========================================
- Coverage 82.87% 82.86% -0.01%
==========================================
Files 260 261 +1
Lines 69576 69813 +237
==========================================
+ Hits 57659 57850 +191
- Misses 11917 11963 +46
Summary
Two related VDF-reader corrections that close the last structural gaps the
heuristics were papering over. The format is now decoded deterministically (no
scan/stride heuristics), and lookup-only "variables" -- graphical-function
tables, not time series -- are no longer reconstructed.
Slot table (Fixes #470, Fixes #549)
The slot-table under-count on edited fixtures (risk2.vdf, SCEN01.VDF, the metasd social-network-valuation pair) was not a find_slot_table bug.
Root cause: parse_name_table_extended stopped at the first stale/deleted name-table entry (a deleted name's non-printable binary payload, e.g. risk2's 91 01 00 00 63 79 20 30) instead of skipping it by its declared u16 length, which capped the slot count via max_name_count. Vensim's reader skips such entries; docs/design/vdf.md already specified the skip -- only the Rust code stopped.
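The stop-versus-skip difference can be illustrated with a minimal sketch. It assumes a simplified record layout of a u16 little-endian length followed by that many payload bytes, and a plain printable-ASCII test. Both are assumptions made for illustration, as are the names Entry and parse_names; none of this is the actual parse_name_table_extended code or the real VDF layout.

```rust
/// One decoded name-table record (hypothetical shape, for illustration only).
#[derive(Debug, PartialEq)]
enum Entry {
    Name(String),
    /// Stale/deleted record: payload is not printable text.
    Stale,
}

/// Walks records laid out as `[u16 little-endian length][payload]`.
/// ASSUMPTION: this layout and the printable-ASCII test are simplifications,
/// not the actual VDF name-table format.
fn parse_names(buf: &[u8]) -> Vec<Entry> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos + 2 <= buf.len() {
        let len = u16::from_le_bytes([buf[pos], buf[pos + 1]]) as usize;
        pos += 2;
        if len == 0 || pos + len > buf.len() {
            break; // terminator or truncated record
        }
        let payload = &buf[pos..pos + len];
        // Ignore trailing NUL padding when judging printability.
        let end = payload.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
        let text = &payload[..end];
        let printable =
            !text.is_empty() && text.iter().all(|b| b.is_ascii_graphic() || *b == b' ');
        out.push(if printable {
            Entry::Name(String::from_utf8_lossy(text).into_owned())
        } else {
            Entry::Stale // skipped by its declared length; the walk continues
        });
        pos += len;
    }
    out
}

fn main() {
    // Stale payload bytes borrowed from the risk2 example quoted above.
    let stale: [u8; 8] = [0x91, 0x01, 0x00, 0x00, 0x63, 0x79, 0x20, 0x30];
    let records: [&[u8]; 3] = [b"Time", &stale, b"GDP"];
    let mut buf = Vec::new();
    for payload in records {
        buf.extend_from_slice(&(payload.len() as u16).to_le_bytes());
        buf.extend_from_slice(payload);
    }
    let entries = parse_names(&buf);
    let names = entries.iter().filter(|e| matches!(e, Entry::Name(_))).count();
    println!("{entries:?} -> {names} live names");
}
```

A walker that returned at the first non-printable payload would report only the entries before it. Advancing by the declared length keeps the name after the stale record, which is what lets the name count match the header.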
With the name table complete, the slot table is fully determined by the section header, so the backward-scan heuristic (largest run of unique, in-range, 4-byte-aligned offsets with min_stride >= 4) is removed. slot_table_from_header:
- reads the start from section 1's field1 1-based word pointer
- reads the count from block1[7]
- cross-checks the 0x00430000 terminator and the name-table boundary
These three over-determine each other and were verified to agree on all 138 run-file and 6 dataset VDFs in the corpus. The block1[7] structural invariant is tightened from a +/-2 diagnostic window to exact, with the rust_slot_table_undercount_known exemptions removed.
Lookup-only descriptors (Fixes #597)
A graphical function is a table indexed by an explicit input (y = lookup(input)); a bare lookup with no call site of its own is not a time series, so Vensim saves no data block for it -- only a descriptor record whose f[11] is a section-6 lookup-record index. The reader now drops standalone lookup-only descriptors (exactly as it already drops overlapping descriptors) instead of reconstructing a series at their f[11]-as-OT-start stock-ghost block. The table's values, where they matter, are carried by the consumer variables that call it with a real input -- those are ordinary owners the reader emits under their own names.
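To make the table-versus-series distinction concrete, here is a minimal sketch of a graphical function as a piecewise-linear table evaluated at an explicit input. The lookup function, the sample points, the one_year constant, and the clamping at the table ends are all assumptions for illustration, not the engine's or Vensim's implementation.

```rust
/// Piecewise-linear table lookup over `(x, y)` points sorted by ascending x.
/// ASSUMPTION: inputs outside the table clamp to the end values; this sketch
/// does not claim to mirror any particular engine's extrapolation rule.
fn lookup(points: &[(f64, f64)], input: f64) -> Option<f64> {
    let (first, last) = (*points.first()?, *points.last()?);
    if input <= first.0 {
        return Some(first.1);
    }
    if input >= last.0 {
        return Some(last.1);
    }
    for pair in points.windows(2) {
        let ((x0, y0), (x1, y1)) = (pair[0], pair[1]);
        if input <= x1 {
            // Linear interpolation within the bracketing segment.
            return Some(y0 + (y1 - y0) * (input - x0) / (x1 - x0));
        }
    }
    Some(last.1)
}

fn main() {
    // Hypothetical table: the x-axis is a dimensionless calendar year.
    let table = [(2000.0, 10.0), (2010.0, 20.0), (2020.0, 40.0)];
    // The table alone yields no series; a consumer has to supply the input.
    let one_year = 1.0; // hypothetical unit-conversion constant
    for time in [2000.0, 2005.0, 2015.0] {
        let consumer = lookup(&table, time / one_year);
        println!("t = {time}: consumer = {consumer:?}");
    }
}
```

Only the consumer line produces a value per time step, and only because it passes an input. The table itself has nothing to save as a series, which is consistent with the reader dropping the bare descriptor and keeping the consumer.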
This shrinks the C-LEARN EXPECTED_VDF_RESIDUAL from 21 to 13 with zero regressions (verified by clearn_residual_exactness, grew == []): six lookup-only bases (historical_gdp_lookup, historical_forestry_lookup, rs_gdp_in_trillions, ozone_precursor_forcings, oc,_bc,_and_bio_aerosol_forcings, other_forcings_smooth_plus_rcp85) drop out of the comparison entirely. The nine that remain are lookup tables the model-free reader cannot safely distinguish from a real owner, so it still emits a ghost column the comparator flags:
- rs_hfc* (8): the descriptor forward-links to the wider 2-D consumer RS HFC[COP, HFC type] (forward width 63 != the descriptor's 7), so the conservative width gate declines to drop it.
- ref_global_emissions_from_graph_lookup: its forward link is Time/0.
The deeper bug is in the engine: lowering a bare lookup to gf(Time) (#590) synthesises a phantom series -- a table is not generally a function of time, and using Time as the index is a unit error (it only coincidentally matches in C-LEARN, where the tables' x-axis is calendar year and the consumers call them as LOOKUP(Time / One year)). Fixing that removes the remaining nine from the matched set entirely; tracked as a separate engine follow-up.
Verification
- vdf, vdf_multidim, vdf_alias_decoder, vdf_structural_invariants).
- clearn_residual_exactness and simulates_clearn pass in release (residual exact at 13, zero regressions).
- clippy -D warnings, workspace tests, WASM build, TypeScript, Python).