Close the C-LEARN simulation residual; add CLI GET DIRECT external-data support by bpowers · Pull Request #600 · bpowers/simlin

bpowers · 2026-05-20T22:33:25Z

Summary

Closes the C-LEARN model's simulation residual vs Vensim Ref.vdf as general engine/CLI primitives (no C-LEARN special-casing). Five phases:

Lookup-only variables (standalone graphical functions) now lower uniformly to gf(Time) across scalar / arrayed / apply-to-all, replacing the inconsistent gf(0)-vs-literal-0 lowering.
Shadowed builtin macros: a user :MACRO: RAMP FROM TO(...) is no longer linearized at import time and now resolves through the compile-time macro path, so its exponential branch runs.
Passthrough INITIAL macros (:MACRO: INIT(x) = INITIAL(x)) collapse to the LoadInitial opcode at the call site; plus a general dt-runlist chain-break fix for stock submodel outputs (salsa path now at parity with the legacy path).
Runlist determinism: fixed a pre-existing HashMap-iteration-order nondeterminism in the initials runlist, so the same model always produces the same runlist and output.
Residual attribution + gate: EXPECTED_VDF_RESIDUAL tightened from 32 to 21 bases, each with a sourced category (0 engine-genuine remain; the rest are VDF-reader decode artifacts, benign-near-zero, or :NA:-arithmetic boundaries). Both --ignored C-LEARN gates pass deterministically, with comparator tolerances/floors unchanged (guard-tested). Fixed the scalar VDF-reader decode of standalone graphical-function descriptor columns.
CLI external data: simlin-cli now resolves Vensim GET DIRECT * external data via a FilesystemDataProvider rooted at the model path (stdin uses the null provider).

The engine value is provably correct for every remaining residual base (the divergences are reference-side, in our Ref.vdf test-reader). Follow-ups filed: #595 (init-runlist soundness gap), #597 (arrayed VDF-reader decode), #598 (WASM data-supply API), tech-debt #64. #590 / #591 updated with final disposition.

Test plan

cargo test -p simlin-engine and cargo test -p simlin-cli (default suites) pass.
Release gates pass deterministically: cargo test -p simlin-engine --features file_io --release --test simulate -- --ignored simulates_clearn clearn_residual_exactness.
Human verification steps in docs/test-plans/2026-05-19-clearn-residual.md.

Adds the pure predicate `is_lookup_only(equation, has_tables)` plus unit tests in compiler/mod.rs. A lookup-only variable is a standalone graphical-function holder: it has a graphical function and its equation is empty or the MDL lookup sentinel (no functional input expression). The empty-equation arm is load-bearing: an XMILE-sourced lookup-only variable carries an empty/absent equation plus a gf, while an MDL-sourced one carries LOOKUP_SENTINEL plus a gf -- the sentinel string is MDL-format-internal, not a stable datamodel contract. WITH LOOKUP (tables + a real input expr) and an ordinary aux (no tables) both return false, mirroring mdl::writer::is_lookup_only_equation. The helper is unused in non-test code in this commit (carrying a transient allow(dead_code)); the following commit wires it into the scalar/arrayed/A2A lowering sites and removes the allow.

… semantics A standalone graphical-function ("lookup-only") variable -- one whose entire equation is an inline table with no functional argument -- was lowered inconsistently: the scalar path wrapped it as LOOKUP(self, 0) (a constant gf(0)), while the arrayed and apply-to-all hoisting paths emitted a literal 0 and never consulted the attached tables. None matched genuine Vensim. Detect lookup-only variables (the committed is_lookup_only atom, composed by var_is_lookup_only over the datamodel Equation shape) and lower all three shapes uniformly through the existing scalar Lookup opcode at the determined index. Phase 1 determined that index empirically to be the current simulation time -- gf(Time): applied consumers of C-LEARN's year-indexed lookup tables byte-match gf(Time) against Ref.vdf (with clamp-to-endpoint outside the x-domain), and the importer's MdlEquation::Implicit arm already lowers an unspecified-input gf to equation = "TIME". The index is centralized in lookup_only_index_expr so the three sites stay in lock-step. The arrayed case has per-element tables (tables.len() == n) and wraps each element's own offset off+i (codegen derives elem_off = i, table_count = n); the A2A case has one shared variable-level table (tables.len() == 1) and must wrap the BASE offset off (elem_off 0, table_count 1) -- wrapping off+i there would push the VM Lookup bounds check out of range for i>0 and return NaN. WITH LOOKUP (tables + a real input) keeps its LOOKUP(self, input) lowering; ordinary auxes and stocks are untouched. No new datamodel field, protobuf change, or VM/codegen primitive; the per-element table layout (build_tables / reorder_arrayed_element_tables) and the Lookup opcode are reused. The production salsa-incremental path reuses compiler::Var::new, so the fix applies to both the legacy and incremental paths. Adds C-LEARN-independent generality fixtures (lookup_only_tests.rs) for the scalar, arrayed (non-alphabetical declared order), A2A, and applied (LOOKUP(g, idx) with idx distinct from Time) cases. These were RED before this lowering -- the arrayed/A2A standalone cases produced [0,0,0] and the scalar standalone case produced a flat constant gf(0) [11,11,11] -- and are GREEN with it. Removes the transient allow(dead_code) on is_lookup_only now that production references it, and pins the regular Lookup opcode's out-of-range clamp / NaN-index behavior (unchanged by this fix, AC1.6) with a focused vm test.

…eview) The arrayed lookup-only lowering (`expand_arrayed_with_hoisting`) wrapped `Expr::Var(off + i)` unconditionally, which is only correct when the table layout is per-element. `variable.rs::build_tables` produces one table per element (`tables.len() == n`) ONLY when at least one element carries its own gf; an `Equation::Arrayed` whose element equations are all empty/sentinel with a single variable-level gf falls through to the shared-table branch (`tables.len() == 1`, the same shape A2A always has). In that case `codegen::extract_table_info` derives `elem_off = i` while `table_count == 1`, so the VM's `Lookup` bounds check pushes NaN for every element with i > 0. Make the offset depend on the actual table layout, mirroring how `build_tables` decides it: wrap `off + i` for the per-element layout and the BASE `off` for the single shared table (exactly like `expand_a2a_with_hoisting`). The two layouts are encoded as `Option<LookupOnlyLayout>` so the invalid "lookup-only but neither layout" state is unrepresentable and the caller derives the variant from `tables.len()`. No name/path special-casing; the existing `Lookup` opcode is reused. The MDL/XMILE importers never emit this shape (they attach per-element gfs to an arrayed lookup), but `datamodel::Project` is a public compile input that protobuf/JSON/serde/MCP/pysimlin/libsimlin all preserve verbatim, so a directly-constructed project can reach it. Pinned by a new fixture (`arrayed_lookup_only_variable_level_gf_shares_one_table_at_time`): every element now reads the one shared table at gf(Time) instead of NaN for i > 0.

The MDL->datamodel expression formatter (`xmile_compat.rs`) had a `format_call_ctx` arm that rewrote any 4+-arg `RAMP FROM TO(...)` into a hardcoded linear `(xfrom) + RAMP(slope, tstart, tend)` string at import time, discarding the 5th `islinear` selector and the entire macro body, *before* compile-time macro resolution runs. `RAMP FROM TO` is not a native builtin, so a model that uses it must define a same-named `:MACRO:` -- which the macro-shadows-everything precedence in `builtins_visitor.rs` resolves correctly. The import-time rewrite silently pre-empted that path, so a macro selecting its exponential branch (`islinear = 0`) always ran the linear ramp instead. Deleting the arm lets the call survive import as `RAMP_FROM_TO(...)`, which canonicalizes to `ramp_from_to` and resolves through the existing macro path. No new builtin/opcode/codegen primitive is added. Line 555's `"ramp from to" => "RAMP_FROM_TO"` name mapping is intentionally kept -- it is the active formatting path once the arm is gone (redundant with the catch-all, but self-documenting). Covers clearn-residual.AC2.1/AC2.2/AC2.3/AC2.4. The new discriminating test `macro_ramp_from_to_runs_selected_branch` (exp branch uses double slope so it is provably distinct from the linear branch mid-ramp) and the inverted formatter test `test_ramp_from_to_survives_as_macro_call` were both RED before the arm deletion and are GREEN after. The pre-existing no-regression guard `simulates_macro_clearn_ramp_from_to_mdl` (`islinear = 1`, same linear series via the macro path) still passes. Evidence (the ignored, release-only `clearn_residual_exactness` gate, NOT pruned here -- Phase 4 reconciles `EXPECTED_VDF_RESIDUAL`): five excluded bases no longer fail (`shrank`): im_3_emissions, im_3_emissions_vs_rs, im_3_ff_co2, relative_emissions_to_equity, relative_emissions_to_equity_target. Two bases newly enter the residual (rs_co2_ff, rs_gdp_in_trillions) -- Phase 4 re-measures and reconciles the gate after Phases 1-3.

Record the audit conclusion above `format_call_ctx`'s match: arms that *restructure* a builtin-named call into a different expression at import time pre-empt any same-named user `:MACRO:`, because the rewrite runs before compile-time macro resolution. The `ramp from to` arm was the one C-LEARN tripped (now removed); the other restructuring arms (`sample if true`, `pulse`, `pulse train`, `modulo`, `zidz`, `xidz`, `get data between times`, etc.) carry the same latent hazard but are not currently shadowed by any in-repo model -- if a future model defines such a macro, the arm must be guarded the same way. Name-preserving renames are safe and need no change: `SSHAPE` is handled only by `format_function_name` (a genuine 3-arg builtin, not restructured here), so a same-named macro shadows it cleanly. The existing `ac5_4_macro_shadows_ramp_from_to_builtin` and `ac5_4_macro_shadows_sshape_builtin` tests already pin both the multi-word and single-word shadowing cases, so no new test is added. Documentation only; no runtime behavior change. Covers clearn-residual.AC2.5.

Add the pure structural classifier classify_passthrough and its PassthroughBuiltin result type. A genuine passthrough macro is a single-parameter, single-output macro whose primary-output body is exactly out = BUILTIN(param) where BUILTIN canonicalizes to the same renamed-builtin-collision name as the macro itself -- the self-call shape the MDL importer's INITIAL -> INIT rename produces for :MACRO: INIT(x) = INITIAL(x). This is the Functional-Core decision a later commit uses to collapse such a macro directly to its proven opcode (LoadInitial) at the call site, bypassing the buggy per-element synthetic module. The strict structural match (single param; no additional outputs; body exactly a single one-argument call; the argument is the bare parameter; the call canonicalizes to the macro name; that name is a renamed-builtin collision) cannot misfire on a non-passthrough macro that merely shares a builtin name. The items carry a temporary allow(dead_code) until the next commit threads them onto ModuleFunctionDescriptor.

Thread the passthrough classification onto ModuleFunctionDescriptor as a new passthrough: Option<PassthroughBuiltin> field, populated in MacroRegistry::build via the new classify_macro_passthrough bridge: it locates the primary-output body variable, parses its single scalar equation (the only place each macro body is parsed for this), and calls the pure classify_passthrough. Stdlib descriptors and non-passthrough macros -- including Phase 2's RAMP FROM TO, which is not a passthrough -- get None. The field participates in PartialEq/Eq (PassthroughBuiltin derives them) so salsa invalidation of project_macro_registry stays correct, and the salsa::Update derivation is preserved. This commit adds the field and its classification only; the call site does not consume it yet, so there is no runtime behavior change.

…the call site A user macro :MACRO: INIT(x) = INITIAL(x) collides (after the importer renames Vensim INITIAL -> INIT) with the renamed builtin and, under macro-shadows-everything precedence, expands every INITIAL call into a buggy per-element synthetic module whose value is mis-ordered / mis-propagated for an element-wise recurrence (#591-c1). The call site now collapses a GENUINE passthrough macro (descriptor.passthrough set by the registry classifier) directly to the proven opcode by skipping expand_module_function and falling through to the existing renamed-builtin intrinsic routing -- init -> LoadInitial with the existing make_temp_arg hoisting. This generalizes the #554 self-call exception from inside the macro body to the call site. The self-call invariant the classifier guarantees (canonicalize(call) == canonicalize(macro_name) AND is_renamed_builtin_macro_collision) makes the fall-through route func to the right intrinsic. equation_is_module_call gets the same treatment: a passthrough macro is no longer pre-classified as a module call, so collect_module_idents agrees with the call site (the variable is the scalar opcode it collapses to, not a multi-slot module). This is the same module-vs-opcode decision made consistently at both layers -- a passthrough INIT macro now behaves exactly like the bare INIT builtin it collapses to (which is likewise not a module call), the way the proven bare-INITIAL recurrence already does. No NA-arithmetic change (float.rs::NA untouched). The two fixtures added here were RED before this collapse and are GREEN after: the element-wise macro_init_recurrence fixture (helper_recurrence + the :MACRO: INIT block) dropped to ecc=0 at t>=1 via the synthetic module and now holds 1/2/4 across all saved steps (AC3.2); the inline scalar capture holds INITIAL TIME constant (AC3.1). The bare-INITIAL helper_recurrence path is unchanged (AC3.3).

A non-module variable that reads a synthetic module's STOCK output (e.g. `v = SMOOTH(input, dt)` -> `v` reads the SMOOTH's INTEG `output`) kept a spurious `v -> module` dt dependency in the salsa runlist: `normalize_deps` collapses the `module·output` reference to the bare module name, and unlike the legacy `model.rs::module_output_deps` path (which omits a stock output's module dep in the dt phase) `build_var_info` only applied that stock-output chain-break to MODULE readers, not ordinary ones. Because a module is a sink in the cycle relation (`dt_walk_successors`) but carries its input `src` as a direct dep in `dt_dependencies`, the spurious `reader -> module` edge closes an ordering cycle that cycle DETECTION never sees. `topo_sort_str` then broke that cycle arbitrarily (HashMap-iteration dependent), sometimes emitting the SMOOTH module BEFORE its input helper. The mis-ordered element read a stale (0) input every flows step and decayed to 0 at t>=1 while matching at t0 -- the C-LEARN `emissions_with_stopped_growth` (SAMPLE IF TRUE -> PREVIOUS-self + SMOOTH whose input feeds back through the emissions chain) #591-c1 residual, and a latent nondeterministic runlist for any such stock-output feedback. The fix moves the existing stock-submodel-output filter so it applies to every dt reader (mirroring the legacy gate), and resolves the (synthesized, implicit) module-instance name to its model to test the output's kind. Only `dt_deps` is filtered; the init phase keeps the edge (a stock's init value is computed during init, so it is a real init dependency). Adds an inline-MDL regression asserting every element of an arrayed SAMPLE-IF-TRUE+SMOOTH with a dt-only input feedback holds its smoothed constant at every saved step.

…stdoc) - simulate.rs (Minor 1): correct the scalar AC3.1 test's doc comment. It is NOT a value RED->GREEN discriminator for the Task 4 synthetic-module collapse (the scalar non-recurrence INITIAL-capture is handled correctly by the synthetic-module path; the implementation-time "RED" was a NotSimulatable compile inconsistency, not a wrong value). The comment now states it asserts the AC3.1 held-constant invariant plus the compile-consistency the model.rs passthrough-exclusion enables, and points to the element-wise test as the genuine value discriminator. Assertions are unchanged. - module_functions.rs (Minor 2): add classify_passthrough gate-5 unit test (single-arg previous(x) in an init-named macro -> None) so the self-call name check is exercised in isolation; the existing different-call-name test uses a TWO-arg body and is rejected earlier at the single-arg gate. - model.rs (Minor 3): add the passthrough-exclusion caveat to the equation_is_module_call rustdoc so it aligns with the inline comment -- a genuine passthrough macro is not a module call here and behaves like the bare builtin it collapses to.

The Phase 3 call-site collapse (#591-c1) added a special case to equation_is_module_call: a genuine passthrough macro was excluded from the module-call classification, with a comment claiming that otherwise collect_module_idents would mark the variable module-backed while the walk emits a scalar opcode, so the fragments would fail to compile. Investigation falsifies that claim. The module_idents set this classification feeds is consumed in builtins_visitor only via is_module_backed_ident, which gates whether a *referencing* PREVIOUS/INIT synthesizes a scalar temp arg. A passthrough macro collapses to a plain flat-slot variable, so that temp-arg copy is value-identical to reading the slot directly -- the classification changes no observable result. Toggling the exclusion (active vs neutralized) on a scalar and an arrayed fixture where a passthrough-collapsed value is referenced inside a downstream PREVIOUS and INITIAL produced byte-identical series in both states, and the full default + file_io simulate suites stay green either way. No shape exercises the exclusion. Per "simple, general, testable code is better," remove the exclusion so a passthrough caller is classified like any other resolved macro, and rewrite the model.rs comment and the AC3.1 scalar-test doc to state only verifiable facts: the scalar test asserts the held-constant invariant and is not the value RED/GREEN discriminator for the collapse (the synthetic-module path already produces the constant for the scalar case); the element-wise recurrence test remains the genuine discriminator (it drops to 0 with the collapse neutralized). No NA-arithmetic change, no C-LEARN special-casing, EXPECTED_VDF_RESIDUAL untouched.

A lookup-only variable that Vensim saves *only* as a graphical-function descriptor record (no separate consumer-owner record) has its `f[11]` holding a section-6 lookup-record index, not an OT-block start. When that index value reinterpreted as an OT-start lands in `[1, OT_count)` on a slot with an owner class code, `decoded_record_spans` emits a span for it; and because `identify_descriptor_records` only peels descriptors that sit in an *overlapping* OT component, a standalone descriptor (which collides with nothing) slips through as an owner and decodes at that spurious `f[11]`-as-OT-start ghost slot -- a class-0x08 stock slot holding 0/garbage. The real series is the forward-linked evaluated-output OT `lookup_record[f[11]].word[10]` (proven byte-exact vs the engine for "Global Emissions from graph LOOKUP" -> OT 1612). The fix adds a pure functional-core `standalone_descriptor_rebinds` that recognises such a record and re-binds it to its forward OT, gated to avoid disturbing legitimate owners: the span is non-overlapping; `f[11]` is a valid lookup-record index; the span is scalar (an arrayed descriptor re-bound to one forward OT would scalarize and lose its element columns -- deferred); the `f[11]`-as-OT-start slot carries the STOCK class code (a graphical-function variable is never a stock, so a stock slot is the spurious-owner telltale -- a legitimate scalar owner whose `f[11]` is coincidentally a small lookup index carries a non-stock 0x11 code and is left untouched); and the forward link is a valid data OT (never Time/0). `build_record_result_columns` emits the re-binds as scalar columns at the forward OT, sharing it with any real consumer-owner (Vensim stores one evaluated series consumed under several names) rather than claiming it. On C-LEARN this reconciles exactly two scalar lookup-only bases that decode byte-exact vs the engine -- `global_emissions_from_graph_lookup` and `other_forcings_composite_plus_rcp85` -- moving them into the `clearn_residual_exactness` `shrank` set. The other named scalar lookup-only bases (`oc,_bc,_and_bio_aerosol_forcings`, `other_forcings_smooth_plus_rcp85`, `ozone_precursor_forcings`, `ref_global_emissions_from_graph_lookup`) are NOT reconciled: their forward links point at a foreign/shared consumer OT or at Time, which differs from the engine's lookup-only output -- Vensim stores no distinct column for them, a genuine residual. The arrayed lookup-only subset (`rs_*`, `historical_*_lookup`) is explicitly deferred (the scalar gate leaves their decode unchanged) because it needs element-order info the VDF format does not store on disk; a follow-up issue will track it. `EXPECTED_VDF_RESIDUAL` is untouched here (that pruning is the gate-tightening task), so `clearn_residual_exactness` still fails overall with the two newly-reconciled bases reported in `shrank`. The existing VDF reader corpus tests (`vdf`, `vdf_alias_decoder`, `vdf_multidim`) stay green. Covered by a non-ignored synthetic regression test on a minimal record set independent of any C-LEARN name.

The Initials runlist was a function of the per-process HashMap RandomState seed, not of the model: `model_dependency_graph_impl` materialized its candidate set (`init_list`) from a `HashSet` and handed it straight to `topo_sort_str`, which emits `names` in visit order and breaks ties -- variables with no ordering dependency between them, e.g. independent constants, stocks, and the lagged-input-stripped `PREVIOUS()`/`INITIAL()` helpers -- by exactly that order. Two compiles of the SAME model therefore produced different init orderings, and for any unordered pair that feeds the init value of a `PREVIOUS()`/`INITIAL()` variable, different (one possibly wrong) initial values. The Flows and Stocks runlists were already deterministic because they filter the pre-sorted `var_names`; only the initials phase leaked hash order. Fix: sort `init_list` before `topo_sort_str`, matching the flows/stocks contract. With a sorted `names` argument and BTreeSet-stored dependency edges, `topo_sort_str`'s entire traversal -- including how it breaks any residual ordering cycle -- is a deterministic function of the model. This is general (deterministic simulation for all models), not C-LEARN specific, and changes nothing about NA-arithmetic or tolerances. Pre-existing, not introduced by this branch: the `init_set.into_iter().collect()` pattern dates to the original salsa incremental-compilation work (PR #289), long before the C-LEARN residual branch. It only became observable now because C-LEARN's residual cells sit near the 1% cross-simulator tolerance, where a tiny order-induced floating-point delta flips a verdict. Empirically: across 12 fresh-DB compiles of C-LEARN the initials runlist had 12 distinct orderings before this change and 1 after (flows/stocks were 1 both times), and the live residual set is now byte-identical run to run. Relation to GH #595: this resolves the HashMap-iteration-order nondeterminism #595 identifies as the proximate mechanism (the arbitrary, hash-dependent init tie-break is now a deterministic function of the model). It does NOT close #595's deeper soundness gap -- the detection-vs-ordering edge-set inconsistency that lets an init ordering cycle through a synthetic-module stock output exist undetected; such a cycle is now ordered deterministically rather than reported. No in-repo model triggers that shape today. A regression test pins the property directly (no probability): `initials_runlist_is_deterministic_across_fresh_databases` asserts a byte-identical runlist across 32 freshly-seeded databases, and `initials_runlist_is_sorted_topological_order` asserts the order equals the stable (sorted-name tie-break) topological sort reconstructed from the engine's own `initial_dependencies`. Both fail if the sort is removed.

Phases 1-3 plus the init-runlist-determinism fix (e24b008) reconciled the bulk of the C-LEARN-vs-Ref.vdf residual. Re-measuring the live failing set through the unchanged classify_vdf_ident comparator (run twice, byte-identical: 3482 idents matched; 21 live residual bases) shrinks EXPECTED_VDF_RESIDUAL from 32 base variables to exactly 21, each now carrying a one-line sourced reason under a five-category taxonomy (no "unknown"): - engine-genuine-tracked (0): NONE. The only engine-genuine divergence, init-runlist nondeterminism (which permuted depth_at_bottom's per-layer init values), was fixed in e24b008; #595's deeper init-ordering soundness gap remains tracked. - VDF-decode-artifact (17): engine output proven correct (gf(Time) matches the model tables; applied consumers match Ref.vdf exactly); our Ref.vdf reader mis-decodes the standalone graphical-function descriptor columns. Scalar half fixed in d69754b; the arrayed element-order half (13 bases) and the 4 scalar no-distinct-column cases are tracked in #597. - benign-near-zero (2): co2eq_gap_closing_percentage (ratio of near-equal small values, peak ~1.1e-2) and diffusion_flux (~2% on a small early transient, matches late) -- cross-simulator noise on near-zero magnitudes. - NaN-vs-:NA: (0): NONE by construction. Where Vensim writes literal IEEE NaN and Simlin writes the finite :NA: sentinel (slr_inches_from_2000), the comparator NaN-skips those cells and the finite tail matches, so the series is neither all-NaN nor failing and never enters the failure set. - boundary (2): historical_gdp and last_set_target_year -- match everywhere except :NA:-arithmetic cells (Vensim -2*NA = -1.298e33 vs Simlin's bare sentinel -6.49e32). NA-arithmetic is confirmed correct and out of scope. The comparator's five constants (VDF_RTOL, K_ATOL, MIN_MATCHED_FRACTION, MIN_MATCHED_ABSOLUTE, MAX_NAN_SKIPPED_FRACTION) and the matched-floor logic are byte-identical to before -- this is a membership tightening of the documented exclusion, never a tolerance loosening. Two new non-ignored guard tests pin that, on synthetic inputs independent of C-LEARN: - vdf_comparator_constants_and_floor_are_pinned: pins the five constants and proves a below-floor matched count still PANICS (the gate, which checks the floor AFTER exclusion, cannot pass vacuously). - classify_vdf_ident_nan_vs_na_skips_without_failing: proves a NaN-vs-:NA: series yields nan_skipped cells and zero failures, so it never enters the residual set. Both --ignored gates pass deterministically: simulates_clearn (3360 matched after exclusion vs a 348 floor, ~9.7x, no tolerance violations) and clearn_residual_exactness (live set == the 21-base list, no grew/no shrank, identical across two runs). Issue dispositions posted via gh: - #590: the engine 0+0 graph-lookup zeroing is FIXED (Phase 1, lookup-only lowers to gf(Time)); the residual is now a VDF-reader decode artifact (engine proven correct), scalar half fixed (d69754b), arrayed half tracked in #597. Recommended closing #590 (left for a human to confirm). - #591: per-cluster final disposition -- RAMP FROM TO (Phase 2), passthrough INIT + dt-runlist ordering (Phase 3), and runlist determinism (e24b008) reconciled the bulk; the remainder is benign-near-zero (2) plus the :NA:-arithmetic boundary (2, documented genuine remainder). - #595: the nondeterminism half is fixed by e24b008 (deterministic init runlist); the deeper init-ordering soundness gap remains.

Address three Phase 4 review items; all are test-quality fixes, no production logic changes. - VDF standalone-descriptor rebind: add a negative unit test (legit_dynamic_owner_blocked_only_by_stock_slot_guard) where the STOCK-slot guard is the SOLE condition rejecting the re-bind -- the span is non-overlapping, scalar, has a valid f[11] lookup index, and its forward link resolves to a valid in-range owner OT, so guards 1/2/3/5 all pass and only the 0x08 STOCK requirement at span.start rejects. The pre-existing legit_dynamic_owner test had the forward link out of range, so guard 5 rejected first and the STOCK guard was never load-bearing. Mutation-verified: removing guard 4 flips the new test to a non-empty re-bind ({0: 2}) while all other tests stay green. - initials_runlist_is_sorted_topological_order: document that the DFS post-order reference reconstruction assumes an acyclic initials runlist (no resolved init SCC); a resolved SCC would be emitted SCC-contiguous by topo_sort_str's condensation and diverge. - classify_vdf_ident_nan_vs_na_skips_without_failing: clarify the test covers only the partial-NaN / finite-tail case, and make the boundary executable with an assertion that an entirely-NaN VDF series flips all_nan true and WOULD enter the residual set (the separately-guarded exception).

The CLI's Vensim open path called the plain open_vensim(contents), which uses the null DataProvider, so any model with a GET DIRECT DATA/CONSTANTS/ LOOKUPS/SUBSCRIPT reference failed to open from simlin-cli with "external data file '...' referenced but no DataProvider configured" -- the engine treats a missing provider as a hard error, not a silent zero. Extract the InputFormat::Vensim arm body into open_vensim_model(path, contents). When the model is a real file on disk, build a FilesystemDataProvider rooted at the file's parent directory and call open_vensim_with_data, mirroring the engine's simulate_mdl_path_with_data test helper; relative references like 'data/a.csv' then resolve to a sibling of the model file (Vensim's relative-to-model semantics). stdin (/dev/stdin, a character device) and any non-regular-file path fall back to the null provider, so /dev is never used as a data root. The extraction landed first keeping the null-provider behavior, so the data-driven test failed RED on behavior ("no DataProvider configured") before the provider was wired, then passed GREEN after. AC5.1/AC5.2: a #[cfg(test)] module exercises the directconst fixture (GET DIRECT CONSTANTS reading data/a.csv cell B2 = 2050) through open_vensim_model and the incremental-salsa run path, asserting the data-driven value (not zero/NaN). AC5.3: a model referencing a missing CSV yields a clear file-level diagnostic ("cannot resolve data file '...'") rather than a silent zero. WASM/libsimlin default paths are untouched (still on the null provider; libsimlin already exposes simlin_project_open_vensim_with_data for the data case).

…data open_vensim_model previously decided whether to build a FilesystemDataProvider via path.is_file() on the input path, which for stdin is the synthesized /dev/stdin sentinel. That guard holds for a TTY or pipe (is_file() == false -> null provider), but is wrong when stdin is redirected from a regular file: `simlin simulate < model.mdl` makes /dev/stdin resolve to a regular file, so is_file() == true and a provider gets built rooted at /dev/stdin's parent (/dev). The rustdoc and inline comment claimed "no provider is built" for stdin, which the code did not actually guarantee for the `< file` case. Key the decision on the user's intent -- whether a real model path was passed on the command line -- rather than filesystem metadata of the sentinel. open_vensim_model now takes Option<&Path>: Some(model_path) builds a provider rooted at the model's parent (bare filename -> CWD, as before), None (stdin, whether a pipe or a `< file` redirection) uses the null provider. The call site threads input.path (genuinely None for stdin) straight through. Behavior for real path arguments is unchanged. A unit test pins the stdin branch: open_vensim_model(None, ..) on a GET DIRECT model's contents must surface the "no DataProvider configured" error, proving no provider is built for stdin. The doc now states only what the code guarantees.

…debt #64)

The C-LEARN residual work (#590/#591) landed five general import/simulation primitives whose contracts the per-component CLAUDE.md files did not yet reflect: - engine: standalone lookup-only variables lower uniformly to gf(Time) (LookupOnlyLayout PerElement/Shared); genuine passthrough macros collapse to the builtin opcode (classify_passthrough + PassthroughBuiltin); the initials runlist is now deterministic (sort before topo_sort_str, #595) and the dt stock-submodel-output chain-break applies to all readers; xmile_compat carries a macro-shadowing audit (RAMP FROM TO arm removed); the VDF reader re-binds standalone graphical-function descriptors. - cli: GET DIRECT external data resolves via a FilesystemDataProvider rooted at the model path (stdin uses the null provider). Bumps the engine freshness lead to 2026-05-20 and adds a freshness date plus the lookup_only_tests.rs entry. Docs-only.

claude · 2026-05-20T23:14:29Z

Code review — no blocking findings

Reviewed all production changes (CLI external-data resolution, lookup-only lowering, passthrough-macro collapse, dep-graph determinism + dt chain-break, RAMP FROM TO macro-shadowing, and the VDF standalone-descriptor rebind). I traced each change against the surrounding source and could not identify a discrete, introduced bug that meaningfully impacts correctness.

Notes from the verification I ran (all checked out as correct):

Lookup-only lowering (compiler/mod.rs): the scalar/A2A/arrayed offset logic is sound — Shared wraps base off (element_offset == 0, table_count == 1) and PerElement wraps off + i (matches build_tables' row-major declared-index layout). Confirmed is_lookup_only fires on the MDL Lookup sentinel ("0+0") while Implicit ("TIME") already reaches gf(Time) via the normal path, so the fix targets exactly the case that was wrong. WITH LOOKUP correctly stays non-lookup-only.
dt stock-submodel-output chain-break (db_dep_graph.rs): extending keep_dt_dep to all readers (not just module-kind) drops only submodel·subvar edges where subvar is a Stock, matching the legacy model.rs::module_output_deps gate; stock outputs are prior-timestep reads, so removing the same-step edge is correct. Init phase keeps the edge. dt_deps/VariableDeps/ImplicitVarDeps types (BTreeSet<String>) align.
Deterministic init runlist: sorting init_list before topo_sort_str is a pure determinism improvement.
VDF rebinds (record_results.rs): descriptor_indices and rebinds are consistently keyed by rec_idx (matching the owner-emit filter at the consumption site); emit_descriptor_rebinds correctly avoids shadowing existing owners and dedupes by lowest rec_idx.
RAMP FROM TO arm removal (xmile_compat.rs): an intentional, documented change that matches genuine Vensim semantics (the function is macro-defined, not native), with the test updated accordingly.

One minor non-blocking observation: the empty-equation arm of is_lookup_only (documented as "load-bearing") is unreachable in the scalar/A2A lowering path because an empty equation yields ast == None, which hits the EmptyEquation early-return before var_is_lookup_only is consulted. This is defensive rather than functional — no MDL import produces empty-equation + tables (they get "0+0" or "TIME"), and it is not a regression — so it does not affect any real input.

Overall correctness verdict: correct. The patch is well-tested, internally consistent, and does not appear to break existing code or behavior.

codecov · 2026-05-21T00:30:48Z

Codecov Report

❌ Patch coverage is 93.92097% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.86%. Comparing base (37f9d6b) to head (666bd0c).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
src/simlin-engine/tests/simulate.rs	84.37%	15 Missing ⚠️
src/simlin-cli/src/main.rs	80.00%	2 Missing ⚠️
src/simlin-engine/src/compiler/mod.rs	98.21%	1 Missing ⚠️
src/simlin-engine/src/module_functions.rs	97.05%	1 Missing ⚠️
src/simlin-engine/src/vdf/record_results.rs	97.50%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #600      +/-   ##
==========================================
+ Coverage   82.79%   82.86%   +0.06%     
==========================================
  Files         260      260              
  Lines       69286    69576     +290     
==========================================
+ Hits        57367    57653     +286     
- Misses      11919    11923       +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

bpowers added 21 commits May 20, 2026 15:58

docs: add C-LEARN residual implementation plan

85d90e0

docs: track doubled ImportError wrapping in CLI error messages (tech-…

544f397

…debt #64)

docs: add test plan for C-LEARN residual closure

666bd0c

bpowers force-pushed the clearn-residual branch from cb3bbb0 to 666bd0c Compare May 20, 2026 23:05

bpowers merged commit 59e48a9 into main May 20, 2026
13 checks passed

bpowers deleted the clearn-residual branch May 20, 2026 23:10

bpowers mentioned this pull request May 21, 2026

engine: standalone lookup-only variable is lowered to gf(Time) (unit error; phantom Time-indexed series for a bare graphical function) #606

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Close the C-LEARN simulation residual; add CLI GET DIRECT external-data support#600

Close the C-LEARN simulation residual; add CLI GET DIRECT external-data support#600
bpowers merged 21 commits into
mainfrom
clearn-residual

bpowers commented May 20, 2026

Uh oh!

Uh oh!

claude Bot commented May 20, 2026

Uh oh!

codecov Bot commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bpowers commented May 20, 2026

Summary

Test plan

Uh oh!

Uh oh!

claude Bot commented May 20, 2026

Code review — no blocking findings

Uh oh!

codecov Bot commented May 21, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant