From bd2bdd453415be42a2c57461b9b0a44333b88293 Mon Sep 17 00:00:00 2001 From: Bobby Powers Date: Thu, 21 May 2026 08:59:30 -0700 Subject: [PATCH] engine: lookup-only tables produce no series (#606) A standalone Vensim lookup -- a graphical function with no driving input -- is a table indexed by an explicit input, not a time series. Lowering it to gf(Time) (the #590 import fix) synthesised a phantom Time-indexed series: a unit error in general, and a value genuine Vensim never saves. Such a table is now a first-class non-value-bearing static table (Variable::Var::is_table_only / db::source_var_is_table_only): excluded from the runlist and the saved output, its data reached only through LOOKUP(table, x) call sites that resolve the table by ident. A lookup-table reference is a layout reference, not a data-flow dependency, so a new BuiltinContents::LookupTable walk variant keeps it off the dependency/causal/LTM graphs while referenced_tables on VariableDeps/ImplicitVarDeps re-supplies the fragment compiler's metadata and tables map. A bare reference to a table (no argument) is now a compile error (LookupReferencedWithoutArgument); an empty equation paired with a graphical function is no longer flagged as EmptyEquation. The MDL importer emits the canonical empty-equation form instead of the "0+0" sentinel, still accepted on read for already-serialized models (full removal tracked in #608). The dead gf(Time) lowering is removed; the C-LEARN EXPECTED_VDF_RESIDUAL carve-out shrinks from 13 to 4. --- src/libsimlin/src/lib.rs | 5 + src/simlin-engine/CLAUDE.md | 8 +- src/simlin-engine/src/ast/expr2.rs | 3 +- src/simlin-engine/src/ast/mod.rs | 5 +- src/simlin-engine/src/builtins.rs | 11 +- src/simlin-engine/src/common.rs | 6 + src/simlin-engine/src/compiler/mod.rs | 281 +---------- src/simlin-engine/src/db.rs | 114 +++-- src/simlin-engine/src/db_dep_graph.rs | 20 +- src/simlin-engine/src/db_dep_graph_tests.rs | 2 + src/simlin-engine/src/db_implicit_deps.rs | 12 + src/simlin-engine/src/db_ltm_ir.rs | 4 + src/simlin-engine/src/db_var_fragment.rs | 50 +- src/simlin-engine/src/lookup_only_tests.rs | 459 ++++++++++-------- .../src/mdl/convert/variables.rs | 29 +- src/simlin-engine/src/mdl/mod.rs | 15 +- src/simlin-engine/src/mdl/writer.rs | 18 +- src/simlin-engine/src/variable.rs | 159 +++++- src/simlin-engine/tests/simulate.rs | 62 +-- 19 files changed, 702 insertions(+), 561 deletions(-) diff --git a/src/libsimlin/src/lib.rs b/src/libsimlin/src/lib.rs index 11eaa6516..ca8f6897e 100644 --- a/src/libsimlin/src/lib.rs +++ b/src/libsimlin/src/lib.rs @@ -243,6 +243,11 @@ impl From for SimlinErrorCode { engine::ErrorCode::DimensionInScalarContext => SimlinErrorCode::Generic, engine::ErrorCode::BadOverride => SimlinErrorCode::BadOverride, engine::ErrorCode::UnsupportedForSerialization => SimlinErrorCode::Generic, + // A bare reference to a lookup table (used as a value without an + // argument) is, at the FFI granularity, a generic model error; the + // precise distinction is preserved in the engine-level `ErrorCode` + // and the error's `details` message (issue #606). + engine::ErrorCode::LookupReferencedWithoutArgument => SimlinErrorCode::Generic, } } } diff --git a/src/simlin-engine/CLAUDE.md b/src/simlin-engine/CLAUDE.md index e7671c88e..46c54dc6d 100644 --- a/src/simlin-engine/CLAUDE.md +++ b/src/simlin-engine/CLAUDE.md @@ -2,7 +2,7 @@ Core simulation engine for system dynamics models. Compiles, type-checks, unit-checks, and simulates SD models. See the root `CLAUDE.md` for full development guidelines; this file maps where functionality lives. -**Last updated: 2026-05-20 (C-LEARN residual closure (#590/#591), as five general Vensim import/simulation primitives, not model-specific patches: (1) a standalone lookup-only variable -- a graphical-function holder with no functional input -- lowers uniformly to `gf(Time)` across scalar/A2A/arrayed shapes (`src/compiler/mod.rs`: `var_is_lookup_only`/`is_lookup_only`/`lookup_only_index_expr` + `LookupOnlyLayout { PerElement, Shared }`); (2) a genuine passthrough macro `:MACRO: INIT(x) = INITIAL(x)` collapses to the builtin opcode at the call site (`module_functions.rs`: `classify_passthrough` + a `passthrough: Option` field on `ModuleFunctionDescriptor`, classified at `MacroRegistry::build`), instead of expanding a buggy per-element synthetic module; (3) the import-time XMILE formatter no longer linearizes a shadowed `RAMP FROM TO` macro -- `xmile_compat.rs::format_call_ctx` carries a macro-shadowing audit and the `ramp from to` restructuring arm was removed so the call survives as `RAMP_FROM_TO(...)` and resolves through the macro path; (4) the simulation **initials runlist is now deterministic** (`db_dep_graph.rs` sorts the init set before `topo_sort_str`, GH #595) and the dt stock-submodel-output chain-break applies to ALL readers (parity with the legacy `model.rs::module_output_deps` gate); (5) the VDF reader re-binds a **standalone** graphical-function descriptor to its forward-link output OT (`record_results.rs::standalone_descriptor_rebinds`). New `#[cfg(test)] lookup_only_tests.rs`; the C-LEARN `EXPECTED_VDF_RESIDUAL` carve-out (`tests/simulate.rs`) is now an exact, taxonomy-attributed remainder pinned by `clearn_residual_exactness`. simlin-cli now resolves `GET DIRECT *` external data via a `FilesystemDataProvider` -- see its CLAUDE.md.) Earlier: Element-level cycle resolution + genuine-Vensim VECTOR ELM MAP/SORT ORDER + `:NA:` sentinel, the work that made C-LEARN compile via the incremental path, run to FINAL TIME, and match genuine Vensim (`Ref.vdf`) within the 1% cross-simulator tolerance on the matched floor. The whole-variable `model_dependency_graph` cycle gate now refines a recurrence SCC to an element-acyclic verdict over a cross-member-comparable symbolic `SymVarRef` element graph (`db_dep_graph.rs`: `resolve_recurrence_sccs`/`refine_scc_to_element_verdict`/`symbolic_phase_element_order`, GH #575); a resolved SCC's per-element symbolic segments are interleaved into one combined fragment along the SCC's `element_order` and injected at `assemble_module` (`db.rs`: `combine_scc_fragment`/`var_phase_symbolic_fragment_prod`). Per-variable lowering moved to a new sibling module `db_var_fragment.rs` (`lower_var_fragment`). `crate::float::NA` is the finite Vensim `:NA:` sentinel (`-2^109`, NOT IEEE NaN); both `:NA:` paths route to it. New top-level VM-adjacent modules `vm_vector_sort_order.rs` (arrayed VECTOR SORT ORDER, per-iterated-slice 0-based ranks, #585) and `vm_vector_elm_map.rs` (base+full-source, OOB→NaN, no modulo). New `Opcode::LookupArray` (per-element arrayed-GF apply → array view, #580); `src/compiler/symbolic.rs` gains cross-fragment GF de-duplication (`GfDedup`, #582) and `TempStrategy { Recycle, Sum }`. Per-element graphical-function tables now lay out by element-name → declared dimension index (`variable.rs::reorder_arrayed_element_tables`, `db.rs::extract_tables_from_source_var`), not `Equation::Arrayed` Vec position. New tests: `db_dep_graph_tests.rs`, `db_combined_fragment_tests.rs`, `per_element_gf_tests.rs`; `tests/simulate.rs` `simulates_clearn` is un-stubbed (`#[ignore]` for runtime only) with a hardened `ensure_vdf_results` comparator + `EXPECTED_VDF_RESIDUAL` carve-out.) Earlier: Vensim macro support, Phases 1-7 complete. `:MACRO:`/`` definitions import as macro-marked `datamodel::Model`s (`Model.macro_spec: Option`, persisted through protobuf/JSON/schema); single-output macros inline through `BuiltinVisitor`, multi-output (`:`-list) ones materialize at import. New top-level modules: `module_functions.rs` (the unified `ModuleFunctionDescriptor`/`MacroRegistry` resolver+validator for stdlib functions *and* macros, the shared `is_renamed_*` collision predicates) and `db_macro_registry.rs` (the `project_macro_registry` salsa query + sync-time `macro_registry_build_error`); `SourceProject` gains `macro_registry_build_error`, `SourceModel` gains `macro_spec`; `ErrorCode::DuplicateMacroName`; new `tests/metasd_macros.rs` (gated on `file_io`). LTM arrays hardening, Phases 1-8 complete. Phase 7: #502 per-element graphical-function static link polarity -- when an arrayed source feeds an arrayed graphical-function target, `lookup_table_polarity` folds the per-element `tables` list on `Variable::Var` into one link polarity, falling back to `Unknown` for the multi-dim case; #492 the GF strict-monotonicity check uses a y-range-relative epsilon (`max(EPSILON, range_rel * (y_max - y_min))`) so numeric-import noise no longer flips a monotone lookup table to `Unknown`. Phase 6: #483 analytic STDDEV ceteris-paribus partial -- `generate_nonlinear_partial` builds the unrolled population-variance `sqrt` formula for STDDEV (divisor `N`, matching `vm.rs::Opcode::ArrayStddev`) instead of the delta-ratio stand-in; RANK keeps the delta-ratio (an order statistic, unreachable via real models since it returns an array) with a documented justification, pinned by `test_generate_rank_keeps_delta_ratio`. Phase 5: #515 budgeted cross-element-through-aggregate loop recovery -- `recover_cross_agg_loops` drops the old `MAX_AGG_PETALS = 8` hard drop for a deterministic petal priority + a threaded `agg_loop_budget` loop-count budget (`MAX_CROSS_AGG_LOOPS = 256`, `#[cfg(test)]`-overridable via `AggLoopBudgetGuard`; `MAX_AGG_PETALS` survives as a soft per-agg petal cap), surfaces truncation on `LtmVariablesResult.agg_recovery_truncated` + a `Warning`, and enumerates each disjoint petal subset's distinct *cyclic orderings* (`cyclic_orderings(m)` -- (m-1)!/2 for m≥3, mirror reversals skipped, via Heap's algorithm) instead of one ordering per subset. Phase 4 (2026-05-12): #514 sliced-reducer hoisting -- `AggNode.read_slice`, read-slice-driven element graph / link scores, dynamic-index carve-out reclassified as `DynamicIndex`; arrayed synthetic aggs route through agg-half link-score emitters with subscripted agg names and a subscripted Δsource denominator in the diagonal case (strict-prefix broadcast over-subscribes -- GH #528); mapped-dimension sliced reducers stay conservative; a scalar feeder of a hoisted reducer emits a bare element-graph node. Phases 1-3: the `model_ltm_reference_sites` classification IR (`db_ltm_ir.rs`), the consolidated `reducer_kind`/`ReducerKind` table in `ltm_agg.rs`, element-level A2A `Loop::stocks` + per-slot `loop_partitions` (#487), iterated-dimension subscripts ⇒ `Bare` (#511), disjoint-dim arrayed→arrayed per-source-element link scores + the unscoreable-edge `Warning` (#510).)** +**Last updated: 2026-05-21 (#606: a standalone lookup-only variable -- a graphical-function holder with no functional input -- is now a non-value-bearing **static table** (`Variable::Var::is_table_only` / `db::source_var_is_table_only`), NOT a runtime variable: excluded from the runlist and the saved output (produces no series), its data reached only via `LOOKUP(table, x)` call sites. A table reference is kept off the data-flow dependency graph by a dedicated `builtins::BuiltinContents::LookupTable` walk variant; `referenced_tables` on `VariableDeps`/`ImplicitVarDeps` re-supplies the fragment compiler's metadata + tables map. A bare reference (no argument) is a compile error (`ErrorCode::LookupReferencedWithoutArgument`, emitted in `db_var_fragment::lower_var_fragment`). The MDL importer emits the canonical empty-equation form; the `"0+0"` `LOOKUP_SENTINEL` is still ACCEPTED on read (now produced only for empty-RHS vars). The shared lookup-only predicate moved to `src/variable.rs`. Retired the `gf(Time)` lowering (`lookup_only_index_expr`, `LookupOnlyLayout`). C-LEARN `EXPECTED_VDF_RESIDUAL` shrank 13->4. SUPERSEDES the #590 `gf(Time)` primitive (1) below.) Earlier (2026-05-20): C-LEARN residual closure (#590/#591), as five general Vensim import/simulation primitives, not model-specific patches: (1) a standalone lookup-only variable -- a graphical-function holder with no functional input -- lowers uniformly to `gf(Time)` across scalar/A2A/arrayed shapes (`src/compiler/mod.rs`: `var_is_lookup_only`/`is_lookup_only`/`lookup_only_index_expr` + `LookupOnlyLayout { PerElement, Shared }`); (2) a genuine passthrough macro `:MACRO: INIT(x) = INITIAL(x)` collapses to the builtin opcode at the call site (`module_functions.rs`: `classify_passthrough` + a `passthrough: Option` field on `ModuleFunctionDescriptor`, classified at `MacroRegistry::build`), instead of expanding a buggy per-element synthetic module; (3) the import-time XMILE formatter no longer linearizes a shadowed `RAMP FROM TO` macro -- `xmile_compat.rs::format_call_ctx` carries a macro-shadowing audit and the `ramp from to` restructuring arm was removed so the call survives as `RAMP_FROM_TO(...)` and resolves through the macro path; (4) the simulation **initials runlist is now deterministic** (`db_dep_graph.rs` sorts the init set before `topo_sort_str`, GH #595) and the dt stock-submodel-output chain-break applies to ALL readers (parity with the legacy `model.rs::module_output_deps` gate); (5) the VDF reader re-binds a **standalone** graphical-function descriptor to its forward-link output OT (`record_results.rs::standalone_descriptor_rebinds`). New `#[cfg(test)] lookup_only_tests.rs`; the C-LEARN `EXPECTED_VDF_RESIDUAL` carve-out (`tests/simulate.rs`) is now an exact, taxonomy-attributed remainder pinned by `clearn_residual_exactness`. simlin-cli now resolves `GET DIRECT *` external data via a `FilesystemDataProvider` -- see its CLAUDE.md.) Earlier: Element-level cycle resolution + genuine-Vensim VECTOR ELM MAP/SORT ORDER + `:NA:` sentinel, the work that made C-LEARN compile via the incremental path, run to FINAL TIME, and match genuine Vensim (`Ref.vdf`) within the 1% cross-simulator tolerance on the matched floor. The whole-variable `model_dependency_graph` cycle gate now refines a recurrence SCC to an element-acyclic verdict over a cross-member-comparable symbolic `SymVarRef` element graph (`db_dep_graph.rs`: `resolve_recurrence_sccs`/`refine_scc_to_element_verdict`/`symbolic_phase_element_order`, GH #575); a resolved SCC's per-element symbolic segments are interleaved into one combined fragment along the SCC's `element_order` and injected at `assemble_module` (`db.rs`: `combine_scc_fragment`/`var_phase_symbolic_fragment_prod`). Per-variable lowering moved to a new sibling module `db_var_fragment.rs` (`lower_var_fragment`). `crate::float::NA` is the finite Vensim `:NA:` sentinel (`-2^109`, NOT IEEE NaN); both `:NA:` paths route to it. New top-level VM-adjacent modules `vm_vector_sort_order.rs` (arrayed VECTOR SORT ORDER, per-iterated-slice 0-based ranks, #585) and `vm_vector_elm_map.rs` (base+full-source, OOB→NaN, no modulo). New `Opcode::LookupArray` (per-element arrayed-GF apply → array view, #580); `src/compiler/symbolic.rs` gains cross-fragment GF de-duplication (`GfDedup`, #582) and `TempStrategy { Recycle, Sum }`. Per-element graphical-function tables now lay out by element-name → declared dimension index (`variable.rs::reorder_arrayed_element_tables`, `db.rs::extract_tables_from_source_var`), not `Equation::Arrayed` Vec position. New tests: `db_dep_graph_tests.rs`, `db_combined_fragment_tests.rs`, `per_element_gf_tests.rs`; `tests/simulate.rs` `simulates_clearn` is un-stubbed (`#[ignore]` for runtime only) with a hardened `ensure_vdf_results` comparator + `EXPECTED_VDF_RESIDUAL` carve-out.) Earlier: Vensim macro support, Phases 1-7 complete. `:MACRO:`/`` definitions import as macro-marked `datamodel::Model`s (`Model.macro_spec: Option`, persisted through protobuf/JSON/schema); single-output macros inline through `BuiltinVisitor`, multi-output (`:`-list) ones materialize at import. New top-level modules: `module_functions.rs` (the unified `ModuleFunctionDescriptor`/`MacroRegistry` resolver+validator for stdlib functions *and* macros, the shared `is_renamed_*` collision predicates) and `db_macro_registry.rs` (the `project_macro_registry` salsa query + sync-time `macro_registry_build_error`); `SourceProject` gains `macro_registry_build_error`, `SourceModel` gains `macro_spec`; `ErrorCode::DuplicateMacroName`; new `tests/metasd_macros.rs` (gated on `file_io`). LTM arrays hardening, Phases 1-8 complete. Phase 7: #502 per-element graphical-function static link polarity -- when an arrayed source feeds an arrayed graphical-function target, `lookup_table_polarity` folds the per-element `tables` list on `Variable::Var` into one link polarity, falling back to `Unknown` for the multi-dim case; #492 the GF strict-monotonicity check uses a y-range-relative epsilon (`max(EPSILON, range_rel * (y_max - y_min))`) so numeric-import noise no longer flips a monotone lookup table to `Unknown`. Phase 6: #483 analytic STDDEV ceteris-paribus partial -- `generate_nonlinear_partial` builds the unrolled population-variance `sqrt` formula for STDDEV (divisor `N`, matching `vm.rs::Opcode::ArrayStddev`) instead of the delta-ratio stand-in; RANK keeps the delta-ratio (an order statistic, unreachable via real models since it returns an array) with a documented justification, pinned by `test_generate_rank_keeps_delta_ratio`. Phase 5: #515 budgeted cross-element-through-aggregate loop recovery -- `recover_cross_agg_loops` drops the old `MAX_AGG_PETALS = 8` hard drop for a deterministic petal priority + a threaded `agg_loop_budget` loop-count budget (`MAX_CROSS_AGG_LOOPS = 256`, `#[cfg(test)]`-overridable via `AggLoopBudgetGuard`; `MAX_AGG_PETALS` survives as a soft per-agg petal cap), surfaces truncation on `LtmVariablesResult.agg_recovery_truncated` + a `Warning`, and enumerates each disjoint petal subset's distinct *cyclic orderings* (`cyclic_orderings(m)` -- (m-1)!/2 for m≥3, mirror reversals skipped, via Heap's algorithm) instead of one ordering per subset. Phase 4 (2026-05-12): #514 sliced-reducer hoisting -- `AggNode.read_slice`, read-slice-driven element graph / link scores, dynamic-index carve-out reclassified as `DynamicIndex`; arrayed synthetic aggs route through agg-half link-score emitters with subscripted agg names and a subscripted Δsource denominator in the diagonal case (strict-prefix broadcast over-subscribes -- GH #528); mapped-dimension sliced reducers stay conservative; a scalar feeder of a hoisted reducer emits a bare element-graph node. Phases 1-3: the `model_ltm_reference_sites` classification IR (`db_ltm_ir.rs`), the consolidated `reducer_kind`/`ReducerKind` table in `ltm_agg.rs`, element-level A2A `Loop::stocks` + per-slot `loop_partitions` (#487), iterated-dimension subscripts ⇒ `Bare` (#511), disjoint-dim arrayed→arrayed per-source-element link scores + the unscoreable-edge `Warning` (#510).)** **Maintenance note**: Keep this file up to date when adding, removing, or reorganizing modules. @@ -15,7 +15,7 @@ Equation text flows through these stages in order: 3. **`src/ast/`** - AST type system with progressive lowering: `Expr0` (parsed) -> `Expr1` (modules expanded) -> `Expr2` (dimensions resolved) -> `Expr3` (subscripts expanded). `array_view.rs` tracks array dimensions and sparsity. `Expr2Context` trait includes `has_mapping_to()` for cross-dimension mapping lookups during `find_matching_dimension`. 4. **`src/builtins.rs`** - Builtin function definitions (e.g. `MIN`, `PULSE`, `LOOKUP`, `QUANTUM`, `SSHAPE`, `VECTOR SELECT`, `VECTOR ELM MAP`, `VECTOR SORT ORDER`, `VECTOR RANK`, `ALLOCATE AVAILABLE`, `ALLOCATE BY PRIORITY`, `NPV`, `MODULO`, `PREVIOUS`, `INIT`). `is_stdlib_module_function()` is the authoritative predicate for deciding whether a function name expands to a stdlib module; the **module-function** generalization that also resolves project macros lives in **`src/module_functions.rs`** -- the pure (Functional-Core) `ModuleFunctionDescriptor`/`MacroRegistry` resolver+validator unifying stdlib functions and macros (`stdlib_descriptor`, `MacroRegistry::build`/`resolve_macro`, the shared `is_renamed_opcode_intrinsic`/`is_renamed_stdlib_module_builtin`/`is_renamed_builtin_macro_collision` precedence predicates). A **genuine passthrough macro** -- a single-parameter, single-output macro whose body is exactly `out = BUILTIN(param)` self-calling its own renamed-builtin-collision name (`:MACRO: INIT(x) = INITIAL(x)`, stored after the importer's `INITIAL`->`INIT` rename as `init = init(x)`) -- is classified once at `MacroRegistry::build` time (`classify_passthrough`, the only place the body AST is available) into a `passthrough: Option` on the descriptor; `builtins_visitor.rs` reads it to *collapse the call to the builtin opcode* (`init`->`LoadInitial`) instead of expanding the buggy per-element synthetic module (#591), falling through to the same renamed-builtin intrinsic routing the #554 self-call exception takes. Strict criteria (no additional outputs, bare-parameter argument, self-call, renamed-builtin-collision name) keep it from misfiring on a near-miss like `INIT = INIT(x) + 1`. `equation_is_module_call()` (pre-scan, renamed from `equation_is_stdlib_call`) and `contains_module_call()` (walk-time, renamed from `contains_stdlib_call`) both consult the `MacroRegistry` (a passthrough caller is still classified module-backed here -- benign, since it collapses to a flat-slot variable). `builtins_visitor.rs` handles implicit module instantiation, single-output macro inlining (an arrayed macro invocation enters the per-element path), and PREVIOUS/INIT helper rewriting: unary `PREVIOUS(x)` desugars to `PREVIOUS(x, 0)`, direct scalar args compile to `LoadPrev`, and module-backed or expression args are first rewritten through synthesized scalar helper auxes. `INIT(x)` compiles to `LoadInitial`, using the same helper rewrite when needed. Tracks `module_idents` so `PREVIOUS(module_var)` never reads a multi-slot module directly. 5. **`src/compiler/`** - Multi-pass compilation to bytecode: - - `mod.rs` - Orchestration; includes A2A hoisting logic that detects array-producing builtins (VectorElmMap, VectorSortOrder, Rank, AllocateAvailable, AllocateByPriority) during array expansion, hoists them into `AssignTemp` pre-computations, and emits per-element `TempArrayElement` reads. Also lowers a **standalone lookup-only variable** -- a graphical-function holder whose equation is empty (XMILE) or the MDL lookup sentinel (`var_is_lookup_only`/`is_lookup_only`, mirroring `mdl::writer::is_lookup_only_equation`'s "empty or sentinel" rule) -- uniformly to `gf(Time)` across the scalar, A2A, and arrayed shapes (`lookup_only_index_expr`; the index is the current simulation time, #590). WITH LOOKUP (`var = WITH LOOKUP(input, table)`: tables present *and* a real input) keeps its `LOOKUP(self, input)` lowering and is NOT lookup-only. An arrayed lookup-only variable's `LookupOnlyLayout` (`PerElement` when `tables.len() == n` -- each element wraps `off + i`; `Shared` when `tables.len() == 1` -- every element wraps the base `off`) is derived from the table count `variable.rs::build_tables` produced, so the LOOKUP offset matches the layout and `element_offset` stays in range + - `mod.rs` - Orchestration; includes A2A hoisting logic that detects array-producing builtins (VectorElmMap, VectorSortOrder, Rank, AllocateAvailable, AllocateByPriority) during array expansion, hoists them into `AssignTemp` pre-computations, and emits per-element `TempArrayElement` reads. Treats a **standalone lookup-only variable** -- a graphical-function holder whose equation is empty or the legacy MDL `"0+0"` sentinel (`crate::variable::var_is_lookup_only`/`is_empty_or_sentinel`, mirroring `mdl::writer::is_lookup_only_equation`'s "empty or sentinel" rule) -- as a non-value-bearing **static table** (`Variable::Var::is_table_only`): it is excluded from every runlist and from the saved output, so it produces NO series of its own (issue #606). Its data is reached only through `LOOKUP(table, x)` call sites (resolved by ident -> `base_gf`); a *bare* reference with no argument is a compile error (`ErrorCode::LookupReferencedWithoutArgument`). WITH LOOKUP (`var = WITH LOOKUP(input, table)`: tables present *and* a real input) is NOT lookup-only -- it is a value-bearing variable that lowers to `LOOKUP(self, input)` - `context.rs` - Symbol tables and variable metadata; `lower_preserving_dimensions()` skips Pass 1 dimension resolution to keep full array views for array-producing builtins. Handles `@N` position syntax resolution: in scalar context (no active A2A dimension, not inside an array-reducing builtin), `DimPosition(@N)` resolves to a concrete element offset; inside array-reducing builtins (`preserve_wildcards_for_iteration`), dimension views are preserved for iteration. Two wildcard-preservation contexts: `with_preserved_wildcards()` for reducers (SUM, MEAN, etc.) where `ActiveDimRef` resolves to a concrete offset, and `with_vector_builtin_wildcards()` for array-producing builtins (VectorSortOrder, VectorElmMap, etc.) where `ActiveDimRef` is promoted to `Wildcard` to preserve the full array view - `expr.rs` - Expression compilation - `codegen.rs` - Bytecode emission; routes array-producing builtins through dedicated opcodes instead of element-wise iteration. `emit_array_reduce()` is the shared helper for single-argument array builtins (SUM, SIZE, STDDEV, MIN, MAX, MEAN): pushes view, emits reduction opcode, pops view @@ -77,7 +77,7 @@ The primary compilation path uses salsa tracked functions for fine-grained incre - `writer.rs` - MDL output: variable equations (with original casing, native LOOKUP syntax, backslash continuations), `:MACRO:`/`:END OF MACRO:` blocks for macro-marked models (reconstructing multi-output invocations), sketch sections (view splitting on Group boundaries, element ordering) - `xmile_compat.rs` - Expression formatting for XMILE output. Carries a documented **macro-shadowing audit** (`format_call_ctx`): arms that *restructure* a builtin-named call into a different expression at import time (`MODULO(x,y)`->`(x) MOD (y)`, `SAMPLE IF TRUE`, etc.) run BEFORE compile-time macro resolution, so they silently pre-empt a same-named `:MACRO:` -- the opposite of Vensim semantics, where a project macro shadows the like-named builtin. The `ramp from to` arm (the one C-LEARN tripped) was removed so `RAMP FROM TO(...)` survives import as `RAMP_FROM_TO(...)` and resolves through the macro path; the other restructuring arms carry the same latent hazard but are not currently shadowed by any in-repo model. Name-preserving renames (e.g. `SSHAPE`, a genuine builtin not restructured here) are safe and shadow cleanly - `settings.rs` - Integration settings parser -- **`src/vdf.rs`** - Vensim VDF (binary data file) parser. Parses all structural elements (sections, records, name/slot/offset tables, data blocks) and extracts results via `to_results_via_records()`: each section-1 record's `f[2]` is a direct key into the section-2 name table and its `f[11]` is the variable's OT-block start; graphical-function descriptor records are peeled off so the remaining owner spans form a clean OT partition. Submodules: `record_results.rs` (the record-to-OT mapping pipeline; besides peeling *overlapping* descriptors, `standalone_descriptor_rebinds` re-binds a **standalone** lookup-only descriptor -- a graphical-function variable Vensim saves *only* as a descriptor, with no separate consumer-owner record, so it would otherwise decode at its spurious `f[11]`-as-OT-start stock ghost slot -- to its forward-link evaluated-output OT `lookup_record[f[11]].word[10]`, gated on the ghost slot carrying the stock class code and the forward link being a valid owner data OT; scalar only, arrayed standalone lookups are deferred), `section3.rs` (array-shape directory), `section5_dims.rs` (dimension/element recovery), `signatures.rs` (the two `#`-signature stdlib-call encodings). See `docs/design/vdf.md` for the format specification. +- **`src/vdf.rs`** - Vensim VDF (binary data file) parser. Parses all structural elements (sections, records, name/slot/offset tables, data blocks) and extracts results via `to_results_via_records()`: each section-1 record's `f[2]` is a direct key into the section-2 name table and its `f[11]` is the variable's OT-block start; graphical-function descriptor records are peeled off so the remaining owner spans form a clean OT partition. Submodules: `record_results.rs` (the record-to-OT mapping pipeline; besides peeling *overlapping* descriptors, `standalone_lookup_only_descriptors` DROPS a **standalone** lookup-only descriptor -- a graphical-function variable Vensim saves *only* as a descriptor, with no separate series -- because a bare lookup is a table, not a time series (the engine likewise produces no series for one, #606); the few descriptors the model-free reader cannot safely distinguish from a real owner remain as harmless ghost columns the comparator skips), `section3.rs` (array-shape directory), `section5_dims.rs` (dimension/element recovery), `signatures.rs` (the two `#`-signature stdlib-call encodings). See `docs/design/vdf.md` for the format specification. - **`src/systems/`** - Systems format (`.txt`) parser, translator, and writer. A line-oriented notation for stock-and-flow models with implicit flow typing (Rate, Conversion, Leak). Pipeline: `lexer.rs` -> `parser.rs` -> `ast.rs` (IR) -> `translate.rs` (to `datamodel::Project`). Each systems flow becomes a stdlib module instance (`systems_rate`, `systems_leak`, or `systems_conversion`), with chained `available`/`remaining` wiring for multi-outflow stocks. `writer.rs` reconstructs `.txt` format from a translated `datamodel::Project` by inspecting module structure. Public API: `systems::parse()`, `systems::translate::translate()`, `systems::project_to_systems()` - **`src/json.rs`** - JSON serialization matching Go `sd` package schema - **`src/json_sdai.rs`** - JSON schema for AI metadata augmentation. `generate_relationships()` derives `Relationship` entries from pre-computed equation dependency polarities (via `compute_link_polarities`), filtering out stock-flow structural edges and sorting deterministically @@ -128,7 +128,7 @@ The primary compilation path uses salsa tracked functions for fine-grained incre - **`src/test_common.rs`**, **`src/testutils.rs`** - Helpers and fixtures (e.g. `x_model`, `x_stock`, `x_flow`). `TestProject` is the primary builder for test models: chainable methods like `.aux()`, `.stock()`, `.flow()`, `.array_aux()`, `.array_stock()`, `.array_flow()`, `.named_dimension()`, and `.build_datamodel()` / `.build_sim_with_ltm()`. `assert_compiles_incremental()` verifies the model compiles without errors via the salsa path. - **`src/array_tests.rs`** - Array-specific tests (feature-gated) - **`src/per_element_gf_tests.rs`** - Per-element graphical-function (arrayed GF) layout invariant: a per-element GF table must land at the element's *declared dimension index* (row-major), not its `Equation::Arrayed` `elems` Vec position. Uses deliberately NON-alphabetical declared orders (where position != dim-index) so a positional mis-map is observable; covers the scalar `Lookup` path, the `LookupArray` (VECTOR SELECT) path, the MDL-importer-sort twin, sparse + multi-dim, the salsa dependency-table path (`extract_tables_from_source_var`), and a compile-time/perf structural guard that the reorder is materialized at compile time and the VM opcode carries no element name. The LTM-polarity twin lives in the `ltm` tests module (`test_per_element_gf_link_polarity_fixed_index_non_sorted_order`). -- **`src/lookup_only_tests.rs`** - End-to-end tests for the standalone-lookup-only `gf(Time)` lowering (#590): scalar, A2A, and arrayed (both `LookupOnlyLayout::Shared` and `PerElement`) holders evaluate their table at the current simulation time, while WITH LOOKUP keeps feeding its real input. The pure `is_lookup_only`/`var_is_lookup_only` predicate units live inline in `src/compiler/mod.rs`. +- **`src/lookup_only_tests.rs`** - End-to-end tests for the standalone-lookup-only **static table** contract (#606): a lookup-only holder (scalar, A2A, arrayed) produces NO saved series; a consumer that calls it (`LOOKUP(g, x)`) reads the table, two calls at different arguments are independent, a bare reference is a compile error, and the legacy `"0+0"` form is still accepted. The pure `is_lookup_only`/`var_is_lookup_only` predicate units live in `src/variable.rs`. - **`src/json_proptest.rs`**, **`src/json_sdai_proptest.rs`** - Property-based tests - **`src/unit_checking_test.rs`** - Unit checking regression tests - **`src/test_sir_xmile.rs`** - SIR epidemiology model integration tests diff --git a/src/simlin-engine/src/ast/expr2.rs b/src/simlin-engine/src/ast/expr2.rs index 8e992c6a9..aa168ad63 100644 --- a/src/simlin-engine/src/ast/expr2.rs +++ b/src/simlin-engine/src/ast/expr2.rs @@ -998,7 +998,8 @@ impl Expr2 { loc = Some(id_loc); } } - BuiltinContents::Expr(expr) => { + // The lookup table identity is a locatable reference too. + BuiltinContents::Expr(expr) | BuiltinContents::LookupTable(expr) => { if loc.is_none() { loc = expr.get_var_loc(ident); } diff --git a/src/simlin-engine/src/ast/mod.rs b/src/simlin-engine/src/ast/mod.rs index 23275204c..52481e394 100644 --- a/src/simlin-engine/src/ast/mod.rs +++ b/src/simlin-engine/src/ast/mod.rs @@ -671,7 +671,10 @@ impl LatexVisitor { walk_builtin_expr(builtin, |contents| { let arg = match contents { BuiltinContents::Ident(id, _loc) => format!("\\mathrm{{{id}}}"), - BuiltinContents::Expr(expr) => self.walk(expr), + // The lookup table identity is a printed argument too. + BuiltinContents::Expr(expr) | BuiltinContents::LookupTable(expr) => { + self.walk(expr) + } }; args.push(arg); }); diff --git a/src/simlin-engine/src/builtins.rs b/src/simlin-engine/src/builtins.rs index 4b5e4e4bc..3d3dc9d38 100644 --- a/src/simlin-engine/src/builtins.rs +++ b/src/simlin-engine/src/builtins.rs @@ -453,6 +453,15 @@ pub fn is_builtin_fn(name: &str) -> bool { pub(crate) enum BuiltinContents<'a, Expr> { Ident(&'a str, Loc), Expr(&'a Expr), + /// The table-identity argument of a graphical-function lookup + /// (`LOOKUP`/`LOOKUP FORWARD`/`LOOKUP BACKWARD`) -- a reference to a + /// gf-holding variable (a standalone lookup-only table, or a value-bearing + /// WITH LOOKUP variable's own table), NOT a runtime value input. The static + /// table data is laid out at compile time independent of the runlist, so + /// dependency/causal walkers must treat this as a non-edge (a table + /// reference imposes no runtime data dependency); printing and + /// source-location walkers treat it like any other argument expression. + LookupTable(&'a Expr), } pub(crate) fn walk_builtin_expr<'a, Expr, F>(builtin: &'a BuiltinFn, mut cb: F) @@ -470,7 +479,7 @@ where BuiltinFn::Lookup(table_expr, index_expr, _loc) | BuiltinFn::LookupForward(table_expr, index_expr, _loc) | BuiltinFn::LookupBackward(table_expr, index_expr, _loc) => { - cb(BuiltinContents::Expr(table_expr)); + cb(BuiltinContents::LookupTable(table_expr)); cb(BuiltinContents::Expr(index_expr)); } BuiltinFn::Abs(a) diff --git a/src/simlin-engine/src/common.rs b/src/simlin-engine/src/common.rs index 3f557032b..06f7f911c 100644 --- a/src/simlin-engine/src/common.rs +++ b/src/simlin-engine/src/common.rs @@ -108,6 +108,11 @@ pub enum ErrorCode { // appended freely. Keep additions at the END of the enum anyway, to keep // the existing discriminants stable for any in-memory consumers. DuplicateMacroName, + /// A standalone lookup-only table (a graphical function with no driving + /// input) was referenced bare -- without applying it to an argument. A + /// table has no scalar value of its own; it must be called, e.g. + /// `LOOKUP(my_table, x)` or `my_table(x)` (issue #606). + LookupReferencedWithoutArgument, } impl fmt::Display for ErrorCode { @@ -168,6 +173,7 @@ impl fmt::Display for ErrorCode { BadOverride => "bad_override", UnsupportedForSerialization => "unsupported_for_serialization", DuplicateMacroName => "duplicate_macro_name", + LookupReferencedWithoutArgument => "lookup_referenced_without_argument", }; write!(f, "{name}") diff --git a/src/simlin-engine/src/compiler/mod.rs b/src/simlin-engine/src/compiler/mod.rs index c5c9f25bf..748ed8f29 100644 --- a/src/simlin-engine/src/compiler/mod.rs +++ b/src/simlin-engine/src/compiler/mod.rs @@ -720,10 +720,8 @@ impl Var { exprs.push(Expr::AssignCurr(off, Box::new(main_expr))); exprs } - // Stocks never carry graphical functions, so they - // are never lookup-only (`None`). Ast::ApplyToAll(dims, ast) => { - expand_a2a_with_hoisting(ctx, dims, ast, off, false)? + expand_a2a_with_hoisting(ctx, dims, ast, off)? } Ast::Arrayed( dims, @@ -737,7 +735,6 @@ impl Var { default_ast.as_ref(), *apply_default_for_missing, off, - None, )?, } } else { @@ -770,7 +767,24 @@ impl Var { } } } - Variable::Var { tables, eqn, .. } => { + Variable::Var { + tables, + is_table_only, + .. + } => { + // A standalone lookup-only table is a static table consulted + // by callers via `LOOKUP(self, x)`, not a value-bearing + // variable: it is excluded from every runlist and produces no + // expression of its own. Return an empty fragment -- this + // keeps the diagnostic pass (which compiles every variable) + // from flagging its empty equation, and the fragment is never + // assembled into a runlist anyway (issue #606). + if *is_table_only { + return Ok(Var { + ident: Ident::new(var.ident()), + ast: vec![], + }); + } let off = ctx.get_base_offset(&Ident::new(var.ident()))?; let ast = if ctx.is_initial { var.init_ast() @@ -780,33 +794,23 @@ impl Var { if ast.is_none() { return sim_err!(EmptyEquation, var.ident().to_string()); } - // A standalone lookup-only variable (a graphical-function - // holder with no functional input) is lowered uniformly to - // `gf(Time)` across all three shapes below. WITH LOOKUP - // (tables + a real input) keeps its `LOOKUP(self, input)` - // lowering; ordinary auxes are untouched. - let lookup_only = var_is_lookup_only(eqn.as_ref(), !tables.is_empty()); match ast.as_ref().unwrap() { Ast::Scalar(ast) => { let mut exprs = ctx.lower(ast)?; let main_expr = exprs.pop().unwrap(); let main_expr = hoist_nested_array_builtins_in_scalar(main_expr, &mut exprs); + // WITH LOOKUP (`var = WITH LOOKUP(input, table)`): a + // tables-bearing variable WITH a real input equation + // lowers to `LOOKUP(self, input)`. (A standalone + // lookup-only table has no input and is handled above; + // ordinary auxes have no tables.) let main_expr = if !tables.is_empty() { let loc = main_expr.get_loc(); - // For lookup-only, evaluate the table at the - // determined index (Time) instead of the - // lowered (constant-0) sentinel equation; WITH - // LOOKUP feeds the user's input expression. - let index = if lookup_only { - lookup_only_index_expr(loc) - } else { - main_expr - }; Expr::App( BuiltinFn::Lookup( Box::new(Expr::Var(off, loc)), - Box::new(index), + Box::new(main_expr), loc, ), loc, @@ -818,27 +822,9 @@ impl Var { exprs } Ast::ApplyToAll(dims, ast) => { - expand_a2a_with_hoisting(ctx, dims, ast, off, lookup_only)? + expand_a2a_with_hoisting(ctx, dims, ast, off)? } Ast::Arrayed(dims, elements, default_ast, apply_default_for_missing) => { - // A lookup-only arrayed variable's table layout is - // decided by `variable.rs::build_tables`: per-element - // tables (`tables.len() == n`) ONLY when at least one - // element carries its own gf, otherwise a single - // shared variable-level table (`tables.len() == 1`, - // the same shape A2A always has). The expansion needs - // that distinction to wrap the right LOOKUP offset - // (element-own `off + i` vs shared base `off`), so - // derive it from the actual table count here. - let lookup_only_layout = lookup_only.then(|| { - let element_count: usize = - dims.iter().map(Dimension::len).product(); - if tables.len() == element_count { - LookupOnlyLayout::PerElement - } else { - LookupOnlyLayout::Shared - } - }); expand_arrayed_with_hoisting( ctx, dims, @@ -846,7 +832,6 @@ impl Var { default_ast.as_ref(), *apply_default_for_missing, off, - lookup_only_layout, )? } } @@ -1120,122 +1105,6 @@ fn contains_array_producing_builtin(expr: &Expr) -> bool { } } -/// Pure predicate: a "lookup-only" variable is a standalone graphical-function -/// holder -- it carries a graphical function (`has_tables`) and its equation is -/// empty or the MDL lookup sentinel (i.e. there is no functional input -/// expression to feed the table). -/// -/// Mirrors `mdl::writer::is_lookup_only_equation`'s "empty or sentinel" rule so -/// that BOTH an XMILE-sourced lookup-only variable (empty/absent equation + -/// `gf`) and an MDL-sourced one (`equation == LOOKUP_SENTINEL` + `gf`) are -/// detected -- the `LOOKUP_SENTINEL` string is MDL-format-internal and is NOT a -/// stable datamodel contract, so the empty-equation arm is load-bearing, not -/// redundant. -/// -/// Returns `false` for WITH LOOKUP (`var = WITH LOOKUP(input, table)`: tables -/// present but a *real* input equation -- it must keep its `LOOKUP(self, -/// input)` lowering) and for an ordinary aux (a real equation, no tables). The -/// `has_tables` guard also keeps an empty-RHS aux (no gf) off this path. -fn is_lookup_only(equation: &str, has_tables: bool) -> bool { - has_tables && is_empty_or_sentinel(equation) -} - -/// An equation string with no functional content: empty/whitespace-only (the -/// XMILE lookup-only form) or exactly the MDL lookup sentinel (the MDL form). -/// The trimmed comparison mirrors `mdl::writer::is_lookup_only_equation`. -fn is_empty_or_sentinel(equation: &str) -> bool { - let trimmed = equation.trim(); - trimmed.is_empty() || trimmed == crate::mdl::LOOKUP_SENTINEL -} - -/// Whether a whole variable is lookup-only, given its source `datamodel::Equation` -/// and whether it carries graphical functions (`has_tables`). A scalar/A2A -/// variable is lookup-only when its single equation is empty/sentinel; an -/// arrayed (`Equation::Arrayed`) variable is lookup-only when EVERY element -/// equation (and the EXCEPT default, if any) is empty/sentinel -- i.e. it is a -/// pure per-element table holder. Returns `false` when the variable has no -/// equation or no tables. -fn var_is_lookup_only(eqn: Option<&crate::datamodel::Equation>, has_tables: bool) -> bool { - use crate::datamodel::Equation; - if !has_tables { - return false; - } - match eqn { - // Scalar / A2A: one equation string -- the pure `is_lookup_only` atom. - Some(Equation::Scalar(s)) | Some(Equation::ApplyToAll(_, s)) => { - is_lookup_only(s, has_tables) - } - // Arrayed: a pure per-element table holder iff EVERY element equation - // (and the EXCEPT default, if any) is empty/sentinel. `has_tables` is - // already true here, so the per-element check is just the string - // predicate `is_empty_or_sentinel` that the atom is built on. - Some(Equation::Arrayed(_, elements, default, _)) => { - !elements.is_empty() - && elements.iter().all(|(_, e, _, _)| is_empty_or_sentinel(e)) - && default.as_deref().map(is_empty_or_sentinel).unwrap_or(true) - } - None => false, - } -} - -/// The index expression a standalone lookup-only variable's table is evaluated -/// at. Determined empirically (Phase 1, #590) to be the current simulation time -/// -- `gf(Time)` -- matching genuine Vensim for the year-indexed historical -/// lookup tables that motivate this, and consistent with the importer's -/// `MdlEquation::Implicit` arm (`mdl/convert/variables.rs`), which already -/// lowers an unspecified-input gf to `equation = "TIME"`. Centralized here so -/// the scalar, arrayed, and A2A lowering sites stay in lock-step. -fn lookup_only_index_expr(loc: Loc) -> Expr { - Expr::App(BuiltinFn::Time, loc) -} - -#[cfg(test)] -mod is_lookup_only_tests { - use super::*; - - #[test] - fn is_lookup_only_sentinel_equation_with_tables_is_true() { - // MDL-sourced lookup-only: `g()` imports as equation = - // LOOKUP_SENTINEL ("0+0") + a graphical function. - assert!(is_lookup_only(crate::mdl::LOOKUP_SENTINEL, true)); - } - - #[test] - fn is_lookup_only_empty_equation_with_tables_is_true() { - // XMILE-sourced lookup-only: a variable with a `` but no equation - // carries an empty/absent equation rather than the MDL sentinel. - assert!(is_lookup_only("", true)); - // Whitespace-only is equivalent to empty (matches the `.trim()` in - // `mdl::writer::is_lookup_only_equation`). - assert!(is_lookup_only(" ", true)); - } - - #[test] - fn is_lookup_only_with_lookup_real_input_is_false() { - // WITH LOOKUP (`g = WITH LOOKUP(some_input,
)`): tables present - // but a real input expression. Must NOT be treated as lookup-only -- - // it keeps `LOOKUP(self, input)` so the index is the user's input - // (this is the AC1.4 distinction). - assert!(!is_lookup_only("some_input", true)); - } - - #[test] - fn is_lookup_only_ordinary_aux_no_tables_is_false() { - // An ordinary aux: a real equation and no graphical function. - assert!(!is_lookup_only("3 * x + 1", false)); - } - - #[test] - fn is_lookup_only_sentinel_equation_without_tables_is_false() { - // An empty-RHS aux (MdlEquation::EmptyRhs): the sentinel equation but - // NO graphical function. `has_tables == false` keeps it off the - // lookup-only path even though its equation text matches the sentinel. - assert!(!is_lookup_only(crate::mdl::LOOKUP_SENTINEL, false)); - // The empty-equation, no-tables case is likewise not lookup-only. - assert!(!is_lookup_only("", false)); - } -} - /// Test-only wrapper exposing the production recursive /// array-producing-builtin predicate (`contains_array_producing_builtin`, /// which delegates to the private `is_array_producing_builtin`): true iff @@ -1748,24 +1617,6 @@ fn array_view_from_dims(dims: &[Dimension]) -> ArrayView { ArrayView::contiguous_with_names(dim_sizes, dim_names) } -/// The graphical-function table layout of a lookup-only arrayed variable, which -/// determines the LOOKUP offset its expansion must wrap. The two variants mirror -/// the two layouts `variable.rs::build_tables` can produce for an arrayed -/// variable, so that an invalid "lookup-only but neither layout" state is -/// unrepresentable (the caller derives the variant from `tables.len()`). -#[derive(Debug, Clone, Copy, PartialEq, Eq)] -enum LookupOnlyLayout { - /// One table per element (`tables.len() == n`), laid out by row-major - /// declared dimension index. Produced ONLY when at least one element carries - /// its OWN gf. Element `i` reads its own table, so wrap `off + i`. - PerElement, - /// A single variable-level table shared by every element (`tables.len() == - /// 1`). Produced when the arrayed variable has empty/sentinel element - /// equations and one variable-level gf -- the same shape A2A always has. Every - /// element reads that one table, so wrap the BASE offset `off`. - Shared, -} - /// Handle the Arrayed expansion, detecting array-producing builtins in /// per-element expressions and hoisting them into AssignTemp pre-computations. /// @@ -1774,10 +1625,6 @@ enum LookupOnlyLayout { /// be evaluated once for the whole array and stored in temp. Each element then /// reads its result via TempArrayElement. /// -/// `lookup_only` is `Some(layout)` iff the variable is a standalone -/// graphical-function holder (every element equation empty/sentinel); the layout -/// selects the LOOKUP offset (see [`LookupOnlyLayout`]). It is `None` for an -/// ordinary arrayed variable. fn expand_arrayed_with_hoisting( ctx: &Context, dims: &[Dimension], @@ -1785,53 +1632,9 @@ fn expand_arrayed_with_hoisting( default_ast: Option<&crate::ast::Expr2>, apply_default_for_missing: bool, off: usize, - lookup_only: Option, ) -> Result> { let active_dims = Arc::<[Dimension]>::from(dims.to_vec()); - // A lookup-only arrayed variable is a pure table holder: every element - // evaluates a table at the determined index (Time). WHICH table it reads - // depends on the layout `variable.rs::build_tables` chose, so the LOOKUP - // offset must match it: - // - // * `LookupOnlyLayout::PerElement` (`tables.len() == n`): element `i` reads - // its own table, so wrap `off + i`. `codegen::extract_table_info` derives - // `elem_off = off+i - base_off = i` and `table_count = n`, so `i` is always - // in range. - // * `LookupOnlyLayout::Shared` (`tables.len() == 1`): every element reads the - // one shared variable-level table, so wrap the BASE offset `off`, exactly - // like `expand_a2a_with_hoisting`: `elem_off = off - base_off = 0` and - // `table_count = 1`, so element_offset 0 is in range for all elements. - // Wrapping `off + i` here would make `element_offset == i >= 1 == - // table_count` for i > 0, and the VM's `Lookup` opcode would push NaN. - // - // (A lookup-only equation is a constant sentinel, never an array-producing - // builtin, so this never needs the hoisting path below.) - if let Some(layout) = lookup_only { - let exprs: Vec = SubscriptIterator::new(dims) - .enumerate() - .map(|(i, _)| { - let loc = Loc::default(); - let table_off = match layout { - LookupOnlyLayout::PerElement => off + i, - LookupOnlyLayout::Shared => off, - }; - Expr::AssignCurr( - off + i, - Box::new(Expr::App( - BuiltinFn::Lookup( - Box::new(Expr::Var(table_off, loc)), - Box::new(lookup_only_index_expr(loc)), - loc, - ), - loc, - )), - ) - }) - .collect(); - return Ok(exprs); - } - // Scan ALL subscript combinations to find any equation that needs hoisting. // The first element alone may be a constant override while later elements // use a default (or explicit equation) containing array-producing builtins. @@ -1914,41 +1717,9 @@ fn expand_a2a_with_hoisting( dims: &[Dimension], ast: &crate::ast::Expr2, off: usize, - lookup_only: bool, ) -> Result> { let active_dims = Arc::<[Dimension]>::from(dims.to_vec()); - // An apply-to-all lookup-only variable has ONE variable-level table - // (`variable.rs::build_tables` falls through to the single-gf branch, so - // `tables.len() == 1`) shared by every element. Every element must read - // that one table, so the LOOKUP wraps the BASE offset `off`, NOT `off + i`: - // `codegen::extract_table_info` derives `elem_off = off - base_off = 0` and - // `table_count == 1`, so element_offset 0 is in range for all elements. - // Wrapping `off + i` here would make `element_offset == i >= 1 == - // table_count` for i > 0, and the VM's `Lookup` opcode would push NaN. This - // is the deliberate inverse of the arrayed per-element case in - // `expand_arrayed_with_hoisting` (per-element tables, wrap `off + i`). - if lookup_only { - let exprs: Vec = SubscriptIterator::new(dims) - .enumerate() - .map(|(i, _)| { - let loc = Loc::default(); - Expr::AssignCurr( - off + i, - Box::new(Expr::App( - BuiltinFn::Lookup( - Box::new(Expr::Var(off, loc)), - Box::new(lookup_only_index_expr(loc)), - loc, - ), - loc, - )), - ) - }) - .collect(); - return Ok(exprs); - } - // Lower once using element 0's subscripts to detect the expression shape. let first_subscripts: Vec = SubscriptIterator::new(dims).next().unwrap_or_default(); let first_ctx = ctx.with_active_subscripts(active_dims.clone(), &first_subscripts); diff --git a/src/simlin-engine/src/db.rs b/src/simlin-engine/src/db.rs index 74eef73da..6dd0829d9 100644 --- a/src/simlin-engine/src/db.rs +++ b/src/simlin-engine/src/db.rs @@ -248,6 +248,41 @@ pub struct SourceVariable { pub compat: datamodel::Compat, } +/// Whether a source variable is a standalone lookup-only table: a +/// graphical-function holder with an empty-or-sentinel equation and no real +/// functional input. Such a variable is a *table indexed by an explicit input* +/// (`y = table(input)`), not a value-bearing variable -- it is excluded from the +/// runlist and produces no saved series (issue #606). This is the salsa-layer +/// twin of `crate::variable::var_is_lookup_only`, evaluated over the +/// `SourceEquation` + `SourceGraphicalFunction` representation; both delegate to +/// the shared `crate::variable::is_empty_or_sentinel` core (which also accepts +/// the legacy `"0+0"` sentinel for back-compat). +/// +/// Salsa-tracked so its `bool` output backdates: callers in tracked contexts +/// (`build_var_info` -> `model_dependency_graph`, `calc_flattened_offsets`) +/// must NOT gain a fine-grained dependency on a variable's equation TEXT, which +/// would invalidate the dependency graph on every unrelated equation edit. +#[salsa::tracked] +pub(crate) fn source_var_is_table_only(db: &dyn Db, var: SourceVariable) -> bool { + use crate::variable::is_empty_or_sentinel; + match var.equation(db) { + // Scalar / A2A: one equation string plus a variable-level gf. + SourceEquation::Scalar(s) | SourceEquation::ApplyToAll(_, s) => { + var.gf(db).is_some() && is_empty_or_sentinel(s) + } + // Arrayed: a pure per-element table holder iff it has tables (a + // variable-level or any per-element gf) and EVERY element equation (and + // the EXCEPT default, if any) is empty/sentinel. + SourceEquation::Arrayed(_, elements, default, _) => { + let has_tables = var.gf(db).is_some() || elements.iter().any(|e| e.gf.is_some()); + has_tables + && !elements.is_empty() + && elements.iter().all(|e| is_empty_or_sentinel(&e.equation)) + && default.as_deref().map(is_empty_or_sentinel).unwrap_or(true) + } + } +} + // ── Mirror types for salsa compatibility ─────────────────────────────── // // These types mirror the datamodel types but derive salsa::Update. @@ -861,6 +896,12 @@ pub struct VariableDeps { pub dt_previous_referenced_vars: BTreeSet, /// Variables referenced *only* through PREVIOUS(...) in the initial AST. pub initial_previous_referenced_vars: BTreeSet, + /// Standalone lookup tables referenced via `LOOKUP(table, x)`. These are + /// layout references (codegen needs the table's offset for its reverse-map) + /// but NOT data-flow dependencies, so they are kept out of `dt_deps` / + /// `initial_deps` (no runlist-ordering or causal/LTM edge) and reunited with + /// the dep set only by the fragment compiler's metadata/tables build (#606). + pub referenced_tables: BTreeSet, } fn canonical_module_input_set(module_input_names: &[String]) -> BTreeSet> { @@ -892,6 +933,8 @@ fn variable_direct_dependencies_impl( dt_init_only_referenced_vars: BTreeSet::new(), dt_previous_referenced_vars: BTreeSet::new(), initial_previous_referenced_vars: BTreeSet::new(), + // A module never references a lookup table via LOOKUP(...). + referenced_tables: BTreeSet::new(), } } _ => { @@ -945,6 +988,11 @@ fn variable_direct_dependencies_impl( dt_init_only_referenced_vars: dt_classification.init_only, dt_previous_referenced_vars: dt_classification.previous_only, initial_previous_referenced_vars: init_classification.previous_only, + referenced_tables: dt_classification + .referenced_tables + .into_iter() + .chain(init_classification.referenced_tables) + .collect(), } } } @@ -1276,7 +1324,7 @@ fn init_value_equivalence_group( use crate::builtins::{BuiltinContents, walk_builtin_expr}; let mut found = false; walk_builtin_expr(builtin, |c| { - if let BuiltinContents::Expr(inner) = c + if let BuiltinContents::Expr(inner) | BuiltinContents::LookupTable(inner) = c && !found { found = find_value_branches(inner, &mut *out); @@ -4362,6 +4410,10 @@ fn compile_implicit_var_phase_bytecodes( .dt_deps .iter() .chain(iv_deps.initial_deps.iter()) + // Lookup tables referenced by this implicit var are layout + // references, not data-flow deps -- include them so the fragment's + // metadata + tables map can resolve `LOOKUP(table, x)` (#606). + .chain(iv_deps.referenced_tables.iter()) .cloned() .collect() } else { @@ -5696,28 +5748,33 @@ fn calc_flattened_offsets_incremental( for ident in &sorted_names { let ident_canonical = Ident::new(ident.as_str()); - let size = - if let Some(svar) = source_vars.get(ident.as_str()) { - if svar.kind(db) == SourceVariableKind::Module { - let sub_model_name = svar.model_name(db); - let sub_offsets = - calc_flattened_offsets_incremental(db, project, sub_model_name, false); - let mut sub_var_names: Vec<&Ident> = sub_offsets.keys().collect(); - sub_var_names.sort_unstable(); - for sub_name in &sub_var_names { - let (sub_off, sub_size) = sub_offsets[*sub_name]; - offsets.insert( - Ident::join( - &ident_canonical.as_canonical_str(), - &sub_name.as_canonical_str(), - ), - (i + sub_off, sub_size), - ); - } - let sub_size: usize = sub_offsets.iter().map(|(_, (_, size))| size).sum(); - sub_size - } else { - let var_sz = variable_size(db, *svar, project); + let size = if let Some(svar) = source_vars.get(ident.as_str()) { + if svar.kind(db) == SourceVariableKind::Module { + let sub_model_name = svar.model_name(db); + let sub_offsets = + calc_flattened_offsets_incremental(db, project, sub_model_name, false); + let mut sub_var_names: Vec<&Ident> = sub_offsets.keys().collect(); + sub_var_names.sort_unstable(); + for sub_name in &sub_var_names { + let (sub_off, sub_size) = sub_offsets[*sub_name]; + offsets.insert( + Ident::join( + &ident_canonical.as_canonical_str(), + &sub_name.as_canonical_str(), + ), + (i + sub_off, sub_size), + ); + } + let sub_size: usize = sub_offsets.iter().map(|(_, (_, size))| size).sum(); + sub_size + } else { + let var_sz = variable_size(db, *svar, project); + // A lookup-only table is not a saved output variable: reserve + // its layout slot (so these offsets stay in lockstep with + // `compute_layout`, whose map codegen's table-identity + // reverse-map resolves against) but do NOT expose its name in + // this VM/Results map -- it produces no series (issue #606). + if !source_var_is_table_only(db, *svar) { if var_sz > 1 { // Array variable: produce per-element offsets let dims = variable_dimensions(db, *svar, project); @@ -5735,12 +5792,13 @@ fn calc_flattened_offsets_incremental( } else { offsets.insert(ident_canonical.clone(), (i, 1)); } - var_sz } - } else { - offsets.insert(ident_canonical.clone(), (i, 1)); - 1 - }; + var_sz + } + } else { + offsets.insert(ident_canonical.clone(), (i, 1)); + 1 + }; i += size; } diff --git a/src/simlin-engine/src/db_dep_graph.rs b/src/simlin-engine/src/db_dep_graph.rs index cf857397a..58358eab8 100644 --- a/src/simlin-engine/src/db_dep_graph.rs +++ b/src/simlin-engine/src/db_dep_graph.rs @@ -61,6 +61,10 @@ use crate::db::model_dependency_graph; pub(crate) struct VarInfo { pub(crate) is_stock: bool, pub(crate) is_module: bool, + /// A standalone lookup-only table: excluded from every runlist and from the + /// saved output, because it is a static table indexed by callers, not a + /// value-bearing variable (issue #606). + pub(crate) is_table_only: bool, pub(crate) dt_deps: BTreeSet, pub(crate) initial_deps: BTreeSet, } @@ -324,6 +328,7 @@ pub(crate) fn build_var_info( VarInfo { is_stock: kind == SourceVariableKind::Stock, is_module: kind == SourceVariableKind::Module, + is_table_only: crate::db::source_var_is_table_only(db, source_vars[name.as_str()]), dt_deps: normalize_deps(&dt_deps), initial_deps: normalize_deps(&initial_deps), }, @@ -353,6 +358,8 @@ pub(crate) fn build_var_info( VarInfo { is_stock: implicit.is_stock, is_module: implicit.is_module, + // Implicit SMOOTH/DELAY/TREND internals are never lookup tables. + is_table_only: false, dt_deps: normalize_deps(&dt_deps), initial_deps: normalize_deps(&initial_deps), }, @@ -2271,9 +2278,12 @@ pub(crate) fn model_dependency_graph_impl( var_info .get(n.as_str()) .map(|i| { - i.is_stock - || i.is_module - || (i.dt_deps.is_empty() && !i.initial_deps.is_empty()) + // A lookup-only table produces no value, so it is never + // an initials-phase variable (issue #606). + !i.is_table_only + && (i.is_stock + || i.is_module + || (i.dt_deps.is_empty() && !i.initial_deps.is_empty())) }) .unwrap_or(false) || all_init_referenced.contains(n.as_str()) @@ -2345,7 +2355,9 @@ pub(crate) fn model_dependency_graph_impl( let is_input = module_input_set.contains(canonicalize(n).as_ref()); var_info .get(n.as_str()) - .map(|i| is_input || !i.is_stock) + // A lookup-only table is a static table, not a flow: it is + // never lowered and emits no bytecode (issue #606). + .map(|i| (is_input || !i.is_stock) && !i.is_table_only) .unwrap_or(false) }) .collect(); diff --git a/src/simlin-engine/src/db_dep_graph_tests.rs b/src/simlin-engine/src/db_dep_graph_tests.rs index c30e26f54..4fc17bc30 100644 --- a/src/simlin-engine/src/db_dep_graph_tests.rs +++ b/src/simlin-engine/src/db_dep_graph_tests.rs @@ -36,6 +36,7 @@ fn vi_for_test(is_stock: bool, is_module: bool, dt_deps: &[&str]) -> VarInfo { VarInfo { is_stock, is_module, + is_table_only: false, dt_deps: dt_deps.iter().map(|s| (*s).to_string()).collect(), initial_deps: BTreeSet::new(), } @@ -135,6 +136,7 @@ fn vi_init_for_test(is_stock: bool, is_module: bool, initial_deps: &[&str]) -> V VarInfo { is_stock, is_module, + is_table_only: false, dt_deps: BTreeSet::new(), initial_deps: initial_deps.iter().map(|s| (*s).to_string()).collect(), } diff --git a/src/simlin-engine/src/db_implicit_deps.rs b/src/simlin-engine/src/db_implicit_deps.rs index 91f76d811..10f54a6ae 100644 --- a/src/simlin-engine/src/db_implicit_deps.rs +++ b/src/simlin-engine/src/db_implicit_deps.rs @@ -16,6 +16,11 @@ pub struct ImplicitVarDeps { pub dt_init_only_referenced_vars: BTreeSet, pub dt_previous_referenced_vars: BTreeSet, pub initial_previous_referenced_vars: BTreeSet, + /// Lookup tables referenced via `LOOKUP(table, x)` -- layout references, not + /// data-flow deps. Kept out of `dt_deps`/`initial_deps` (no ordering/causal + /// edge) but needed by the implicit-var fragment compiler's metadata + + /// tables map (issue #606). Mirrors `VariableDeps::referenced_tables`. + pub referenced_tables: BTreeSet, } pub(super) fn extract_implicit_var_deps( @@ -63,6 +68,8 @@ pub(super) fn extract_implicit_var_deps( dt_init_only_referenced_vars: BTreeSet::new(), dt_previous_referenced_vars: BTreeSet::new(), initial_previous_referenced_vars: BTreeSet::new(), + // A module never references a lookup table via LOOKUP(...). + referenced_tables: BTreeSet::new(), }; } @@ -115,6 +122,11 @@ pub(super) fn extract_implicit_var_deps( dt_init_only_referenced_vars: dt_classification.init_only, dt_previous_referenced_vars: dt_classification.previous_only, initial_previous_referenced_vars: init_classification.previous_only, + referenced_tables: dt_classification + .referenced_tables + .into_iter() + .chain(init_classification.referenced_tables) + .collect(), } }) .collect() diff --git a/src/simlin-engine/src/db_ltm_ir.rs b/src/simlin-engine/src/db_ltm_ir.rs index da76c6b1c..c40de04be 100644 --- a/src/simlin-engine/src/db_ltm_ir.rs +++ b/src/simlin-engine/src/db_ltm_ir.rs @@ -436,6 +436,10 @@ fn walk_all_in_expr( child_in_reducer, sites, ), + // A graphical-function table reference is static data, not a + // causal edge: emit no `from -> consumer` reference site for the + // table itself (only the index argument carries real edges). + BuiltinContents::LookupTable(_) => {} }); } Expr2::Op1(_, operand, _, _) => { diff --git a/src/simlin-engine/src/db_var_fragment.rs b/src/simlin-engine/src/db_var_fragment.rs index 40cefafdd..51461bc35 100644 --- a/src/simlin-engine/src/db_var_fragment.rs +++ b/src/simlin-engine/src/db_var_fragment.rs @@ -210,11 +210,16 @@ fn collect_var_dependencies( existing_keys.insert(Ident::new("final_time")); } - // Collect all dep names from both dt and initial deps + // Collect all dep names from both dt and initial deps, plus the lookup + // tables this variable references. A table reference is a layout reference + // (codegen needs the table's offset for the reverse-map), not a data-flow + // dependency, so it lives in `referenced_tables` rather than the dep sets + // (issue #606); the fragment metadata still needs it. let all_dep_names: BTreeSet<&String> = deps .dt_deps .iter() .chain(deps.initial_deps.iter()) + .chain(deps.referenced_tables.iter()) .collect(); // For each dep, build a dimension-only Variable for context. @@ -589,6 +594,47 @@ pub(crate) fn lower_var_fragment( // branches of `if isModuleInput(...)` remain compilable in the mini-context. let deps = variable_direct_dependencies_with_context(db, var, project, module_ident_context); + // A bare reference to a standalone lookup-only table -- the table used as a + // value rather than called via `LOOKUP(table, x)` -- has no scalar value of + // its own and is rejected (issue #606). After the table-reference / + // data-flow-dependency split (`referenced_tables`), a lookup-only table can + // ONLY reach `dt_deps` / `initial_deps` via such a bare `Var(table)` + // reference (a real call lands in `referenced_tables`), so its presence in + // the dependency sets is exactly this error. + { + let source_vars = model.variables(db); + let referenced: BTreeSet<&String> = deps + .dt_deps + .iter() + .chain(deps.initial_deps.iter()) + .collect(); + let bare_table_diags: Vec = referenced + .into_iter() + .filter_map(|dep| { + let dep_sv = source_vars.get(dep.as_str())?; + crate::db::source_var_is_table_only(db, *dep_sv).then(|| Diagnostic { + model: model.name(db).clone(), + variable: Some(var.ident(db).clone()), + error: DiagnosticError::Model(crate::common::Error::new( + crate::common::ErrorKind::Model, + crate::common::ErrorCode::LookupReferencedWithoutArgument, + Some(format!( + "'{dep}' is a lookup table and must be called with an \ + argument, e.g. {dep}(x) or LOOKUP({dep}, x)" + )), + )), + severity: DiagnosticSeverity::Error, + }) + }) + .collect(); + if !bare_table_diags.is_empty() { + return LoweredVarFragment::Fatal { + unit_diags, + fatal_diags: bare_table_diags, + }; + } + } + let project_models = project.models(db); // Lower the variable for compilation. Module-type variables need @@ -630,6 +676,7 @@ pub(crate) fn lower_var_fragment( .dt_deps .iter() .chain(deps.initial_deps.iter()) + .chain(deps.referenced_tables.iter()) .collect(); for dep_name in &dep_names { let effective = dep_name @@ -916,6 +963,7 @@ pub(crate) fn lower_var_fragment( .dt_deps .iter() .chain(deps.initial_deps.iter()) + .chain(deps.referenced_tables.iter()) .collect(); let mut extra_dep_names: Vec = Vec::new(); if var.kind(db) == SourceVariableKind::Stock { diff --git a/src/simlin-engine/src/lookup_only_tests.rs b/src/simlin-engine/src/lookup_only_tests.rs index 9b848d570..97792e23d 100644 --- a/src/simlin-engine/src/lookup_only_tests.rs +++ b/src/simlin-engine/src/lookup_only_tests.rs @@ -6,40 +6,36 @@ // This is a test orchestrator: it builds datamodel projects (data), drives the // salsa incremental compile + the bytecode VM (I/O-ish orchestration), and // asserts the resulting saved series. The pure logic under test lives in -// `compiler::Var::new` / `compiler::is_lookup_only`. +// `variable::var_is_lookup_only` / `db::source_var_is_table_only`. -//! Standalone graphical-function ("lookup-only") variable saved-value -//! semantics (#590). +//! Standalone graphical-function ("lookup-only") variable semantics (#606). //! -//! A *lookup-only* variable is one whose entire equation is an inline -//! graphical function with no functional argument (e.g. Vensim `g( (x,y)... )`, -//! imported as `equation = LOOKUP_SENTINEL` + a `gf`). These tests pin that -//! such a variable produces the same saved series across the scalar, arrayed -//! (`Equation::Arrayed`), and apply-to-all shapes, and that the fix does not -//! perturb a *applied* lookup (`out = LOOKUP(g, idx)`). +//! A *lookup-only* variable is one whose entire equation is a graphical function +//! with no functional argument (a `` with an empty equation, or the legacy +//! MDL `LOOKUP_SENTINEL` form). Such a variable is a **table indexed by an +//! explicit input** (`y = table(input)`), NOT a value-bearing variable: it is a +//! static table consulted by callers, so it has no time series of its own. //! -//! ## Phase 1 determination: the standalone index is `gf(Time)` +//! These tests pin that contract: +//! * a lookup-only variable produces **no saved series** (it is absent from +//! the results) across the scalar, apply-to-all, and arrayed shapes; +//! * a consumer that *calls* it with an argument (`LOOKUP(g, x)`) still reads +//! the table at that argument -- and two calls at different arguments are +//! independent (the table is not a single stored value); +//! * referencing a lookup-only table *bare* (no argument) is a compile error; +//! * the legacy `"0+0"` sentinel form is still recognized (back-compat). //! -//! Genuine Vensim evaluates a standalone lookup-only variable's table at the -//! current simulation time -- `gf(Time)` -- NOT at a constant `0` (the engine's -//! prior scalar behavior) and NOT as a literal `0` (the engine's prior -//! arrayed/A2A behavior). This was confirmed empirically: applied consumers of -//! C-LEARN's lookup-only tables byte-match `gf(Time)` against `Ref.vdf` (the -//! table y-values stepped by year, with clamp-to-endpoint outside the -//! x-domain), and the codebase's own `MdlEquation::Implicit` arm -//! (`mdl/convert/variables.rs`) already lowers an unspecified-input gf to -//! `equation = "TIME"`. So `index_expr = Expr::App(BuiltinFn::Time, loc)`. -//! -//! Each fixture below keys its table(s) on `x = [2000, 2001, 2002]` and runs -//! the sim over `INITIAL TIME = 2000 .. FINAL TIME = 2002` at `dt = 1`, so -//! `gf(Time)` yields the table's y-values step by step: `[y@2000, y@2001, -//! y@2002]`. The y-values are chosen distinct and element-identifying so a -//! wrong index (a constant `gf(0)` clamps below the x-domain to `y@2000`, -//! giving a flat `[y@2000, y@2000, y@2000]`) or a zeroed series (literal `0`) -//! is unambiguously detectable. +//! Each fixture keys its table(s) on `x = [2000, 2001, 2002]` and runs the sim +//! over `INITIAL TIME = 2000 .. FINAL TIME = 2002` at `dt = 1`, so a consumer's +//! `LOOKUP(g, Time)` yields the table's y-values step by step: `[y@2000, +//! y@2001, y@2002]`, with clamp-to-endpoint outside the x-domain. +use crate::common::ErrorCode; use crate::datamodel; -use crate::db::{SimlinDb, compile_project_incremental, sync_from_datamodel_incremental}; +use crate::db::{ + DiagnosticError, DiagnosticSeverity, SimlinDb, collect_all_diagnostics, + compile_project_incremental, sync_from_datamodel, sync_from_datamodel_incremental, +}; use crate::mdl::LOOKUP_SENTINEL; use crate::vm::Vm; @@ -67,7 +63,7 @@ fn year_gf(y2000: f64, y2001: f64, y2002: f64) -> datamodel::GraphicalFunction { } /// Sim window aligned to the `year_gf` x-domain: three save steps at -/// `Time = 2000, 2001, 2002` (dt = 1), so `gf(Time)` walks the table's +/// `Time = 2000, 2001, 2002` (dt = 1), so `LOOKUP(g, Time)` walks the table's /// y-values one per step. fn year_specs() -> datamodel::SimSpecs { datamodel::SimSpecs { @@ -80,12 +76,17 @@ fn year_specs() -> datamodel::SimSpecs { } } -fn aux_lookup_only(ident: &str, gf: datamodel::GraphicalFunction) -> datamodel::Variable { +/// A standalone lookup-only aux: a variable-level graphical function with the +/// canonical empty equation (no functional input). `equation_text` lets a test +/// pin the legacy `"0+0"` sentinel form instead. +fn aux_lookup_only_with_equation( + ident: &str, + equation_text: &str, + gf: datamodel::GraphicalFunction, +) -> datamodel::Variable { datamodel::Variable::Aux(datamodel::Aux { ident: ident.to_string(), - // Standalone lookup-only: the sentinel equation (no functional input) - // plus a variable-level graphical function. - equation: datamodel::Equation::Scalar(LOOKUP_SENTINEL.to_string()), + equation: datamodel::Equation::Scalar(equation_text.to_string()), documentation: String::new(), units: None, gf: Some(gf), @@ -95,6 +96,25 @@ fn aux_lookup_only(ident: &str, gf: datamodel::GraphicalFunction) -> datamodel:: }) } +/// The canonical lookup-only form: a `` with an empty equation. +fn aux_lookup_only(ident: &str, gf: datamodel::GraphicalFunction) -> datamodel::Variable { + aux_lookup_only_with_equation(ident, "", gf) +} + +/// An ordinary scalar aux with a real equation. +fn aux(ident: &str, equation: &str) -> datamodel::Variable { + datamodel::Variable::Aux(datamodel::Aux { + ident: ident.to_string(), + equation: datamodel::Equation::Scalar(equation.to_string()), + documentation: String::new(), + units: None, + gf: None, + ai_state: None, + uid: None, + compat: datamodel::Compat::default(), + }) +} + fn project_with( name: &str, dimensions: Vec, @@ -121,7 +141,8 @@ fn project_with( /// Compile + run a project via the incremental salsa path and return every /// variable's full saved series keyed by (canonicalized) name -- including -/// arrayed per-element keys `name[elem]`. +/// arrayed per-element keys `name[elem]`. A lookup-only table is NOT a saved +/// variable, so it never appears as a key here. fn run_series(project: &datamodel::Project) -> std::collections::HashMap> { let mut db = SimlinDb::default(); let sync = sync_from_datamodel_incremental(&mut db, project, None); @@ -133,6 +154,14 @@ fn run_series(project: &datamodel::Project) -> std::collections::HashMap Vec { + let db = SimlinDb::default(); + let sync = sync_from_datamodel(&db, project); + collect_all_diagnostics(&db, &sync) +} + fn series_of<'a>(series: &'a std::collections::HashMap>, key: &str) -> &'a [f64] { series .get(key) @@ -154,63 +183,180 @@ fn assert_series_eq(actual: &[f64], expected: &[f64], what: &str) { } } -/// AC1.1: a SCALAR lookup-only variable is evaluated at `gf(Time)`, producing -/// the per-year table values `[y@2000, y@2001, y@2002]` -- NOT a constant -/// `gf(0)` (which clamps below the x-domain to a flat `[y@2000, y@2000, -/// y@2000]`). +/// A standalone scalar lookup-only table produces NO saved series of its own, +/// and a consumer that calls it with an argument reads the table at that +/// argument. `out = LOOKUP(g, Time)` walks `[11, 22, 33]`; `g` itself is absent. #[test] -fn scalar_lookup_only_evaluates_at_time() { +fn scalar_lookup_only_has_no_series_consumer_reads_table() { let project = project_with( "scalar_lookup_only", vec![], - vec![aux_lookup_only("g", year_gf(11.0, 22.0, 33.0))], + vec![ + aux_lookup_only("g", year_gf(11.0, 22.0, 33.0)), + aux("out", "LOOKUP(g, Time)"), + ], + ); + + let series = run_series(&project); + assert!( + !series.contains_key("g"), + "a standalone lookup-only table is not a value-bearing variable and must \ + produce no saved series; got keys {:?}", + series.keys() + ); + assert_series_eq( + series_of(&series, "out"), + &[11.0, 22.0, 33.0], + "consumer LOOKUP(g, Time) must read the table at Time", + ); +} + +/// Two calls to the same lookup-only table at different arguments are +/// independent intermediate expressions -- there is no single stored value. +/// `gap = LOOKUP(g, Time) - LOOKUP(g, Time - 1)`: +/// @2000: gf(2000) - gf(1999 clamped to 2000) = 11 - 11 = 0 +/// @2001: gf(2001) - gf(2000) = 22 - 11 = 11 +/// @2002: gf(2002) - gf(2001) = 33 - 22 = 11 +#[test] +fn lookup_only_independent_calls_per_site() { + let project = project_with( + "lookup_only_gap", + vec![], + vec![ + aux_lookup_only("g", year_gf(11.0, 22.0, 33.0)), + aux("gap", "LOOKUP(g, Time) - LOOKUP(g, Time - 1)"), + ], + ); + + let series = run_series(&project); + assert!( + !series.contains_key("g"), + "lookup-only table must have no series" + ); + assert_series_eq( + series_of(&series, "gap"), + &[0.0, 11.0, 11.0], + "two LOOKUP calls at different indices must evaluate independently", + ); +} + +/// AC1.4 (no regression): a lookup-only table *applied* with a real argument +/// distinct from Time (`out = LOOKUP(g, idx)`, `idx = 2001`) follows that +/// argument (a flat `gf(2001) = 22`). The table `g` itself still has no series. +#[test] +fn applied_lookup_only_uses_argument_not_time() { + let project = project_with( + "applied_lookup_only", + vec![], + vec![ + aux_lookup_only("g", year_gf(11.0, 22.0, 33.0)), + aux("idx", "2001"), + aux("out", "LOOKUP(g, idx)"), + ], ); let series = run_series(&project); + assert!( + !series.contains_key("g"), + "lookup-only table must have no series" + ); + assert_series_eq( + series_of(&series, "out"), + &[22.0, 22.0, 22.0], + "applied LOOKUP(g, idx) must follow idx (gf(2001) = 22), not Time", + ); +} + +/// The legacy MDL `"0+0"` sentinel form of a lookup-only variable is still +/// recognized as a table (back-compat read-shim) and likewise produces no +/// series; its consumer still reads the table. +#[test] +fn legacy_sentinel_lookup_only_has_no_series() { + let project = project_with( + "legacy_sentinel_lookup_only", + vec![], + vec![ + aux_lookup_only_with_equation("g", LOOKUP_SENTINEL, year_gf(11.0, 22.0, 33.0)), + aux("out", "LOOKUP(g, Time)"), + ], + ); + + let series = run_series(&project); + assert!( + !series.contains_key("g"), + "a legacy '0+0'-sentinel lookup-only table must also have no series" + ); assert_series_eq( - series_of(&series, "g"), + series_of(&series, "out"), &[11.0, 22.0, 33.0], - "scalar lookup-only g must evaluate at gf(Time), not a constant gf(0) \ - (which would be a flat [11, 11, 11])", + "consumer of a legacy-sentinel lookup-only table still reads it", + ); +} + +/// An apply-to-all lookup-only variable (one shared variable-level table) +/// produces no per-element series. An `anchor` aux keeps the model non-degenerate. +#[test] +fn a2a_lookup_only_has_no_series() { + let g = datamodel::Variable::Aux(datamodel::Aux { + ident: "g".to_string(), + equation: datamodel::Equation::ApplyToAll(vec!["Dim".to_string()], String::new()), + documentation: String::new(), + units: None, + gf: Some(year_gf(7.0, 8.0, 9.0)), + ai_state: None, + uid: None, + compat: datamodel::Compat::default(), + }); + + let project = project_with( + "a2a_lookup_only", + vec![datamodel::Dimension::named( + "Dim".to_string(), + vec!["P".to_string(), "Q".to_string(), "R".to_string()], + )], + vec![g, aux("anchor", "1")], + ); + + let series = run_series(&project); + for elem in ["p", "q", "r"] { + assert!( + !series.contains_key(&format!("g[{elem}]")), + "A2A lookup-only element {elem} must have no series; got keys {:?}", + series.keys() + ); + } + assert_series_eq( + series_of(&series, "anchor"), + &[1.0, 1.0, 1.0], + "anchor present", ); } -/// AC1.2 + AC1.5: an ARRAYED (`Equation::Arrayed`) lookup-only variable, with a -/// NON-alphabetical declared element order, evaluates each element's OWN table -/// at `gf(Time)` (per-element series), NOT a literal `0`. The `elems` Vec is -/// given in alphabetical order (the order the MDL importer's sort produces) -/// while the dimension is declared `Z, A, M`, so a positional table mis-map -/// would swap the per-element series -- pinning the per-element-GF layout -/// invariant for the standalone case too. +/// An arrayed (`Equation::Arrayed`) lookup-only variable with per-element tables +/// produces no per-element series. #[test] -fn arrayed_lookup_only_non_sorted_order_evaluates_own_table_at_time() { - // Declared order Z, A, M (NOT alphabetical). Each element's table carries - // an element-identifying triple: - // Z -> [100, 200, 300], A -> [1000, 2000, 3000], M -> [10, 20, 30]. - // `Equation::Arrayed` element tuples: (element name, equation, EXCEPT - // default, per-element gf). `ElementName` is a `String` alias. +fn arrayed_lookup_only_has_no_series() { let arrayed_elements: Vec<( String, String, Option, Option, )> = vec![ - // elems Vec in ALPHABETICAL order (A, M, Z), each lookup-only. ( "A".to_string(), - LOOKUP_SENTINEL.to_string(), + String::new(), None, Some(year_gf(1000.0, 2000.0, 3000.0)), ), ( "M".to_string(), - LOOKUP_SENTINEL.to_string(), + String::new(), None, Some(year_gf(10.0, 20.0, 30.0)), ), ( "Z".to_string(), - LOOKUP_SENTINEL.to_string(), + String::new(), None, Some(year_gf(100.0, 200.0, 300.0)), ), @@ -238,105 +384,37 @@ fn arrayed_lookup_only_non_sorted_order_evaluates_own_table_at_time() { "Dim".to_string(), vec!["Z".to_string(), "A".to_string(), "M".to_string()], )], - vec![g], - ); - - let series = run_series(&project); - // Each element reads its OWN table (declared index Z=0, A=1, M=2) at - // gf(Time). A literal-0 lowering would give [0, 0, 0]; a positional - // mis-map would swap Z and A's series. - assert_series_eq( - series_of(&series, "g[z]"), - &[100.0, 200.0, 300.0], - "arrayed lookup-only element Z (declared index 0) at gf(Time)", - ); - assert_series_eq( - series_of(&series, "g[a]"), - &[1000.0, 2000.0, 3000.0], - "arrayed lookup-only element A (declared index 1) at gf(Time)", - ); - assert_series_eq( - series_of(&series, "g[m]"), - &[10.0, 20.0, 30.0], - "arrayed lookup-only element M (declared index 2) at gf(Time)", - ); -} - -/// AC1.3: an apply-to-all lookup-only variable (one variable-level table shared -/// by every element) evaluates that single shared table at `gf(Time)` for every -/// element, NOT a literal `0`. Critically, every element must read the BASE -/// table offset (table_count == 1); a per-element `off + i` would push the VM's -/// Lookup bounds check out of range for i > 0 and return NaN. -#[test] -fn a2a_lookup_only_shares_one_table_at_time() { - let g = datamodel::Variable::Aux(datamodel::Aux { - ident: "g".to_string(), - equation: datamodel::Equation::ApplyToAll( - vec!["Dim".to_string()], - LOOKUP_SENTINEL.to_string(), - ), - documentation: String::new(), - units: None, - // One variable-level GF, shared by every element. - gf: Some(year_gf(7.0, 8.0, 9.0)), - ai_state: None, - uid: None, - compat: datamodel::Compat::default(), - }); - - let project = project_with( - "a2a_lookup_only", - vec![datamodel::Dimension::named( - "Dim".to_string(), - vec!["P".to_string(), "Q".to_string(), "R".to_string()], - )], - vec![g], + vec![g, aux("anchor", "1")], ); let series = run_series(&project); - for elem in ["p", "q", "r"] { - assert_series_eq( - series_of(&series, &format!("g[{elem}]")), - &[7.0, 8.0, 9.0], - &format!( - "A2A lookup-only element {elem} must read the single shared table at \ - gf(Time) (a literal 0 -> [0,0,0]; an off+i bounds overflow -> NaN)" - ), + for elem in ["z", "a", "m"] { + assert!( + !series.contains_key(&format!("g[{elem}]")), + "arrayed lookup-only element {elem} must have no series; got keys {:?}", + series.keys() ); } + assert_series_eq( + series_of(&series, "anchor"), + &[1.0, 1.0, 1.0], + "anchor present", + ); } -/// AC1.2 (layout robustness): an ARRAYED (`Equation::Arrayed`) lookup-only -/// variable that carries a SINGLE *variable-level* gf (no per-element gfs) must -/// have every element read that one shared table at `gf(Time)`. -/// -/// This is the symmetric twin of the A2A bounds hazard. `variable.rs::build_tables` -/// only produces a per-element table layout (`tables.len() == n`) when at least -/// one element carries its OWN gf; an `Equation::Arrayed` with empty/sentinel -/// element equations and a single *variable-level* gf falls through to the -/// shared-table branch (`tables.len() == 1`), exactly like A2A. So the arrayed -/// lookup-only lowering must wrap the BASE offset (`elem_off = 0`, -/// `table_count = 1`) here, NOT `off + i` -- a per-element `off + i` would make -/// `element_offset == i >= 1 == table_count` for i > 0 and the VM's `Lookup` -/// opcode would push NaN for every element after the first. -/// -/// The MDL/XMILE importers never emit this exact shape (they attach per-element -/// gfs to an arrayed lookup), but `datamodel::Project` is a public compile input -/// (protobuf/JSON/serde/MCP/pysimlin/libsimlin all preserve it verbatim), so a -/// directly-constructed project can reach it and must compile correctly. +/// An arrayed lookup-only variable carrying a single variable-level gf (shared +/// by every element) likewise produces no per-element series. #[test] -fn arrayed_lookup_only_variable_level_gf_shares_one_table_at_time() { - // `Equation::Arrayed` with empty/sentinel element equations and NO - // per-element gfs, plus a single variable-level gf shared by every element. +fn arrayed_lookup_only_variable_level_gf_has_no_series() { let arrayed_elements: Vec<( String, String, Option, Option, )> = vec![ - ("P".to_string(), LOOKUP_SENTINEL.to_string(), None, None), - ("Q".to_string(), LOOKUP_SENTINEL.to_string(), None, None), - ("R".to_string(), LOOKUP_SENTINEL.to_string(), None, None), + ("P".to_string(), String::new(), None, None), + ("Q".to_string(), String::new(), None, None), + ("R".to_string(), String::new(), None, None), ]; let g = datamodel::Variable::Aux(datamodel::Aux { @@ -349,7 +427,6 @@ fn arrayed_lookup_only_variable_level_gf_shares_one_table_at_time() { ), documentation: String::new(), units: None, - // One variable-level GF, shared by every element (tables.len() == 1). gf: Some(year_gf(7.0, 8.0, 9.0)), ai_state: None, uid: None, @@ -362,68 +439,52 @@ fn arrayed_lookup_only_variable_level_gf_shares_one_table_at_time() { "Dim".to_string(), vec!["P".to_string(), "Q".to_string(), "R".to_string()], )], - vec![g], + vec![g, aux("anchor", "1")], ); let series = run_series(&project); for elem in ["p", "q", "r"] { - assert_series_eq( - series_of(&series, &format!("g[{elem}]")), - &[7.0, 8.0, 9.0], - &format!( - "arrayed lookup-only element {elem} with a single variable-level gf must \ - read the one shared table at gf(Time) (an off+i bounds overflow -> NaN \ - for every element after the first)" - ), + assert!( + !series.contains_key(&format!("g[{elem}]")), + "arrayed (variable-level gf) lookup-only element {elem} must have no series" ); } + assert_series_eq( + series_of(&series, "anchor"), + &[1.0, 1.0, 1.0], + "anchor present", + ); } -/// AC1.4 (no regression): a graphical-function variable that is *applied* with -/// an argument elsewhere (`out = LOOKUP(g, idx)` where `idx` is a real input -/// distinct from Time) still produces the applied value `gf(idx)`. The -/// standalone-lookup fix only changes which expression `g` itself is evaluated -/// at; it must not perturb the applied path. We assert BOTH halves on one -/// model: `out` follows `idx` (constant 2001 -> a flat `gf(2001) = 22`), while -/// the standalone `g` follows `gf(Time)` (`[11, 22, 33]`). +/// Referencing a lookup-only table *bare* -- without applying it to an argument +/// -- is a compile error: a table has no scalar value of its own. #[test] -fn applied_lookup_only_uses_argument_not_time_no_regression() { - let g = aux_lookup_only("g", year_gf(11.0, 22.0, 33.0)); - // `idx` is a real input distinct from Time: a constant 2001. So - // out = LOOKUP(g, 2001) = gf(2001) = 22 at every step. - let idx = datamodel::Variable::Aux(datamodel::Aux { - ident: "idx".to_string(), - equation: datamodel::Equation::Scalar("2001".to_string()), - documentation: String::new(), - units: None, - gf: None, - ai_state: None, - uid: None, - compat: datamodel::Compat::default(), - }); - let out = datamodel::Variable::Aux(datamodel::Aux { - ident: "out".to_string(), - equation: datamodel::Equation::Scalar("LOOKUP(g, idx)".to_string()), - documentation: String::new(), - units: None, - gf: None, - ai_state: None, - uid: None, - compat: datamodel::Compat::default(), - }); - - let project = project_with("applied_lookup_only", vec![], vec![g, idx, out]); - - let series = run_series(&project); - assert_series_eq( - series_of(&series, "out"), - &[22.0, 22.0, 22.0], - "applied LOOKUP(g, idx) must follow idx (gf(2001) = 22), not Time -- \ - the standalone fix must not perturb the applied path", +fn bare_reference_to_lookup_only_is_compile_error() { + let project = project_with( + "bare_lookup_reference", + vec![], + vec![ + aux_lookup_only("g", year_gf(11.0, 22.0, 33.0)), + // `y = g` references the table bare, with no argument. + aux("y", "g"), + ], ); - assert_series_eq( - series_of(&series, "g"), - &[11.0, 22.0, 33.0], - "standalone g still evaluates at gf(Time) even when applied elsewhere", + + let diags = diagnostics_of(&project); + let has_bare_ref_error = diags.iter().any(|d| { + d.variable.as_deref() == Some("y") + && matches!( + &d.error, + DiagnosticError::Model(crate::common::Error { + code: ErrorCode::LookupReferencedWithoutArgument, + .. + }) + ) + && d.severity == DiagnosticSeverity::Error + }); + assert!( + has_bare_ref_error, + "a bare reference to a lookup-only table must raise \ + LookupReferencedWithoutArgument for 'y'; got: {diags:?}" ); } diff --git a/src/simlin-engine/src/mdl/convert/variables.rs b/src/simlin-engine/src/mdl/convert/variables.rs index 490afd7eb..9e13ed956 100644 --- a/src/simlin-engine/src/mdl/convert/variables.rs +++ b/src/simlin-engine/src/mdl/convert/variables.rs @@ -663,8 +663,13 @@ impl<'input> ConversionContext<'input> { } Ok((eq_str, None, None)) } + // A bare lookup table imports as the canonical lookup-only form: an + // empty equation + a graphical function (no functional input). It is + // a static table indexed by callers, not a value-bearing variable + // (issue #606). The legacy `"0+0"` sentinel is still ACCEPTED on read + // (see `variable::is_empty_or_sentinel`) for already-serialized models. MdlEquation::Lookup(_, table) => Ok(( - LOOKUP_SENTINEL.to_string(), + String::new(), None, Some(self.build_graphical_function(var_name, table)), )), @@ -945,7 +950,8 @@ impl<'input> ConversionContext<'input> { } MdlEquation::Lookup(lhs, table) => { let gf = self.build_graphical_function(&lhs.name, table); - let (equation, compat) = self.make_equation(lhs, "0+0"); + // Canonical lookup-only form: empty equation + gf (issue #606). + let (equation, compat) = self.make_equation(lhs, ""); Ok((equation, compat, Some(gf))) } MdlEquation::WithLookup(lhs, input, table) => { @@ -2446,8 +2452,9 @@ x = y * 2 } #[test] - fn test_lookup_only_scalar_emits_0_plus_0() { - // A scalar variable defined only with a lookup table should have "0+0" as equation + fn test_lookup_only_scalar_emits_empty_equation() { + // A scalar variable defined only with a lookup table imports as the + // canonical lookup-only form: an empty equation + a gf (issue #606). let mdl = "x( (0,0),(1,1) ) ~ ~| \\\\\\---/// @@ -2465,7 +2472,10 @@ x = y * 2 if let Some(Variable::Aux(a)) = x { match &a.equation { Equation::Scalar(eq) => { - assert_eq!(eq, "0+0", "Lookup-only variable should have 0+0 equation"); + assert_eq!( + eq, "", + "lookup-only variable should have an empty equation (canonical form)" + ); } other => panic!("Expected Scalar equation, got {:?}", other), } @@ -2476,8 +2486,9 @@ x = y * 2 } #[test] - fn test_lookup_only_arrayed_emits_0_plus_0() { - // An arrayed variable with element-specific lookups should have "0+0" for each element + fn test_lookup_only_arrayed_emits_empty_equation() { + // An arrayed variable with element-specific lookups imports each element + // as the canonical lookup-only form: an empty equation + a gf (#606). let mdl = "DimA: a1, a2 ~ ~| x[a1]( (0,0),(1,1) ) @@ -2503,8 +2514,8 @@ x[a2]( (0,0),(2,2) ) assert_eq!(elements.len(), 2); for (_, eq, _, _) in elements { assert_eq!( - eq, "0+0", - "Lookup-only arrayed element should have 0+0 equation" + eq, "", + "lookup-only arrayed element should have an empty equation" ); } } diff --git a/src/simlin-engine/src/mdl/mod.rs b/src/simlin-engine/src/mdl/mod.rs index 0f72d1556..90bd78644 100644 --- a/src/simlin-engine/src/mdl/mod.rs +++ b/src/simlin-engine/src/mdl/mod.rs @@ -34,10 +34,17 @@ use crate::datamodel::{Project, Variable}; use convert::convert_mdl_with_data; use writer::MdlWriter; -/// Sentinel equation produced by the MDL parser for variables that are -/// pure lookup definitions (no input expression) or have an empty RHS. -/// The writer recognises this to emit native Vensim `name(body)` syntax -/// instead of `name = WITH LOOKUP(input, body)`. +/// Sentinel equation `"0+0"` used by the MDL converter for a variable with an +/// empty RHS (`MdlEquation::EmptyRhs`) -- a defined-but-unspecified variable +/// that should evaluate to 0 rather than error. +/// +/// A bare lookup definition (`MdlEquation::Lookup`) now imports as the canonical +/// lookup-only form -- an EMPTY equation + a graphical function -- NOT this +/// sentinel (issue #606). The sentinel is still ACCEPTED on read as a +/// lookup-only marker (`variable::is_empty_or_sentinel`) for models already +/// serialized with it. The writer recognises both the empty and sentinel forms +/// to emit native Vensim `name(body)` syntax instead of +/// `name = WITH LOOKUP(input, body)`. pub(crate) const LOOKUP_SENTINEL: &str = "0+0"; /// Convert a Project to Vensim MDL text. diff --git a/src/simlin-engine/src/mdl/writer.rs b/src/simlin-engine/src/mdl/writer.rs index 854b50b2d..ae3594612 100644 --- a/src/simlin-engine/src/mdl/writer.rs +++ b/src/simlin-engine/src/mdl/writer.rs @@ -819,18 +819,14 @@ fn write_lookup(buf: &mut String, gf: &GraphicalFunction) { buf.push(')'); } -/// Returns true when the equation text is a placeholder sentinel rather -/// than a real input expression (standalone lookup definition). -/// -/// The MDL parser produces [`LOOKUP_SENTINEL`](super::LOOKUP_SENTINEL) -/// when a variable is defined as a pure lookup (no input expression) -- -/// see `MdlEquation::Lookup` in `convert/variables.rs`. Vensim's native -/// representation is `name(body)` rather than `name = WITH LOOKUP(input, -/// body)`. An empty string covers XMILE variables that have a graphical -/// function but no equation. +/// Returns true when the equation text is the canonical empty lookup-only form +/// (a graphical function with no input expression) or the legacy `"0+0"` +/// sentinel. Such a variable is a standalone lookup definition, written in +/// Vensim's native `name(body)` syntax rather than `name = WITH LOOKUP(input, +/// body)`. Delegates to the shared predicate so the writer, the MDL converter, +/// and the compiler all agree on what counts as lookup-only (issue #606). fn is_lookup_only_equation(eqn: &str) -> bool { - let trimmed = eqn.trim(); - trimmed.is_empty() || trimmed == super::LOOKUP_SENTINEL + crate::variable::is_empty_or_sentinel(eqn) } /// Format f64 for MDL output: omit trailing `.0` for whole numbers. diff --git a/src/simlin-engine/src/variable.rs b/src/simlin-engine/src/variable.rs index 1bd00b132..0d133c704 100644 --- a/src/simlin-engine/src/variable.rs +++ b/src/simlin-engine/src/variable.rs @@ -427,6 +427,112 @@ fn build_tables( (tables, errors) } +/// Pure predicate: a "lookup-only" variable is a standalone graphical-function +/// holder -- it carries a graphical function (`has_tables`) and its equation is +/// empty or the MDL lookup sentinel (i.e. there is no functional input +/// expression to feed the table). Such a variable is a *table indexed by an +/// explicit input* (`y = table(input)`), not a runtime value-bearing variable: +/// it produces no simulation series (see `Variable::Var::is_table_only`). +/// +/// Mirrors `mdl::writer::is_lookup_only_equation`'s "empty or sentinel" rule so +/// that BOTH the canonical empty-equation form (now emitted by both the XMILE +/// and MDL importers) and a legacy MDL-sourced one (the `LOOKUP_SENTINEL` +/// equation alongside a `gf`) are detected. New imports emit the empty form; the +/// `"0+0"` arm is a back-compat read-shim for models already serialized with it. +/// +/// Returns `false` for WITH LOOKUP (`var = WITH LOOKUP(input, table)`: tables +/// present but a *real* input equation) and for an ordinary aux (a real +/// equation, no tables). The `has_tables` guard also keeps an empty-RHS aux (no +/// gf) off this path. +pub(crate) fn is_lookup_only(equation: &str, has_tables: bool) -> bool { + has_tables && is_empty_or_sentinel(equation) +} + +/// An equation string with no functional content: empty/whitespace-only (the +/// canonical lookup-only form) or exactly the MDL lookup sentinel `"0+0"` (a +/// legacy back-compat read-shim -- older serialized models may still carry it). +/// The trimmed comparison mirrors `mdl::writer::is_lookup_only_equation`. +pub(crate) fn is_empty_or_sentinel(equation: &str) -> bool { + let trimmed = equation.trim(); + trimmed.is_empty() || trimmed == crate::mdl::LOOKUP_SENTINEL +} + +/// Whether a whole variable is lookup-only, given its source `datamodel::Equation` +/// and whether it carries graphical functions (`has_tables`). A scalar/A2A +/// variable is lookup-only when its single equation is empty/sentinel; an +/// arrayed (`Equation::Arrayed`) variable is lookup-only when EVERY element +/// equation (and the EXCEPT default, if any) is empty/sentinel -- i.e. it is a +/// pure per-element table holder. Returns `false` when the variable has no +/// equation or no tables. +pub(crate) fn var_is_lookup_only(eqn: Option<&datamodel::Equation>, has_tables: bool) -> bool { + use crate::datamodel::Equation; + if !has_tables { + return false; + } + match eqn { + // Scalar / A2A: one equation string -- the pure `is_lookup_only` atom. + Some(Equation::Scalar(s)) | Some(Equation::ApplyToAll(_, s)) => { + is_lookup_only(s, has_tables) + } + // Arrayed: a pure per-element table holder iff EVERY element equation + // (and the EXCEPT default, if any) is empty/sentinel. `has_tables` is + // already true here, so the per-element check is just the string + // predicate `is_empty_or_sentinel` that the atom is built on. + Some(Equation::Arrayed(_, elements, default, _)) => { + !elements.is_empty() + && elements.iter().all(|(_, e, _, _)| is_empty_or_sentinel(e)) + && default.as_deref().map(is_empty_or_sentinel).unwrap_or(true) + } + None => false, + } +} + +#[cfg(test)] +mod is_lookup_only_tests { + use super::*; + + #[test] + fn is_lookup_only_sentinel_equation_with_tables_is_true() { + // Legacy MDL-sourced lookup-only: `g(
)` serialized with equation + // == LOOKUP_SENTINEL ("0+0") + a graphical function. Still detected as a + // back-compat read-shim even though new imports emit the empty form. + assert!(is_lookup_only(crate::mdl::LOOKUP_SENTINEL, true)); + } + + #[test] + fn is_lookup_only_empty_equation_with_tables_is_true() { + // Canonical lookup-only: a variable with a `` but no equation. + assert!(is_lookup_only("", true)); + // Whitespace-only is equivalent to empty (matches the `.trim()` in + // `mdl::writer::is_lookup_only_equation`). + assert!(is_lookup_only(" ", true)); + } + + #[test] + fn is_lookup_only_with_lookup_real_input_is_false() { + // WITH LOOKUP (`g = WITH LOOKUP(some_input,
)`): tables present + // but a real input expression. Must NOT be treated as lookup-only -- + // it is a genuine value-bearing variable that evaluates `gf(input)`. + assert!(!is_lookup_only("some_input", true)); + } + + #[test] + fn is_lookup_only_ordinary_aux_no_tables_is_false() { + // An ordinary aux: a real equation and no graphical function. + assert!(!is_lookup_only("3 * x + 1", false)); + } + + #[test] + fn is_lookup_only_sentinel_equation_without_tables_is_false() { + // An empty-RHS aux (MdlEquation::EmptyRhs): the sentinel equation but + // NO graphical function. `has_tables == false` keeps it off the + // lookup-only path even though its equation text matches the sentinel. + assert!(!is_lookup_only(crate::mdl::LOOKUP_SENTINEL, false)); + // The empty-equation, no-tables case is likewise not lookup-only. + assert!(!is_lookup_only("", false)); + } +} + fn get_dimensions( dimensions: &[datamodel::Dimension], names: &[DimensionName], @@ -623,7 +729,22 @@ where } } None => { - if errors.is_empty() && !is_initial && !v.can_be_module_input() { + // An empty equation is only an error when the variable has no + // graphical function. A standalone lookup-only table (empty + // equation + a ``) is a valid static table, not an error + // (issue #606): it produces no series and is excluded from the + // runlist downstream. (WITH LOOKUP has a real input equation, so + // it never reaches this empty-equation branch.) + let is_lookup_only_table = matches!( + v, + datamodel::Variable::Aux(datamodel::Aux { gf: Some(_), .. }) + | datamodel::Variable::Flow(datamodel::Flow { gf: Some(_), .. }) + ); + if errors.is_empty() + && !is_initial + && !v.can_be_module_input() + && !is_lookup_only_table + { errors.push(EquationError { start: 0, end: 0, @@ -689,6 +810,10 @@ where }; let (tables, table_errors) = build_tables(&v.gf, &v.equation, dimensions); errors.extend(table_errors); + // A standalone graphical-function holder (a ``/MDL lookup with an + // empty-or-sentinel equation) is a static table, not a value-bearing + // variable: it is excluded from the runlist and produces no series. + let is_table_only = var_is_lookup_only(Some(&v.equation), !tables.is_empty()); Variable::Var { ident, ast, @@ -697,7 +822,7 @@ where units, tables, is_flow: true, - is_table_only: false, + is_table_only, non_negative: v.compat.non_negative, errors, unit_errors, @@ -727,6 +852,10 @@ where }; let (tables, table_errors) = build_tables(&v.gf, &v.equation, dimensions); errors.extend(table_errors); + // A standalone graphical-function holder (a ``/MDL lookup with an + // empty-or-sentinel equation) is a static table, not a value-bearing + // variable: it is excluded from the runlist and produces no series. + let is_table_only = var_is_lookup_only(Some(&v.equation), !tables.is_empty()); Variable::Var { ident, ast, @@ -735,7 +864,7 @@ where units, tables, is_flow: false, - is_table_only: false, + is_table_only, non_negative: false, errors, unit_errors, @@ -784,6 +913,13 @@ pub struct DepClassification { pub previous_only: BTreeSet, /// Idents referenced ONLY inside INIT() or PREVIOUS() -- not outside either. pub init_only: BTreeSet, + /// Standalone lookup tables referenced via `LOOKUP(table, x)`. A table + /// reference is a *layout* reference (codegen needs the table variable's + /// offset for the table-identity reverse-map), NOT a data-flow dependency: + /// it is kept OUT of `all` so it never creates a runlist-ordering or + /// causal/LTM edge, and is reunited with the dependency set only when the + /// fragment compiler builds its metadata + tables map (issue #606). + pub referenced_tables: BTreeSet, } /// Unified AST walker that computes all dependency categories in a single pass. @@ -812,6 +948,7 @@ struct ClassifyVisitor<'a> { previous_referenced: BTreeSet, non_previous: BTreeSet, non_init: BTreeSet, + referenced_tables: BTreeSet, dimensions: &'a [Dimension], module_inputs: Option<&'a BTreeSet>>, in_previous: bool, @@ -918,6 +1055,20 @@ impl ClassifyVisitor<'_> { self.all.insert(Ident::new(id)); } BuiltinContents::Expr(expr) => self.walk(expr), + // A graphical-function table reference is a *layout* + // reference, not a data-flow dependency: record it in + // `referenced_tables` (so the fragment compiler can find + // the table's offset for the reverse-map) WITHOUT adding + // it to `all`, keeping it off the runlist-ordering and + // causal/LTM graphs (issue #606). A *bare* reference to + // such a table is a plain `Var`, not a `LookupTable`, and + // is rejected separately as a compile error. + BuiltinContents::LookupTable(table_expr) => { + if let Expr2::Var(id, _, _) | Expr2::Subscript(id, _, _, _) = table_expr + { + self.referenced_tables.insert(id.to_string()); + } + } }); } }, @@ -979,6 +1130,7 @@ pub fn classify_dependencies( previous_referenced: BTreeSet::new(), non_previous: BTreeSet::new(), non_init: BTreeSet::new(), + referenced_tables: BTreeSet::new(), dimensions, module_inputs, in_previous: false, @@ -1012,6 +1164,7 @@ pub fn classify_dependencies( previous_referenced: visitor.previous_referenced, previous_only, init_only, + referenced_tables: visitor.referenced_tables, } } diff --git a/src/simlin-engine/tests/simulate.rs b/src/simlin-engine/tests/simulate.rs index d913e9116..2d9ada4ff 100644 --- a/src/simlin-engine/tests/simulate.rs +++ b/src/simlin-engine/tests/simulate.rs @@ -1752,39 +1752,22 @@ const EXPECTED_VDF_RESIDUAL: &[&str] = &[ // value on a meaningful magnitude. (#595 tracks the remaining deeper init- // ordering soundness gap; its nondeterminism half is fixed by `e24b0080`.) - // ===== lookup-only tables the model-free reader cannot drop (9) ===== + // ===== lookup-only tables (0): NONE ===== // A bare graphical function is a *table indexed by an explicit input*, not a - // time series. The reader therefore DROPS standalone lookup-only descriptors - // rather than reconstruct a series for them - // (`record_results::standalone_lookup_only_descriptors`); most of C-LEARN's - // lookup-only variables -- `historical_gdp_lookup`, `historical_forestry_lookup`, - // `rs_gdp_in_trillions`, `rs_ch4`, `rs_co2_ff`, the ozone/forcings scalars -- - // are now dropped and gone from this list (their data, where it matters, is - // carried by the consumer variables that call them, which are emitted as - // ordinary owners). The 9 below are lookup-only tables the model-free reader - // cannot safely distinguish from a real owner, so it still emits a ghost - // column the comparator flags: - // - rs_hfc* (8): the descriptor forward-links to the WIDER 2-D consumer - // `RS HFC[COP, HFC type]` (forward width 63 != the descriptor's 7), so the - // conservative width gate declines to drop it. - // - ref_global_emissions_from_graph_lookup: its forward link is Time/0, so - // the forward-link guard declines to drop it. - // The deeper bug is in the ENGINE: a bare lookup is a table, and lowering it - // to `gf(Time)` synthesises a phantom series that the comparator then flags - // against these ghosts. Fixing that (so the engine produces no series for a - // bare lookup) removes them from the matched set entirely -- tracked by #606 - // (the engine `gf(Time)` lowering, introduced for #590). #597 is the general - // reader fix (drop lookup-only descriptors); these nine evade it only because - // the model-free reader can't safely distinguish them from real owners. - "rs_hfc125", - "rs_hfc134a", - "rs_hfc143a", - "rs_hfc152a", - "rs_hfc227ea", - "rs_hfc23", - "rs_hfc245ca", - "rs_hfc32", - "ref_global_emissions_from_graph_lookup", + // time series. The engine no longer synthesises a phantom `gf(Time)` series + // for a standalone lookup-only table: such a table is not a value-bearing + // variable, so it is excluded from the runlist and from the saved output + // (#606 RESOLVED; this was the `gf(Time)` lowering introduced for #590). The + // VDF reader still emits a ghost column for the few descriptors it cannot + // model-free distinguish from a real owner -- the `rs_hfc*` family (whose + // descriptor forward-links to the wider 2-D consumer `RS HFC[COP, HFC type]`) + // and `ref_global_emissions_from_graph_lookup` (forward link Time/0) -- but + // since the engine now produces NO sim series for those idents, the + // comparator simply skips them (a VDF ident with no sim counterpart is not a + // mismatch), so NONE remain in this carve-out. Their data, where it matters, + // is carried by the consumer variables that call them, emitted as ordinary + // owners and matched normally. + // ===== benign-near-zero (2): divergence ONLY on near-zero magnitudes ===== // (cross-simulator f32/integration noise), never on a meaningful value. // Tracked in #591 cluster 2. @@ -1830,14 +1813,13 @@ const EXPECTED_VDF_RESIDUAL: &[&str] = &[ // `DoesNotExist` blockers (incl. the `"goal 1.5 for temperature"` quoted- // period ident, #559) are likewise cleared. // -// What remains is a short, fully-attributed residual of 13 base variables -// (Phases 1-3 reconciled the rest; the reader now DROPS standalone lookup-only -// descriptors -- bare graphical-function tables, not series -- so most of them -// leave the comparison entirely), explicitly excluded via `EXPECTED_VDF_RESIDUAL` -// under a taxonomy: 9 lookup-only tables the model-free reader cannot safely -// distinguish from a real owner and so still emits a ghost column for (the -// engine's `gf(Time)` lowering of a bare lookup is the deeper bug, tracked by #606); -// 2 benign-near-zero; and 2 `:NA:`-arithmetic boundary series (#591). The engine +// What remains is a short, fully-attributed residual of 4 base variables +// (Phases 1-3 reconciled the rest; the reader DROPS standalone lookup-only +// descriptors -- bare graphical-function tables, not series -- and the ENGINE +// now produces no series at all for a lookup-only table (#606 RESOLVED), so +// every lookup-only ident leaves the comparison entirely), explicitly excluded +// via `EXPECTED_VDF_RESIDUAL` under a taxonomy: 2 benign-near-zero; and 2 +// `:NA:`-arithmetic boundary series (#591). The engine // is PROVEN correct on all of them; the engine-genuine and NaN-vs-`:NA:` // categories are empty (the sole engine-genuine divergence, init-runlist // nondeterminism, was fixed in `e24b0080`). The exclusion is a transparent,