engine: reconcile mid-run getValue across VM and wasm (#625) by bpowers · Pull Request #632 · bpowers/simlin

bpowers · 2026-05-23T05:09:55Z

Summary

Closes #625. After a runTo(t) that stops mid-interval, the bytecode VM and the wasm blob left different content in their live curr chunk for non-stock variables, so a mid-run getValue(name) agreed between backends only for stocks + reserved time vars. Verified still live on main (post-#630) and actually worse than the issue described -- the VM returns 0 for constants mid-run:

| variable | kind | VM `getValue` | wasm `getValue` |
| --- | --- | --- | --- |
| `teacup_temperature` | stock | (agree) | (agree) |
| `room_temperature` | constant | 0 | 70 |
| `characteristic_time` | constant | 0 | 10 |
| `heat_loss_to_room` | flow | 0 | 2.43… (one `dt` behind) |

Per the chosen resolution (reconcile, not document), this makes mid-run getValue self-consistent and identical across backends for every variable.

Root cause

Both integration loops break on curr[TIME] > target after the save+advance tail. The advance steps time and integrates stocks but does not recompute flow/aux/constants for the advanced time:

- VM: Euler's wholesale curr.copy_from_slice(next) (and the chunk-ring advance) lands curr on a chunk whose non-stock slots are stale -- 0 for a constant the overshoot chunk never held.
- wasm: emit_save_advance copies only the stock offsets next -> curr, leaving flow/aux one step behind the advanced time + stocks.

So curr's stocks/time are at the resting point but its non-stock slots are not -- internally inconsistent on both, and divergent between them.
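A minimal sketch of the two advance strategies described above, assuming a hypothetical four-slot model (`level` stock, `inflow_rate` constant, `doubled = level * 2` aux) with `dt = 1`. The slot layout, the `Advance` enum, and this `run_to` are illustrative stand-ins rather than the engine's actual code; the fixture is chosen only so the lagged value comes out as 16 where the resting value should be 20.

```rust
// Hypothetical slot layout for a toy model (not the engine's real layout).
const TIME: usize = 0;
const LEVEL: usize = 1; // stock
const INFLOW_RATE: usize = 2; // constant
const DOUBLED: usize = 3; // flow-phase aux: level * 2
const N_SLOTS: usize = 4;

/// Which backend's advance step to imitate.
#[derive(Clone, Copy)]
pub enum Advance {
    /// VM-like: curr becomes a chunk that only ever received time + stocks.
    WholeChunk,
    /// wasm-like: only time + stock offsets are copied next -> curr.
    StocksOnly,
}

/// Recompute the non-stock slots from the time and stocks held in `curr`.
fn eval_flows(curr: &mut [f64; N_SLOTS]) {
    curr[INFLOW_RATE] = 2.0;
    curr[DOUBLED] = curr[LEVEL] * 2.0;
}

/// Euler loop that breaks on curr[TIME] > target *after* the advance,
/// with no post-loop re-evaluation (the pre-fix behavior).
pub fn run_to(target: f64, advance: Advance) -> [f64; N_SLOTS] {
    let dt = 1.0;
    let mut curr = [0.0; N_SLOTS];
    loop {
        eval_flows(&mut curr);
        // Integration writes only time and stocks into `next`.
        let mut next = [0.0; N_SLOTS];
        next[TIME] = curr[TIME] + dt;
        next[LEVEL] = curr[LEVEL] + curr[INFLOW_RATE] * dt;
        // (A real loop would save the `curr` row here.)
        match advance {
            Advance::WholeChunk => curr = next,
            Advance::StocksOnly => {
                curr[TIME] = next[TIME];
                curr[LEVEL] = next[LEVEL];
            }
        }
        if curr[TIME] > target {
            break;
        }
    }
    curr
}
```

With `target = 4.5` both variants rest at time 5 with `level = 10`, but the whole-chunk variant reads 0 for the constant and the aux, while the stocks-only variant still holds `doubled = 16` from the previous step.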

Fix

After the run_to loop, re-evaluate root flows once at the resting curr on both backends (vm.rs reuses the same StepPart::Flows eval the RK paths already use; wasmgen/module.rs emits one flows(0) call after the loop). The live curr chunk becomes fully self-consistent -- "the value at the current time" -- and byte-identical VM-vs-wasm for every variable.

Well-scoped, by construction:

- Saved series unaffected: every results row was already committed; the VM's get_series reads chunks [0, curr_chunk), excluding the resting chunk.
- Full run unaffected: runToEnd reads the last results row, and its break path already leaves a freshly-evaluated curr, so the re-eval is idempotent (a resumed run_to on a full slab stays the #630 no-op).
- Resume unaffected: a resumed run_to re-evaluates from scratch, and the re-eval does not re-snapshot prev_values, so PREVIOUS still sees the last completed step.

Applies uniformly to Euler/RK2/RK4. No TS/host change -- the host just reads curr.
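The effect of the post-loop re-eval can be sketched in isolation, assuming a hypothetical chunk layout `[time, level, inflow_rate, doubled]` with `inflow_rate = 2` and `doubled = level * 2`. `eval_root_flows` and `settle` are illustrative names, not functions from vm.rs or wasmgen/module.rs.

```rust
// Hypothetical chunk layout: [time, level, inflow_rate, doubled].
const LEVEL: usize = 1;
const INFLOW_RATE: usize = 2;
const DOUBLED: usize = 3;

pub type Chunk = [f64; 4];

/// Stand-in for one root flows evaluation: every non-stock slot is
/// recomputed from the time and stocks already in `curr`.
pub fn eval_root_flows(curr: &mut Chunk) {
    curr[INFLOW_RATE] = 2.0;
    curr[DOUBLED] = curr[LEVEL] * 2.0;
}

/// Post-loop fix-up: bring a resting chunk to "the value at the current time".
/// Stocks and time are left untouched.
pub fn settle(mut resting: Chunk) -> Chunk {
    eval_root_flows(&mut resting);
    resting
}
```

Applied to a VM-like resting chunk `[5, 10, 0, 0]` and a wasm-like one `[5, 10, 2, 16]`, both settle to `[5, 10, 2, 20]`, and settling twice changes nothing further in this toy model.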

Tests (TDD)

- wasmgen/module.rs: mid_run_curr_is_self_consistent_and_matches_vm -- a flow-phase aux doubled = level * 2 exposes the one-step lag; asserts every var's mid-run curr equals the VM's get_value_now, and that doubled == level * 2 at the resting time (20, not the lagged 16). Watched fail first (inflow_rate wasm=2 vs vm=0).
- wasm-backend.test.ts: AC2.2 widened from stocks+time to every variable. Watched fail first (constant diff of 10) against the pre-fix wasm, green after the rebuild.

Full pre-commit gate passes: cargo test --workspace (incl. the always-on corpus wasm_parity_hook), pnpm -C src/engine test (412), clippy, tsc, pysimlin.

After a runTo(t) that stops mid-interval, both executors' integration loops break on curr[TIME] > target *after* an advance, so the live curr chunk held the resting-point stocks and time but stale non-stock slots: the VM's wholesale curr.copy_from_slice(next) (and chunk-ring advance) left flow/aux/constants stale -- a mid-run getValue of a constant read 0 -- and the wasm advance copies only the stock offsets, leaving flow/aux one step behind. A mid-run getValue therefore agreed between backends only for stocks and the reserved time vars.

Re-evaluate root flows once at the resting curr after the loop, on both backends, so the live curr chunk is fully self-consistent ("the value at the current time") and identical VM-vs-wasm for every variable.

The re-eval touches only the live curr chunk: every results row was already saved (and the VM's get_series reads chunks [0, curr_chunk), excluding this one), a resumed run_to re-evaluates from scratch, and a full runToEnd reads the last results row -- so saved series, resume, and full-run getValue are unchanged. It does not re-snapshot prev_values, so a resumed run's PREVIOUS still sees the last completed step. The full-run and reset break paths already leave a freshly-evaluated curr, so the re-eval is idempotent there (a resumed run_to on a full slab stays a no-op).

Tests: a Rust mid-run parity + self-consistency test (a flow-phase aux doubled = level*2 exposes the one-step lag), and the TS AC2.2 parity test widened from stocks+time to every variable.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82193345f7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T05:17:26Z

+        // to resume, and to a full run. It does NOT re-snapshot `prev_values`, so a
+        // resume's `PREVIOUS` still sees the last completed step.
+        {
+            let (curr, next) = borrow_two(&mut data, n_slots, self.curr_chunk, self.next_chunk);


Skip post-loop re-eval when curr and next chunks alias

The new unconditional post-loop borrow_two call can panic when run_to(end) is invoked with end beyond the simulation horizon (a supported clamp case): in save_advance!, the chunk-exhaustion branch sets curr_chunk = next_chunk and breaks before advancing next_chunk, so borrow_two is later called with equal indices and slices out of bounds. This turns a previously successful run_to(very_large_time) into a runtime crash in the VM path.

Useful? React with 👍 / 👎.

claude · 2026-05-23T05:18:40Z

Review: mid-run getValue parity (#625)

Verdict: not correct — the new VM post-loop re-eval panics when run_to exits via chunk-ring exhaustion.

[P1] run_to past FINAL_TIME now panics in the VM (borrow_two aliases the same chunk)

src/simlin-engine/src/vm.rs:870-882

The new re-eval block calls borrow_two(&mut data, n_slots, self.curr_chunk, self.next_chunk) unconditionally after the loop. When the loop exits via the chunk-ring exhaustion break in save_advance! (vm.rs:699-705), that branch runs self.curr_chunk = self.next_chunk before break, so it leaves curr_chunk == next_chunk == n_chunks+1. borrow_two with equal indices computes split = (n_chunks+1)*n_slots and then slices left[(n_chunks+1)*n_slots..(n_chunks+2)*n_slots] out of a left of length (n_chunks+1)*n_slots, panicking with "range end index out of range." This is reachable whenever run_to(target) is called with target past FINAL_TIME (e.g. the FFI simlin_sim_run_to forwards time unclamped, and the wasm side's own comment at module.rs:1168-1205 documents "a target past FINAL_TIME" / "interactive scrubbing that stays at the end" as a supported case it clamps to the slab end "exactly like the VM's chunk-ring exhaustion"). Before this commit that path returned Ok gracefully; now the VM aborts while the wasm backend (guarded by saved >= n_chunks and a fixed CURR_BASE) handles it — a new VM/wasm divergence and a panic across the C FFI boundary. Guarding the re-eval on curr_chunk != next_chunk (or otherwise handling the exhausted-slab case) restores the prior behavior.
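The slicing arithmetic above can be reproduced with a small stand-in. This `borrow_two` is reconstructed from the description in this comment (split at the higher chunk, slice the lower chunk out of the left half) and is an assumption, not the function from vm.rs; `guarded_borrow` is a hypothetical helper showing the suggested `curr_chunk != next_chunk` guard.

```rust
/// Borrow two chunks of `n_slots` f64s each from one flat buffer.
/// Reconstructed from the review's description; panics when `a == b`,
/// because the lower chunk is then sliced past the end of `left`.
pub fn borrow_two(
    buf: &mut [f64],
    n_slots: usize,
    a: usize,
    b: usize,
) -> (&mut [f64], &mut [f64]) {
    let (lo, hi) = if a < b { (a, b) } else { (b, a) };
    // Everything before the higher chunk goes to `left`.
    let (left, right) = buf.split_at_mut(hi * n_slots);
    // With lo == hi this range starts at left.len(), so it is out of bounds.
    let lo_chunk = &mut left[lo * n_slots..(lo + 1) * n_slots];
    let hi_chunk = &mut right[..n_slots];
    if a < b {
        (lo_chunk, hi_chunk)
    } else {
        (hi_chunk, lo_chunk)
    }
}

/// Hypothetical guard: skip the borrow (and thus the re-eval) when the
/// chunk-ring exhaustion break left curr and next on the same chunk.
pub fn guarded_borrow(
    buf: &mut [f64],
    n_slots: usize,
    curr_chunk: usize,
    next_chunk: usize,
) -> Option<(&mut [f64], &mut [f64])> {
    if curr_chunk == next_chunk {
        None
    } else {
        Some(borrow_two(buf, n_slots, curr_chunk, next_chunk))
    }
}
```

With a 4-chunk buffer of 3 slots each, `borrow_two(&mut data, 3, 3, 3)` splits at index 9 and then indexes `left[9..12]` on a slice of length 9, which panics; `guarded_borrow` returns `None` for the same arguments.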

Minor: the "idempotent for the full-run/run/reset paths because they break with a freshly-evaluated curr already" note (vm.rs:867-869, module.rs:1252-1253) is imprecise: on a full run_to(stop) the resting curr is the chunk that was only ever written as next (stocks + time only), so its flow/aux slots are stale and the re-eval does change them — it is not a no-op there (only a resumed run_to on a full slab is). The observable series is still fine, but the comment's premise does not hold.

codecov · 2026-05-23T06:29:25Z

Codecov Report

❌ Patch coverage is 70.83333% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.57%. Comparing base (d1b2492) to head (03ea843).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/simlin-engine/src/vm.rs | 22.22% | 7 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #632      +/-   ##
==========================================
- Coverage   83.58%   83.57%   -0.02%     
==========================================
  Files         275      276       +1     
  Lines       75395    75419      +24     
==========================================
+ Hits        63019    63029      +10     
- Misses      12376    12390      +14

Both PR #632 reviewers flagged a panic: the post-loop root-flow re-eval added for #625 called borrow_two(curr_chunk, next_chunk) unconditionally, but run_to(target) with target past FINAL_TIME exits the loop via the chunk-ring exhaustion break in save_advance!, which sets curr_chunk = next_chunk before breaking. borrow_two with two equal chunk indices slices left[a*n_slots..(a+1)*n_slots] out of a left of length a*n_slots and panics -- aborting across the FFI boundary on what was previously a graceful clamp (e.g. an unclamped simlin_sim_run_to).

Guard the re-eval on curr_chunk != next_chunk. The exhausted-slab case is exactly the one a mid-interval read never reaches (a full slab means time has reached FINAL_TIME, not mid-interval), so skipping the re-eval there is correct and restores the pre-#625 graceful behavior. Every normal run_to (mid-interval included) breaks via the curr[TIME] > end path with distinct chunks, so the re-eval still runs and #625 stays fixed.

Also correct the imprecise "idempotent for the full-run path" note: on the VM a full run leaves curr on the next-written overshoot chunk (stale flow/aux), so the re-eval does change it there -- but that chunk is beyond the saved results, so it stays invisible to a full run. The wasm note is tightened too (it does break with a consistent curr, and has no chunk aliasing since curr is the fixed CURR_BASE region).

Adds a VM regression test (run_to 10x past the end returns Ok, not panic). The new test pushed vm.rs over the 6000-line file lint, so the vm_reset_and_run_initials_tests module is extracted to a sibling file included via #[path] (the existing vm_set_value_tests.rs pattern), so use super::* still resolves vm's private items.

The residual VM-vs-wasm divergence for run_to(past FINAL_TIME) (VM overshoots the slab; wasm clamps) is tracked as #634 -- not reachable via in-range scrubbing.

bpowers · 2026-05-23T13:49:47Z

@codex review

chatgpt-codex-connector · 2026-05-23T13:59:31Z

Codex Review: Didn't find any major issues. 🚀

claude · 2026-05-23T14:02:03Z

Code review

I traced both backends' break paths and the getValue read paths. The in-range scrubbing fix (partial run_to(target < stop)) is correct and consistent across VM and wasm, including for PREVIOUS auxes. The test-file split is complete (all 17 original tests preserved + the new panic regression). One issue stands out.

[P2] wasm post-loop `flows(0)` makes `PREVIOUS` auxes stale-by-one after a full run, diverging from the VM and from `getSeries`

src/simlin-engine/src/wasmgen/module.rs:1258-1259

The post-loop flows(0) re-eval is unconditional, so it also runs after the slab-exhaustion break that run_to_end (and any run_to(t >= stop)) takes. In that break curr was not advanced — it is the just-stepped row at t = stop, and prev_values was already snapshotted to that same row (the per-step snapshot in emit_prev_snapshot runs after the step's flows). The re-eval therefore resolves PREVIOUS(x) against curr's own snapshot, yielding x(stop) instead of x(stop - dt). Since the wasm engine's simGetValue reads the live curr (direct-backend.ts:611), getValue of a PREVIOUS-using variable after a full run returns the wrong timestep and diverges from the VM (whose simGetValue reads the last committed results row once run_to_end consumes the VM, simulation.rs:427-430) and from the wasm backend's own getSeries (committed row = x(stop - dt)). The comment's "the re-eval is idempotent there" holds only for non-PREVIOUS slots; for PREVIOUS it is not. This is pre-existing-correct behavior the PR regresses, but only in this combination: a model with a PREVIOUS() variable, the wasm engine, and a getValue read after a full/at-stop run — which is why the teacup-based parity tests (no PREVIOUS aux) and the new mid-interval test don't catch it. Guarding the re-eval on saved < n_chunks (skip on a full slab) would mirror the VM's curr_chunk != next_chunk guard and fix it while still re-evaluating the in-range scrubbing case.

Overall correctness

The patch is correct for its stated target (in-range mid-interval getValue parity) and the chunk-ring panic guard is sound. The one finding above is a narrow but genuine wasm-only divergence for PREVIOUS variables read via getValue after a full run, contradicting the PR's parity goal; I'd consider the patch not fully correct until that case is guarded or explicitly folded into the tracked #634 divergence.

PR #632 review (Claude) found a wasm-only regression: the post-loop flows(0) re-eval added for #625 ran unconditionally, including after a full / at-stop run. A full run breaks via the slab-exhaustion path, which does NOT advance curr -- curr is the just-saved t=stop row, and prev_values was already snapshotted to that same row (the per-step snapshot runs after the step's flows). The re-eval then resolved PREVIOUS(x) against curr's own snapshot, yielding x(stop) instead of x(stop-dt), corrupting the live curr that the host's getValue reads. So getValue of a PREVIOUS-using variable after a full run diverged from the committed series and from the VM (which reads the last results row).

Guard the re-eval on saved < n_chunks: skip it exactly on the break paths that do not advance curr (the full-slab exhaustion break and the resumed full-slab top guard), where curr is already the correct last-saved row. This mirrors the VM's curr_chunk != next_chunk guard ("re-eval only when curr was advanced"). The mid-interval case (saved < n_chunks, curr advanced) still re-evals, and there prev_values holds the genuinely previous step, so PREVIOUS resolves correctly -- #625 stays fixed, including for PREVIOUS auxes. Skipping also makes a resumed run_to on a full slab a strict no-op rather than an idempotent re-eval.

The VM needs no change here: its full run advances curr (it breaks via curr[TIME] > end after the advance, distinct chunks), so its re-eval reads a genuine previous snapshot, and getValue after run_to_end reads the results row regardless.

Tests: a new full-run regression (a PREVIOUS aux's live curr must equal the last committed series row and the VM, = level at t-dt, not t), and the mid-run parity test's fixture gains a PREVIOUS aux so the all-variable mid-interval comparison covers it too.
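A toy model of the wasm-side loop makes the stale-by-one effect and the `saved < n_chunks` guard concrete. Everything here is an illustrative assumption: a three-slot row `[time, level, prev_level]` with `prev_level = PREVIOUS(level)`, `dt = 1`, `level` growing by 2 per step, and a `run_to` that snapshots `prev_values` after each step's flows as the message above describes. It is not the code emitted by wasmgen/module.rs.

```rust
// Hypothetical row layout: [time, level, prev_level].
const TIME: usize = 0;
const LEVEL: usize = 1; // stock, grows by 2 per unit time
const PREV_LEVEL: usize = 2; // aux: PREVIOUS(level)

pub type Row = [f64; 3];

/// Evaluate the flow-phase variables; PREVIOUS reads the snapshot.
fn flows(curr: &mut Row, prev_values: &Row) {
    curr[PREV_LEVEL] = prev_values[LEVEL];
}

/// Returns the live `curr` after the run plus the committed rows.
/// `guarded = true` applies the `saved < n_chunks` check before the
/// post-loop re-eval; `guarded = false` re-evaluates unconditionally.
pub fn run_to(target: f64, n_chunks: usize, guarded: bool) -> (Row, Vec<Row>) {
    let dt = 1.0;
    let mut curr: Row = [0.0; 3];
    let mut prev_values: Row = [0.0; 3];
    let mut saved: Vec<Row> = Vec::new();
    loop {
        flows(&mut curr, &prev_values);
        prev_values = curr; // per-step snapshot, taken after the step's flows
        saved.push(curr);
        if saved.len() >= n_chunks {
            break; // slab full: curr is NOT advanced
        }
        curr[TIME] += dt;
        curr[LEVEL] += 2.0 * dt;
        if curr[TIME] > target {
            break; // mid-interval: curr was advanced past the target
        }
    }
    if !guarded || saved.len() < n_chunks {
        flows(&mut curr, &prev_values); // post-loop re-eval
    }
    (curr, saved)
}
```

In this sketch a full run over a 4-row slab commits `prev_level = 4` in its last row; the unguarded re-eval overwrites the live value with 6 (the level at stop), while the guarded run leaves `curr` equal to the last committed row. A mid-interval `run_to(1.5, ..)` still re-evaluates and resolves `prev_level` to the level one step back.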

bpowers · 2026-05-23T14:13:11Z

@codex review

chatgpt-codex-connector · 2026-05-23T14:20:58Z

Codex Review: Didn't find any major issues. Keep it up!

claude · 2026-05-23T14:24:39Z

Review (Claude)

Reviewed the mid-run getValue reconciliation. I traced both backends' integration-loop break paths and confirmed the two guards are correct.

VM (vm.rs:879): curr_chunk != next_chunk correctly skips the post-loop flow re-eval on the chunk-ring exhaustion break, where save_advance! sets curr_chunk = next_chunk and borrow_two with two equal indices would slice left[a*n..(a+1)*n] out of a left of length a*n and panic. Verified run_to(target > stop) reaches that exhaustion break (aliased chunks) rather than the curr[TIME] > end break — covered by the new run_to_past_final_time_does_not_panic.
wasm (wasmgen/module.rs:1270): saved < n_chunks skips the flows(0) re-eval exactly on the two slab-full break paths (post-save exhaustion + top-of-loop guard), where curr is the just-saved t=stop row and prev_values was already snapshotted to that same row, so a re-eval would resolve PREVIOUS(x) to x(stop) instead of x(stop-dt). The mid-interval path (curr advanced, prev_values = last completed step) still re-evals, so PREVIOUS resolves to x(t-dt) correctly. The guard mirrors the VM's "re-eval only when curr was advanced".

Confirmed full-run getValue parity is unaffected: simlin_sim_run_to_end consumes the VM into results (so the host reads the last results row, not the re-eval'd overshoot chunk), while a mid-run run_to leaves the VM live and reads the re-eval'd resting chunk. The wasm host reads its live curr (the last saved row after a full run, the resting chunk mid-run). The new mid_run_curr_is_self_consistent_and_matches_vm and full_run_previous_aux_curr_matches_series_and_vm tests pin both regimes, including a PREVIOUS aux.
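The two VM read regimes described above can be sketched as a small state enum. `SimState` and `get_value` are hypothetical names used only for illustration; the actual host-side structures in simulation.rs are not shown in this thread.

```rust
/// Hypothetical host-side view of a VM-backed simulation.
pub enum SimState {
    /// A partial run_to left the VM alive: reads come from the resting
    /// (re-evaluated) `curr` chunk.
    Live { curr: Vec<f64> },
    /// run_to_end consumed the VM: reads come from the last committed
    /// row of the flat, row-major results buffer.
    Finished { n_slots: usize, results: Vec<f64> },
}

/// Read the value stored at `offset` according to the current regime.
pub fn get_value(state: &SimState, offset: usize) -> Option<f64> {
    match state {
        SimState::Live { curr } => curr.get(offset).copied(),
        SimState::Finished { n_slots, results } => {
            if *n_slots == 0 || offset >= *n_slots {
                return None;
            }
            // Index of the last complete row, if any row was committed.
            let last_row = (results.len() / *n_slots).checked_sub(1)?;
            results.get(last_row * *n_slots + offset).copied()
        }
    }
}
```

Under this split, the overshoot chunk that the post-loop re-eval touches is only ever observable in the `Live` regime, which is why a full run's reads are unchanged.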

The residual run_to(past FINAL_TIME) VM-vs-wasm divergence (VM overshoots the slab via get_value_now; wasm clamps) is pre-existing for stocks/time and is tracked as #634.

Overall correctness: correct. No blocking issues found.

bpowers mentioned this pull request May 23, 2026

engine: VM vs wasm diverge on getValue after run_to(target) past FINAL_TIME #634

Open

bpowers merged commit 5106a28 into main May 23, 2026
13 checks passed

bpowers deleted the engine-midrun-getvalue-parity branch May 23, 2026 14:26
