Skip to content

Add a selectable WebAssembly simulation engine to @simlin/engine#628

Open
bpowers wants to merge 27 commits into
mainfrom
engine-wasm-sim
Open

Add a selectable WebAssembly simulation engine to @simlin/engine#628
bpowers wants to merge 27 commits into
mainfrom
engine-wasm-sim

Conversation

@bpowers
Copy link
Copy Markdown
Owner

@bpowers bpowers commented May 23, 2026

Summary

Exposes the existing wasmgen WebAssembly backend through @simlin/engine as a selectable simulation engine, alongside the bytecode VM, for fast interactive re-simulation. The bytecode VM remains the default and the correctness oracle throughout.

  • Model.simulate({ engine: 'wasm' }) / Model.run(..., { engine: 'wasm' }) now run a model on a compiled, per-model wasm blob with full VM parity (within the engine's existing tolerances). engine defaults to 'vm', so every existing caller is byte-for-byte unchanged.
  • All vm-vs-wasm branching lives in DirectBackend; the public Sim/Model/Run classes are structurally unchanged. The only EngineBackend change is an optional engine? parameter on simNew. Works under Node and through the browser Web Worker path (one additive engine field on the existing simNew message; no new message types).
  • Unsupported models and enableLtm on wasm are explicit errors with no silent VM fallback; getLinks throws on a wasm sim; an LTM-off Run carries empty links.

Design and structure

Implemented in 4 phases per docs/design-plans/2026-05-22-engine-wasm-sim.md:

  1. Resumable run ABI (Rust, simlin-engine). The wasm blob gains run_to(f64) and run_initials() exports alongside the unchanged run/set_value/reset/clear_values/memory/n_slots/n_chunks/results_offset; the per-step run cursor moves into mutable wasm globals so a run is resumable; reset clears the cursor while preserving constant overrides; run is re-expressed as reset(); run_to(stop). Purely additive (FFI signature and WasmLayout wire format unchanged).
  2. Node core (TS, @simlin/engine). New functional-core internal/wasmgen.ts (FFI compile wrapper + pure layout parser + strided series read) and internal/canonicalize.ts (Rust-faithful name canonicalizer); DirectBackend demuxes every sim op on the engine.
  3. Browser / worker (TS). The engine field is threaded through the worker simNew message; a WorkerBackend-driven wasm sim matches a node DirectBackend and the VM.
  4. Node benchmark (TS). A RUN_BENCH-gated VM-vs-wasm eval benchmark over fishbanks / WORLD3 / C-LEARN.

Benchmark (AC9.1, eval-only, compile excluded)

model wasm/VM eval speedup
fishbanks 5.4x
WORLD3 7.5x
C-LEARN 9.4x

(Per the project's "no stale benchmark data" rule the harness is committed but no results file; numbers above are from a local run and reproduce via RUN_BENCH=1 pnpm -C src/engine exec jest backend-bench.)

Test plan

All 25 acceptance criteria (engine-wasm-sim.AC1.* through AC9.*) have automated coverage; the bytecode VM is driven identically and compared as the oracle.

  • cargo test -p simlin-engine -- blob/VM-parity unit tests + always-on corpus wasm_parity_hook
  • cargo test -p simlin --test wasm -- resumable exports across the libsimlin FFI compile path
  • pnpm -C src/engine test -- 397 pass (wasm-backend, wasm-model, worker-wasm, wasmgen, canonicalize, bench-stats; gated benchmark skipped)
  • pnpm -C src/engine exec tsc --noEmit -- clean
  • Heavy real-model parity twins (gated): cargo test -p simlin-engine --features file_io --release -- --ignored simulates_wrld3_03_wasm simulates_clearn_wasm
  • Human verification plan: docs/test-plans/2026-05-22-engine-wasm-sim.md

Follow-ups (filed, out of scope here)

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d16d7b5e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/engine/src/direct-backend.ts Outdated
Comment on lines +501 to +506
if (entry.projectHandle !== undefined) {
this._projectChildren.get(entry.projectHandle)?.delete(handle as number);
}
simlin_sim_unref(entry.ptr);
// A wasm sim has no native sim pointer; dropping the entry lets the
// WebAssembly.Instance be GC'd. Only the VM path holds a native sim to unref.
if (entry.engine !== 'wasm') {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Remove disposed wasm sims from the handle map

Disposed wasm sims are never removed from _handles, so each HandleEntry keeps strong references to wasmInstance, wasmExports, and layout data indefinitely. In workloads that create/dispose many sims (e.g., interactive re-simulation loops), memory will grow unbounded even though simDispose was called, because GC cannot reclaim those instances while they remain reachable via the map.

Useful? React with 👍 / 👎.

Comment thread src/engine/src/direct-backend.ts Outdated
Comment on lines +567 to +569
if (entry.engine === 'wasm') {
// nChunks is the saved-row count == the VM's results.step_count.
return entry.wasmLayout!.nChunks;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return step count from wasm run state, not static layout

simGetStepCount on wasm always returns layout.nChunks, which is the total capacity, not the number of completed steps. This reports a fully-complete run even immediately after creation or after simReset, so callers that use step count for progress/completion logic will get incorrect results on wasm while VM-backed sims only expose completed results.

Useful? React with 👍 / 👎.

@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented May 23, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26351d5f39

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/engine/src/direct-backend.ts Outdated
// One Float64Array(nChunks) read strided from the blob's results region.
// Read memory.buffer fresh per call (uniform with the singleton helpers).
const slot = this.wasmSlot(entry.wasmLayout!, name);
return readStridedSeries(entry.wasmExports!.memory.buffer, entry.wasmLayout!, slot);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Limit wasm series reads to completed steps

When engine === 'wasm', simGetSeries always returns nChunks rows by calling readStridedSeries(...), but a partially run sim has fewer committed rows (simGetStepCount uses saved_steps). This means getSeries() after runTo(t) can include uncommitted/stale tail values (especially after reset, which does not clear the results slab), unlike the VM path which truncates by step count. Interactive or incremental consumers will read incorrect time-series data unless the wasm path slices to completed steps.

Useful? React with 👍 / 👎.

bpowers added 26 commits May 22, 2026 19:43
Promote the wasmgen blob's per-step cursor from function locals to mutable
wasm globals (G_SAVED/G_STEP_ACCUM/G_DID_INITIALS) so a run can be advanced
incrementally across separate exported calls. This adds two exports to the
blob -- run_to(f64)->() and run_initials()->() -- mirroring the VM's
run_to/run_initials, and re-expresses run()->() as `reset(); run_to(stop)`
so there is exactly one stepping-loop implementation shared by run and
run_to.

The chosen path is the mandated delegation (run -> reset + run_to), not the
fallback: it satisfies the linchpin invariant (run() produces a full from-t0
simulation on every call to a reused instance) for free, because reset clears
the cursor globals and re-arms use_prev_fallback, so the now-idempotent
run_initials (guarded by G_DID_INITIALS) re-seeds the reserved time slots and
re-runs initials. emit_save_advance reads/writes the saved-row cursor from the
globals; emit_reset clears the cursor while deliberately leaving the constants
override region untouched (overrides survive reset, like the VM). The new
exports append after the original four (run/set_value/reset/clear_values), so
the export-set growth is purely additive and the FFI is unchanged.

A triple-agreement parity test (run vs run_initials+run_to(stop) vs
Vm::run_to_end) plus the existing wasm_parity_hook corpus catch any faithless
re-expression. The blob keeps its working curr/next chunks separate from the
results region, so a partial run_to(t) writes exactly the committed
save-cadence rows (the VM's chunk-ring leaks its working chunk into the
exported slab as one overshoot row -- a known layout difference the test
accounts for by reading the live curr chunk for AC2.2).
Add engine-level VM-parity tests for the resumable run_to driver: a segmented
run_to(t1); run_to(t2) reproduces a single run_to(t2) and the VM driven through
the same segments (the cursor globals survive across calls and resume exactly);
the committed save-cadence row count after run_to(t) matches the VM's for t on
and between save points; and run_to(stop*2) clamps to the slab end (equals
run_to(stop) and Vm::run_to_end, exactly n_chunks rows).

These exercise Task 1's driver only -- no production change. The count test
derives the VM's committed-row count analytically rather than by inspecting its
slab, because the VM's chunk-ring exports its working (overshoot) chunk as one
extra populated row; the blob keeps curr/next separate, so its results region
holds exactly the committed rows. AC2.2's "value after run_to(t)" is verified
by reading the blob's live curr chunk and matching the VM's get_value_now (both
land one step past an on-grid target). Three helpers (read_curr_slot,
vm_slab_segmented, live_saved_rows) support these tests.
Add two engine-level VM-parity tests, both driving a SINGLE instantiated module
across calls (the blob-level demonstration of instance reuse, AC5.4): reset then
re-run reproduces the compiled-default series and equals the VM with a reset
between two runs (AC3.1); and reset preserves a constant override set via
set_value -- both the override-applied series and the level-reaches-25 pin
survive the reset, matching the VM (which likewise does not clear overrides on
reset) (AC3.2).

These lock in Task 1's emit_reset extension (clear the cursor globals, keep the
constants region) -- no production change.
Add two engine-level VM-parity tests for set_value semantics. The mid-run test
drives one instance through run_initials; run_to(t1); set_value(const, v2);
run_to(stop) and matches the VM driven identically; it pins AC5.3's "affects
only steps after the cursor" by showing rows up to the override-application
point (one step past t1, since set_value runs after run_to(t1) advanced the
cursor) match a no-override baseline while later rows diverge. The rejection
test asserts the blob's set_value returns nonzero for a stock and a computed
flow and zero for an overridable constant -- the blob-level peer of the VM's
BadOverride rejection (AC5.2), cross-checked against the VM.

No production change: emit_set_value already rejects non-overridable offsets and
overridable constants are re-read from the const region each step, so a mid-run
override already affects only later steps. AC5.1 (full-run override) stays
covered by compile_simulation_set_value_override_matches_vm.
Subcomponent A added run_initials/run_to to the wasmgen blob and made
reset cursor-clearing while preserving constant overrides. This adds an
FFI-level parity test confirming those resumable exports survive the
simlin_model_compile_to_wasm path: the compiled-via-FFI blob is driven
run_initials -> run_to(t1) -> set_value(inflow_rate) -> run_to(stop) and
its level series is checked against a bytecode-VM oracle driven
identically through simlin_sim_run_to / simlin_sim_set_value /
simlin_sim_run_to_end. A reset-then-run on the same instance confirms
the override survives reset (the override-applied defaults reproduce),
the FFI peer of simlin_sim_reset. The blob is also asserted to still
carry every original export at its original kind, so the export-set
growth is purely additive; the FFI signature is unchanged (test-only).
Extend the two #[ignore]d whole-model wasm twins so each, in addition to
its existing single-run parity check, drives the same blob through a
two-segment run_to (split at the midpoint save time) and asserts the
segmented final series equals the single-run series -- which is already
gated against the model's oracle (the VM for WORLD3, Ref.vdf for
C-LEARN). This exercises the resumable cursor (run_initials then per-target
run_to) on real models at scale, where the small TestProject fixtures in
the engine-internal tests cannot.

Comparing the FULL final series (both segments driven through to stop) is
exact regardless of where the cursor paused mid-run, so the boundary need
only be a meaningful interior time. midpoint_save_time derives it from the
compiled sim's specs (start + (n_chunks/2) * max(save_step, dt), mirroring
Specs::from's effective cadence) rather than hardcoding, so the split lands
on a genuine save point for whatever time axis the model declares.

The segmented pass is deliberately NOT added to the per-corpus
wasm_parity_hook: that would double the wasm work across the whole corpus
and risk the 3-minute debug-test budget. Only these two #[ignore]d
release-only twins get it. wasm_results_for_segmented mirrors
wasm_results_for, reusing run_wasm_results_segmented (now pub).
Phase 1 of the @simlin/engine wasm simulation backend added two blob
exports (run_to, run_initials) and promoted the per-step run cursor
(saved/step_accum/did_initials) into internal mutable wasm globals so a
run survives across separate exported calls. The authoritative module
map still listed only the original run/set_value/reset/clear_values
exports and described run as a one-shot driver, so it understated the
resumable ABI the libsimlin map already documents for its FFI test.
Adds internal/wasmgen.ts: the WasmLayout/WasmBlobExports types, the
simlin_model_compile_to_wasm FFI wrapper (the two-out-buffer + error-ptr
shape mirroring model.ts callBufferReturningFn), and two pure functions
parseWasmLayout (decode the little-endian layout wire format) and
readStridedSeries (single-Float64Array strided f64 read of one variable's
series out of the blob's linear memory). The pure functions take plain
buffers rather than a live WebAssembly.Instance so they are unit-tested in
isolation against hand-built byte buffers. Registers the module in the
internal barrel.
Adds internal/canonicalize.ts: a pure canonicalizeIdent reproducing Rust
simlin-engine/src/common.rs canonicalize exactly, so a raw caller name
resolves to the same canonical key the wasm WasmLayout (and the VM's
get_var_names) uses. It ports the quote-aware IdentifierPartIterator
splitting, the quoted-inner-period -> U+2024 sentinel vs unquoted-period ->
U+00B7 module-separator distinction, the \\ -> \ unescape, the
whitespace-run + literal \n/\r escape collapse to a single underscore,
and lowercasing.

This is a deliberate engine-local copy rather than reusing the incomplete
@simlin/core canonicalize, which lacks dot/quote handling and is shared by
consumers whose behavior must not shift mid-feature; the two should be
unified later. The output was differential-checked byte-for-byte against
the Rust oracle across a broad input corpus.
Add an optional engine selector to EngineBackend.simNew ('vm' default,
'wasm' new) plus a shared SimEngine type. The wasm path compiles the model
to a self-contained WebAssembly blob via simlin_model_compile_to_wasm,
parses its layout, captures the model stop time, and instantiates the blob
import-free, storing the instance/layout/exports/stop-time on the handle
entry so the blob is owned exactly once and GC'd with the entry.

enableLtm is rejected up front for the wasm engine (it emits no LTM
instrumentation) before any compile work, and an unsupported model surfaces
the compile error with no VM fallback. Disposal (simDispose, projectDispose)
skips the native simlin_sim_unref for wasm entries, which carry no native
sim pointer. The engine param is optional so WorkerBackend (untouched until
Phase 3) still satisfies the widened interface. Per-op vm/wasm demux of the
sim operations themselves lands in the next commit.
Each sim op now fetches the handle entry and branches on entry.engine. The
'vm' path is unchanged; the 'wasm' path drives the blob's exports: run_to for
runTo/runToEnd (run_to(stopTime) for the latter), reset, set_value (throwing
on a nonzero rc to mirror the VM's constants-only BadOverride), a strided
single-Float64Array series read, nChunks for the step count, a base-0
curr-chunk DataView read for getValue/getTime, and a 569Xhlprefix-filtered,
code-point-sorted getVarNames. getLinks throws on the wasm engine (LTM is
VM-only). Names resolve through the Rust-faithful canonicalizeIdent.

getVarNames sorts by Unicode code point, not the default JS UTF-16 order, so
non-BMP names match Rust's byte-order sort. The reserved time/dt/initial_time/
final_time names are kept (the VM filters only $-prefixed internal vars).

VM-vs-wasm parity is asserted through DirectBackend with the VM as oracle.
getValue mid-run is the byte-identical base-0 curr-chunk read on both sides,
but they agree only on the integrated state (stocks + the reserved time vars)
mid-run: the VM's Euler loop breaks before evaluating the chunk whose time
exceeds the target (vm.rs:711), leaving its curr chunk's flow/aux/constant
slots stale, whereas the wasm blob fully evaluates its stopping step. After a
full run both agree on every variable. Segmented/clamped runs and all by-name
reads are compared via the fully-evaluated series and the integrated-state
getValue accordingly.
Sim.create gains an optional engine parameter (default 'vm') stored as a
private field and forwarded to backend.simNew; Model.simulate and Model.run
gain an optional engine option that forwards it, preserving run's
analyzeLtm->enableLtm naming. The enableLtm+wasm rejection stays authoritative
in DirectBackend.simNew, so the facade only forwards.

getRun now fetches LTM link scores only when LTM is enabled and the sim is not
wasm-backed (a wasm sim's getLinks throws), instead of fetching them
unconditionally. This makes Model.run({engine:'wasm'}) resolve to a Run with
empty links, and is invisible to the VM path: no consumer reads Run.links for
an LTM-off run, and the populated-but-scoreless links a VM teacup run produced
were never used.
The wasm simulation backend's name pipeline (canonicalizeIdent ->
wasmLayout.varOffsets lookup -> readStridedSeries) was only exercised
end-to-end on scalar teacup, so the non-scalar case it was written for
went untested. Add a statically-arrayed fixture (a static dimension, not
a dynamic view range, so it is wasm-supported) and assert that a raw,
mixed-case array-element name like `source[Boston]` resolves to the same
series as the VM through getSeries/getVarNames -- the keys contain the
bracketed element separators that scalar names never produce.

Also close the facade loop for two paths previously covered only at the
DirectBackend level: model.simulate({enableLtm:true, engine:'wasm'})
must reject up front (the facade forwards the option to the backend's
authoritative rejection), and Sim.reset()'s reset+reapply-overrides path
on a wasm sim must reproduce the override-applied result, matching a VM
Sim driven identically. The VM is the oracle throughout, compared within
1e-9.
Run prettier --write over the files this branch created or modified
(wasmgen.ts, canonicalize.ts, backend.ts, model.ts, wasmgen.test.ts).
The pre-commit hook does not run prettier --check, so these reflows
slipped the gate. Pure formatting: line joining/wrapping to the 120-col
printWidth, no semantic changes.
The wasm simulation demux already lives inside the Worker because
WorkerServer wraps a DirectBackend; the only missing piece for a
browser caller to select engine:'wasm' was the postMessage protocol.
Add an optional engine field to the simNew request variant (reusing
the exported SimEngine type), forward request.engine in the server's
simNew case, and accept the engine arg in WorkerBackend.simNew.

Purely additive: an absent engine produces the byte-identical message
shape as before, so the VM path and every existing caller are
unchanged. VALID_REQUEST_TYPES/isValidRequest are field-agnostic and
untouched, and no response shape changes (the handle still returns
inside result).
Drive engine:'wasm' end-to-end through the real postMessage protocol
via the in-memory loopback (a real WorkerBackend wired to a real
WorkerServer, which wraps a DirectBackend). A node DirectBackend is the
oracle: the worker-driven wasm series equals the DirectBackend wasm
series exactly and the VM series within the engine's parity tolerance
(wasm is not bit-identical to the VM's libm by design).

Beyond parity, the tests pin the Phase 3 contract that closes the
deferred Minor #4 (a browser caller silently getting a VM sim): the
enableLtm-on-wasm rejection and a wasm-unsupported model both reject
across the worker boundary with no silent VM fallback, and the same
unsupported model still runs on the VM through the same worker. They
also assert the served simNew request carries engine:'wasm', that an
absent engine reproduces the prior message shape, that no new message
type appears on the wire, and that getSeries still ships its
Float64Array zero-copy (exactly one one-element transfer on the
response leg).
A side-effect-free statistic + warmup/measure policy shared by the node
VM-vs-wasm benchmark. Both the timed body and the wall clock are injected,
so the median statistic and the discard-warmup / adaptive-budget iteration
loop are deterministically unit-tested without touching the clock or any
model. Mirrors the Stat/bench pair of src/simlin-engine/examples/backend_bench.rs,
adding the explicit warmup phase AC9.1 requires; runTimedAsync is the async
twin so the benchmark's timed region is single-sourced and tested.
A gated jest benchmark that times simulation (eval) only -- the wasm blob
compile/instantiate happens once in untimed setup and result extraction is
excluded -- through the public Model.simulate({engine}) API, mirroring
backend_bench.rs's eval-vs-eval methodology with the explicit warmup AC9.1
requires. reset() runs inside the measured body but before the clock so only
runToEnd() is timed. A pre-timing cross-check runs both engines to completion
and compares every variable's full series via the pure seriesClose predicate,
using the engine's corpus-wide VM-vs-wasm parity tolerance (ensure_results:
2e-3 absolute / near-zero guard) rather than the teacup-only 1e-9 -- a large
model run to its final time accumulates benign floating-point reassociation
noise (C-LEARN ~2.5e-9 on an O(0.1) value) far inside that bound.

The heavy run is RUN_BENCH-gated so the default jest stays within the
few-seconds-per-test budget; only the harness is committed, with results
reported in-chat per the no-stale-benchmark-data rule.
… tests

The pure cross-check comparator seriesClose flagged a divergence only when
Math.abs(e - a) > PARITY_ABS_TOL. A NaN on one side made that NaN > tol, which
is false, so a NaN-vs-finite (or NaN-vs-NaN) disagreement was silently reported
as a match. This is the opposite of the Rust oracle it ports
(test_helpers.rs::ensure_results), where approx_eq!(NaN, anything) is false and
NaN is never around-zero, so any NaN panics the comparison. Because the wasm
backend can legitimately emit genuine NaN (out-of-bounds vector reads, empty
array reducers), a NaN-on-one-engine divergence is exactly the broken run the
cross-check exists to reject, and the gap was latent for any future model. Now
a non-finite value on either side at an index is a mismatch, checked before the
tolerance comparison.

Also tightens the two budget-stop harness tests to pin the exact deterministic
iteration count (4, derived from the fakeClock(20) read sequence) instead of a
loose >= 2 && < 100, so an off-by-one in the budget guard would be caught; and
documents defaultNow() as the module's single deliberate impurity -- a
default-injection seam analogous to the FCIS logger exception, with all pure
logic taking now as a parameter so the core stays deterministically testable.
The 4-phase @simlin/engine WebAssembly simulation backend added a
'wasm' alternative to the bytecode VM, selectable via
Model.simulate/run({ engine }), with a per-op DirectBackend demux, two
new internal modules (wasmgen.ts, canonicalize.ts), an additive worker
simNew engine field, and a RUN_BENCH-gated node VM-vs-wasm eval
benchmark. Capture that contract surface and its restrictions (LTM/wasm
rejected, getLinks throws, no VM fallback, wasm Run carries empty links)
in src/engine/CLAUDE.md, and drop the forbidden 'Last verified' line
while editing the file.
simGetStepCount on a wasm sim returned the layout's nChunks unconditionally,
which is the results-slab capacity, not the number of completed steps. A fresh
or just-reset wasm sim therefore reported a fully-complete run though no rows
had been saved, violating the documented "steps completed" contract and
diverging from the VM (whose count only becomes nonzero once Results exist).

The blob already tracks the live saved-row count in the internal mutable global
G_SAVED (0 before any run / after reset, n_chunks after a full run). Export it
additively as `saved_steps` and read it in the wasm branch of simGetStepCount.
The mutable-global export is standard wasm and still validates under the DLR-FT
interpreter and survives wasm-opt (--enable-mutable-globals). Parity after a
full run is preserved: G_SAVED == n_chunks == the VM's count.
A disposed wasm sim entry was marked `disposed` but left in the `_handles` map
with its WebAssembly.Instance, exports, and decoded layout still attached. The
old comment claimed dropping the entry let the instance be GC'd, but nothing
dropped it -- so each create/dispose cycle pinned a whole instance + layout
indefinitely and memory grew unbounded even though simDispose had been called.

The entry must stay in the map: existing tests rely on the disposed tombstone
to surface a clear "has been disposed" error on use-after-dispose, and getEntry
checks `disposed` before touching any wasm field, so nulling the heavy refs is
safe. Add a releaseWasmSimState helper that clears the three heavy refs and call
it from both simDispose and the projectDispose child-sim cascade; wasmStopTime
is a plain number and is left in place.
The wasm sim's getSeries always read nChunks rows -- the full results-slab
capacity -- regardless of how many steps had actually run. The slab keeps
its capacity across a partial runTo(t) and across reset (reset clears the
run cursor but does not zero the slab), so a partially run or just-reset
wasm sim surfaced uncommitted/stale tail rows. The VM path truncates by the
completed-step count (its get_series returns only saved rows mid-run and the
FFI further bounds the read by the passed count), so the wasm twin violated
VM parity and the getSeries().length == getStepCount() contract precisely in
the interactive-scrubbing scenario the wasm backend exists for.

readStridedSeries now takes a completed-step count (defaulting to and clamped
at nChunks so the pure reader never strides past the results region), and
simGetSeries passes the live saved_steps count on both engines.
@bpowers bpowers force-pushed the engine-wasm-sim branch from 26351d5 to e7cbd33 Compare May 23, 2026 02:45
@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented May 23, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e7cbd337fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +570 to +573
if (entry.engine === 'wasm') {
// Phase-1 reset: clears the run cursor, preserves constant overrides.
entry.wasmExports!.reset();
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reinitialize wasm curr state when resetting a sim

simReset on the wasm engine only calls reset(), which clears cursor globals but does not repopulate the live curr chunk that simGetTime/simGetValue read. After runTo/runToEnd, a reset() followed by getTime() (or any getValue()) can therefore return stale values from the previous run instead of the model’s initial state, violating the Sim.reset() contract and diverging from VM behavior until another run_to(...) call happens.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 23, 2026

Automated review

I reviewed the introduced code paths in depth — the Rust resumable run ABI (wasmgen/module.rs), the new TS modules (internal/wasmgen.ts, internal/canonicalize.ts), the DirectBackend vm/wasm demux, the Model/Sim facade threading, and the worker protocol plumbing.

I did not find any blocking bugs. Notes from the areas I scrutinized most closely:

  • Resumable ABI (module.rs). The cursor migration from run's function-locals (L_SAVED/L_STEP_ACCUM) to mutable globals (G_SAVED/G_STEP_ACCUM/G_DID_INITIALS) is consistent: run_fn_index_of is used identically at emit time and assemble time so the delegation targets can't drift, L_DST stays at index 2 via the filler i32 so the shared per-step emitters keep working, and the did_initials idempotency guard plus run = reset; run_to(stop) delegation preserve the "full from-t0 run on every run" invariant. New exports are appended after the original four, keeping indices stable.
  • canonicalize.ts is a faithful port of common.rs::canonicalize / replace_whitespace_with_underscore / IdentifierPartIterator; the UTF-16-unit iteration reconstructs the same strings as Rust's char iteration for non-BMP input.
  • compareByCodePoint matches Rust's str sort (UTF-8 byte order equals code-point order for valid Unicode), and the simGetVarNames $-prefix filter matches libsimlin's is_internal_var.
  • DirectBackend demux. Every sim op branches on engine before touching entry.ptr, and a wasm entry's ptr === 0 is never handed to a native FFI call (including both the simDispose and project-dispose paths). copyFromWasm returns an exact-size buffer, so wasm.buffer as ArrayBuffer for WebAssembly.Module is sound; wasmStopTime mirrors Model.timeSpec()'s endTime ?? 10.
  • Sim.getRun. The ltmEnabled && _engine !== 'wasm' link gate is backed by a real getter; the empty-links-for-wasm/LTM-off behavior is the documented engine: LTM-off Sim.getRun() now returns Run.links=[] instead of populated-scoreless (AC1.3 deviation) #626 follow-up, not an accidental regression.
  • WasmLayout.var_offsets is built from the same sim.offsets the VM resolves names against, so by-name reads (including time) stay consistent across engines.

The parity test surface (the always-on Rust corpus wasm_parity_hook, the segmented run_to twins, and the TS wasm-backend/wasm-model/worker-wasm suites) covers the cases I would otherwise have wanted to verify by hand.

Overall correctness verdict: correct

No blocking issues found. The change is additive (defaults to engine: 'vm', existing callers unchanged), the wasm path is held to VM parity by tests, and the unsupported/LTM/getLinks restrictions are enforced with explicit errors and no silent fallback.

The wasm sim's reset only called the blob's reset(), which clears the run
cursor and preserves constant overrides but leaves the previous run's values
in the live curr chunk (linear-memory base 0) that simGetTime/simGetValue
read. So a runToEnd followed by reset then getTime()/getValue() returned the
stale end-of-run values instead of the fresh pre-run state, diverging from the
VM until the next run_to repopulated curr.

The blob's reset faithfully mirrors Vm::reset(), which likewise leaves stale
values and defers re-init to the next run. The VM stack still presents a fresh
state after reset because libsimlin recreates a zeroed VM; the host is the
right place to present that contract. simReset on the wasm path now zeros the
live curr chunk (the nSlots f64 at base 0) after reset(), so every by-name
read and getTime() return 0 -- matching a freshly-created sim and the VM --
rather than leaking the prior run's tail.
@bpowers
Copy link
Copy Markdown
Owner Author

bpowers commented May 23, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2fb132d2ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +623 to +624
const rc = entry.wasmExports!.set_value(slot, value);
if (rc !== 0) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mirror constant overrides into current wasm state

After simSetValue on a wasm sim, this path only calls the blob export and returns. The blob’s set_value mutates the constants region used by future evaluation, but simGetValue reads from the live current chunk, so getValue() can return stale data (e.g., immediately after Model.simulate({overrides}, {engine:'wasm'}) or right after a manual setValue before advancing time). The VM updates current state on override, so this introduces observable VM/wasm API divergence for interactive reads.

Useful? React with 👍 / 👎.

@claude
Copy link
Copy Markdown

claude Bot commented May 23, 2026

Code review

Reviewed the full diff with the bytecode VM treated as the correctness oracle (matching the PR's own design). Focus areas: the Rust wasmgen/module.rs resumable-run refactor, the new TS functional cores (internal/wasmgen.ts, internal/canonicalize.ts), the DirectBackend engine demux, and the worker threading.

Findings

No blocking issues found.

Notes from the review (all checked out as correct, not findings):

  • Resource lifecycle: every native-sim FFI call (run_to/reset/get_value/get_series/unref, etc.) sits behind the entry.engine === 'wasm' early-return, and both simlin_sim_unref sites (direct-backend.ts:298, :529) are guarded — so a wasm sim's ptr === 0 is never passed to libsimlin, and disposal releases the heavy WebAssembly.Instance/layout while keeping the diagnostic tombstone.
  • wasm.buffer as ArrayBuffer (direct-backend.ts): sound, since copyFromWasm returns a freshly-allocated, exactly-sized, byteOffset-0 Uint8Array, so the backing buffer matches the blob bytes precisely.
  • canonicalizeIdent: faithful port of common.rs::canonicalize — quote-aware split, literal-period sentinel (U+2024) vs module separator (U+00B7), non-overlapping \\\ collapse, and whitespace-run handling all match; the UTF-16-unit iteration in replaceWhitespaceWithUnderscore is byte-equivalent to Rust's code-point iteration for the relevant (all-BMP) characters.
  • Rust cursor refactor: moving saved/step_accum/did_initials into mutable globals keeps L_DST/L_SAVED_TIME/L_RK_S at indices 2/3/4 so the shared per-step emitters are unaffected; run re-expressed as reset; run_to(stop) keeps a single stepping loop. Covered by the always-on corpus parity hook plus the new two-segment run_to parity twins on WORLD3/C-LEARN.
  • simReset curr-chunk zeroing: validated against the VM fresh-state per variable (wasm-backend.test.ts:557), not asserted as a bare 0.
  • Known VM-vs-wasm deviations (mid-run_to getValue for non-stock vars; LTM-off Run.links = []) are explicitly tracked as out-of-scope follow-ups (engine: mid-run getValue diverges between VM and wasm for non-stock vars (live curr-chunk overshoot handling) #625, engine: LTM-off Sim.getRun() now returns Run.links=[] instead of populated-scoreless (AC1.3 deviation) #626).

Overall correctness verdict: correct

The change is additive (FFI signature and WasmLayout wire format unchanged; engine defaults to 'vm'), error semantics are explicit with no silent VM fallback, and the wasm path is held to VM parity by tests at the blob, FFI, DirectBackend, Model/Sim, and Web Worker layers. I did not find a defect that would break existing code/tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant