Add a selectable WebAssembly simulation engine to @simlin/engine by bpowers · Pull Request #628 · bpowers/simlin

bpowers · 2026-05-23T00:39:02Z

Summary

Exposes the existing wasmgen WebAssembly backend through @simlin/engine as a selectable simulation engine, alongside the bytecode VM, for fast interactive re-simulation. The bytecode VM remains the default and the correctness oracle throughout.

Model.simulate({ engine: 'wasm' }) / Model.run(..., { engine: 'wasm' }) now run a model on a compiled, per-model wasm blob with full VM parity (within the engine's existing tolerances). engine defaults to 'vm', so every existing caller is byte-for-byte unchanged.
All vm-vs-wasm branching lives in DirectBackend; the public Sim/Model/Run classes are structurally unchanged. The only EngineBackend change is an optional engine? parameter on simNew. Works under Node and through the browser Web Worker path (one additive engine field on the existing simNew message; no new message types).
Unsupported models and enableLtm on wasm are explicit errors with no silent VM fallback; getLinks throws on a wasm sim; an LTM-off Run carries empty links.

Design and structure

Implemented in 4 phases per docs/design-plans/2026-05-22-engine-wasm-sim.md:

Resumable run ABI (Rust, simlin-engine). The wasm blob gains run_to(f64) and run_initials() exports alongside the unchanged run/set_value/reset/clear_values/memory/n_slots/n_chunks/results_offset; the per-step run cursor moves into mutable wasm globals so a run is resumable; reset clears the cursor while preserving constant overrides; run is re-expressed as reset(); run_to(stop). Purely additive (FFI signature and WasmLayout wire format unchanged).
Node core (TS, @simlin/engine). New functional-core internal/wasmgen.ts (FFI compile wrapper + pure layout parser + strided series read) and internal/canonicalize.ts (Rust-faithful name canonicalizer); DirectBackend demuxes every sim op on the engine.
Browser / worker (TS). The engine field is threaded through the worker simNew message; a WorkerBackend-driven wasm sim matches a node DirectBackend and the VM.
Node benchmark (TS). A RUN_BENCH-gated VM-vs-wasm eval benchmark over fishbanks / WORLD3 / C-LEARN.

Benchmark (AC9.1, eval-only, compile excluded)

model	wasm/VM eval speedup
fishbanks	5.4x
WORLD3	7.5x
C-LEARN	9.4x

(Per the project's "no stale benchmark data" rule the harness is committed but no results file; numbers above are from a local run and reproduce via RUN_BENCH=1 pnpm -C src/engine exec jest backend-bench.)

Test plan

All 25 acceptance criteria (engine-wasm-sim.AC1.* through AC9.*) have automated coverage; the bytecode VM is driven identically and compared as the oracle.

cargo test -p simlin-engine -- blob/VM-parity unit tests + always-on corpus wasm_parity_hook
cargo test -p simlin --test wasm -- resumable exports across the libsimlin FFI compile path
pnpm -C src/engine test -- 397 pass (wasm-backend, wasm-model, worker-wasm, wasmgen, canonicalize, bench-stats; gated benchmark skipped)
pnpm -C src/engine exec tsc --noEmit -- clean
Heavy real-model parity twins (gated): cargo test -p simlin-engine --features file_io --release -- --ignored simulates_wrld3_03_wasm simulates_clearn_wasm
Human verification plan: docs/test-plans/2026-05-22-engine-wasm-sim.md

Follow-ups (filed, out of scope here)

ts: two divergent canonicalize implementations (@simlin/core incomplete vs @simlin/engine Rust-faithful) #624 -- unify the engine-local Rust-faithful canonicalizer with the incomplete @simlin/core one
engine: mid-run getValue diverges between VM and wasm for non-stock vars (live curr-chunk overshoot handling) #625 -- mid-run_to getValue diverges for non-stock vars (VM vs wasm overshoot-chunk handling); by design, documented
engine: LTM-off Sim.getRun() now returns Run.links=[] instead of populated-scoreless (AC1.3 deviation) #626 -- LTM-off Run.links is now [] (AC1.3-wording deviation); consumer-invisible, by design
build/CI: no prettier --check gate; formatting drift accumulating on main #627 -- add a prettier --check gate (pre-existing formatting drift on main)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d16d7b5e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T00:44:02Z

    if (entry.projectHandle !== undefined) {
      this._projectChildren.get(entry.projectHandle)?.delete(handle as number);
    }
-    simlin_sim_unref(entry.ptr);
+    // A wasm sim has no native sim pointer; dropping the entry lets the
+    // WebAssembly.Instance be GC'd. Only the VM path holds a native sim to unref.
+    if (entry.engine !== 'wasm') {


Remove disposed wasm sims from the handle map

Disposed wasm sims are never removed from _handles, so each HandleEntry keeps strong references to wasmInstance, wasmExports, and layout data indefinitely. In workloads that create/dispose many sims (e.g., interactive re-simulation loops), memory will grow unbounded even though simDispose was called, because GC cannot reclaim those instances while they remain reachable via the map.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-23T00:44:02Z

+    if (entry.engine === 'wasm') {
+      // nChunks is the saved-row count == the VM's results.step_count.
+      return entry.wasmLayout!.nChunks;


Return step count from wasm run state, not static layout

simGetStepCount on wasm always returns layout.nChunks, which is the total capacity, not the number of completed steps. This reports a fully-complete run even immediately after creation or after simReset, so callers that use step count for progress/completion logic will get incorrect results on wasm while VM-backed sims only expose completed results.

Useful? React with 👍 / 👎.

bpowers · 2026-05-23T02:21:50Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 26351d5f39

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T02:26:02Z

+      // One Float64Array(nChunks) read strided from the blob's results region.
+      // Read memory.buffer fresh per call (uniform with the singleton helpers).
+      const slot = this.wasmSlot(entry.wasmLayout!, name);
+      return readStridedSeries(entry.wasmExports!.memory.buffer, entry.wasmLayout!, slot);


Limit wasm series reads to completed steps

When engine === 'wasm', simGetSeries always returns nChunks rows by calling readStridedSeries(...), but a partially run sim has fewer committed rows (simGetStepCount uses saved_steps). This means getSeries() after runTo(t) can include uncommitted/stale tail values (especially after reset, which does not clear the results slab), unlike the VM path which truncates by step count. Interactive or incremental consumers will read incorrect time-series data unless the wasm path slices to completed steps.

Useful? React with 👍 / 👎.

Promote the wasmgen blob's per-step cursor from function locals to mutable wasm globals (G_SAVED/G_STEP_ACCUM/G_DID_INITIALS) so a run can be advanced incrementally across separate exported calls. This adds two exports to the blob -- run_to(f64)->() and run_initials()->() -- mirroring the VM's run_to/run_initials, and re-expresses run()->() as `reset(); run_to(stop)` so there is exactly one stepping-loop implementation shared by run and run_to. The chosen path is the mandated delegation (run -> reset + run_to), not the fallback: it satisfies the linchpin invariant (run() produces a full from-t0 simulation on every call to a reused instance) for free, because reset clears the cursor globals and re-arms use_prev_fallback, so the now-idempotent run_initials (guarded by G_DID_INITIALS) re-seeds the reserved time slots and re-runs initials. emit_save_advance reads/writes the saved-row cursor from the globals; emit_reset clears the cursor while deliberately leaving the constants override region untouched (overrides survive reset, like the VM). The new exports append after the original four (run/set_value/reset/clear_values), so the export-set growth is purely additive and the FFI is unchanged. A triple-agreement parity test (run vs run_initials+run_to(stop) vs Vm::run_to_end) plus the existing wasm_parity_hook corpus catch any faithless re-expression. The blob keeps its working curr/next chunks separate from the results region, so a partial run_to(t) writes exactly the committed save-cadence rows (the VM's chunk-ring leaks its working chunk into the exported slab as one overshoot row -- a known layout difference the test accounts for by reading the live curr chunk for AC2.2).

Add engine-level VM-parity tests for the resumable run_to driver: a segmented run_to(t1); run_to(t2) reproduces a single run_to(t2) and the VM driven through the same segments (the cursor globals survive across calls and resume exactly); the committed save-cadence row count after run_to(t) matches the VM's for t on and between save points; and run_to(stop*2) clamps to the slab end (equals run_to(stop) and Vm::run_to_end, exactly n_chunks rows). These exercise Task 1's driver only -- no production change. The count test derives the VM's committed-row count analytically rather than by inspecting its slab, because the VM's chunk-ring exports its working (overshoot) chunk as one extra populated row; the blob keeps curr/next separate, so its results region holds exactly the committed rows. AC2.2's "value after run_to(t)" is verified by reading the blob's live curr chunk and matching the VM's get_value_now (both land one step past an on-grid target). Three helpers (read_curr_slot, vm_slab_segmented, live_saved_rows) support these tests.

Add two engine-level VM-parity tests, both driving a SINGLE instantiated module across calls (the blob-level demonstration of instance reuse, AC5.4): reset then re-run reproduces the compiled-default series and equals the VM with a reset between two runs (AC3.1); and reset preserves a constant override set via set_value -- both the override-applied series and the level-reaches-25 pin survive the reset, matching the VM (which likewise does not clear overrides on reset) (AC3.2). These lock in Task 1's emit_reset extension (clear the cursor globals, keep the constants region) -- no production change.

Add two engine-level VM-parity tests for set_value semantics. The mid-run test drives one instance through run_initials; run_to(t1); set_value(const, v2); run_to(stop) and matches the VM driven identically; it pins AC5.3's "affects only steps after the cursor" by showing rows up to the override-application point (one step past t1, since set_value runs after run_to(t1) advanced the cursor) match a no-override baseline while later rows diverge. The rejection test asserts the blob's set_value returns nonzero for a stock and a computed flow and zero for an overridable constant -- the blob-level peer of the VM's BadOverride rejection (AC5.2), cross-checked against the VM. No production change: emit_set_value already rejects non-overridable offsets and overridable constants are re-read from the const region each step, so a mid-run override already affects only later steps. AC5.1 (full-run override) stays covered by compile_simulation_set_value_override_matches_vm.

Subcomponent A added run_initials/run_to to the wasmgen blob and made reset cursor-clearing while preserving constant overrides. This adds an FFI-level parity test confirming those resumable exports survive the simlin_model_compile_to_wasm path: the compiled-via-FFI blob is driven run_initials -> run_to(t1) -> set_value(inflow_rate) -> run_to(stop) and its level series is checked against a bytecode-VM oracle driven identically through simlin_sim_run_to / simlin_sim_set_value / simlin_sim_run_to_end. A reset-then-run on the same instance confirms the override survives reset (the override-applied defaults reproduce), the FFI peer of simlin_sim_reset. The blob is also asserted to still carry every original export at its original kind, so the export-set growth is purely additive; the FFI signature is unchanged (test-only).

Extend the two #[ignore]d whole-model wasm twins so each, in addition to its existing single-run parity check, drives the same blob through a two-segment run_to (split at the midpoint save time) and asserts the segmented final series equals the single-run series -- which is already gated against the model's oracle (the VM for WORLD3, Ref.vdf for C-LEARN). This exercises the resumable cursor (run_initials then per-target run_to) on real models at scale, where the small TestProject fixtures in the engine-internal tests cannot. Comparing the FULL final series (both segments driven through to stop) is exact regardless of where the cursor paused mid-run, so the boundary need only be a meaningful interior time. midpoint_save_time derives it from the compiled sim's specs (start + (n_chunks/2) * max(save_step, dt), mirroring Specs::from's effective cadence) rather than hardcoding, so the split lands on a genuine save point for whatever time axis the model declares. The segmented pass is deliberately NOT added to the per-corpus wasm_parity_hook: that would double the wasm work across the whole corpus and risk the 3-minute debug-test budget. Only these two #[ignore]d release-only twins get it. wasm_results_for_segmented mirrors wasm_results_for, reusing run_wasm_results_segmented (now pub).

Phase 1 of the @simlin/engine wasm simulation backend added two blob exports (run_to, run_initials) and promoted the per-step run cursor (saved/step_accum/did_initials) into internal mutable wasm globals so a run survives across separate exported calls. The authoritative module map still listed only the original run/set_value/reset/clear_values exports and described run as a one-shot driver, so it understated the resumable ABI the libsimlin map already documents for its FFI test.

Adds internal/wasmgen.ts: the WasmLayout/WasmBlobExports types, the simlin_model_compile_to_wasm FFI wrapper (the two-out-buffer + error-ptr shape mirroring model.ts callBufferReturningFn), and two pure functions parseWasmLayout (decode the little-endian layout wire format) and readStridedSeries (single-Float64Array strided f64 read of one variable's series out of the blob's linear memory). The pure functions take plain buffers rather than a live WebAssembly.Instance so they are unit-tested in isolation against hand-built byte buffers. Registers the module in the internal barrel.

Adds internal/canonicalize.ts: a pure canonicalizeIdent reproducing Rust simlin-engine/src/common.rs canonicalize exactly, so a raw caller name resolves to the same canonical key the wasm WasmLayout (and the VM's get_var_names) uses. It ports the quote-aware IdentifierPartIterator splitting, the quoted-inner-period -> U+2024 sentinel vs unquoted-period -> U+00B7 module-separator distinction, the \\ -> \ unescape, the whitespace-run + literal \n/\r escape collapse to a single underscore, and lowercasing. This is a deliberate engine-local copy rather than reusing the incomplete @simlin/core canonicalize, which lacks dot/quote handling and is shared by consumers whose behavior must not shift mid-feature; the two should be unified later. The output was differential-checked byte-for-byte against the Rust oracle across a broad input corpus.

Add an optional engine selector to EngineBackend.simNew ('vm' default, 'wasm' new) plus a shared SimEngine type. The wasm path compiles the model to a self-contained WebAssembly blob via simlin_model_compile_to_wasm, parses its layout, captures the model stop time, and instantiates the blob import-free, storing the instance/layout/exports/stop-time on the handle entry so the blob is owned exactly once and GC'd with the entry. enableLtm is rejected up front for the wasm engine (it emits no LTM instrumentation) before any compile work, and an unsupported model surfaces the compile error with no VM fallback. Disposal (simDispose, projectDispose) skips the native simlin_sim_unref for wasm entries, which carry no native sim pointer. The engine param is optional so WorkerBackend (untouched until Phase 3) still satisfies the widened interface. Per-op vm/wasm demux of the sim operations themselves lands in the next commit.

Each sim op now fetches the handle entry and branches on entry.engine. The 'vm' path is unchanged; the 'wasm' path drives the blob's exports: run_to for runTo/runToEnd (run_to(stopTime) for the latter), reset, set_value (throwing on a nonzero rc to mirror the VM's constants-only BadOverride), a strided single-Float64Array series read, nChunks for the step count, a base-0 curr-chunk DataView read for getValue/getTime, and a 569Xhlprefix-filtered, code-point-sorted getVarNames. getLinks throws on the wasm engine (LTM is VM-only). Names resolve through the Rust-faithful canonicalizeIdent. getVarNames sorts by Unicode code point, not the default JS UTF-16 order, so non-BMP names match Rust's byte-order sort. The reserved time/dt/initial_time/ final_time names are kept (the VM filters only $-prefixed internal vars). VM-vs-wasm parity is asserted through DirectBackend with the VM as oracle. getValue mid-run is the byte-identical base-0 curr-chunk read on both sides, but they agree only on the integrated state (stocks + the reserved time vars) mid-run: the VM's Euler loop breaks before evaluating the chunk whose time exceeds the target (vm.rs:711), leaving its curr chunk's flow/aux/constant slots stale, whereas the wasm blob fully evaluates its stopping step. After a full run both agree on every variable. Segmented/clamped runs and all by-name reads are compared via the fully-evaluated series and the integrated-state getValue accordingly.

Sim.create gains an optional engine parameter (default 'vm') stored as a private field and forwarded to backend.simNew; Model.simulate and Model.run gain an optional engine option that forwards it, preserving run's analyzeLtm->enableLtm naming. The enableLtm+wasm rejection stays authoritative in DirectBackend.simNew, so the facade only forwards. getRun now fetches LTM link scores only when LTM is enabled and the sim is not wasm-backed (a wasm sim's getLinks throws), instead of fetching them unconditionally. This makes Model.run({engine:'wasm'}) resolve to a Run with empty links, and is invisible to the VM path: no consumer reads Run.links for an LTM-off run, and the populated-but-scoreless links a VM teacup run produced were never used.

The wasm simulation backend's name pipeline (canonicalizeIdent -> wasmLayout.varOffsets lookup -> readStridedSeries) was only exercised end-to-end on scalar teacup, so the non-scalar case it was written for went untested. Add a statically-arrayed fixture (a static dimension, not a dynamic view range, so it is wasm-supported) and assert that a raw, mixed-case array-element name like `source[Boston]` resolves to the same series as the VM through getSeries/getVarNames -- the keys contain the bracketed element separators that scalar names never produce. Also close the facade loop for two paths previously covered only at the DirectBackend level: model.simulate({enableLtm:true, engine:'wasm'}) must reject up front (the facade forwards the option to the backend's authoritative rejection), and Sim.reset()'s reset+reapply-overrides path on a wasm sim must reproduce the override-applied result, matching a VM Sim driven identically. The VM is the oracle throughout, compared within 1e-9.

Run prettier --write over the files this branch created or modified (wasmgen.ts, canonicalize.ts, backend.ts, model.ts, wasmgen.test.ts). The pre-commit hook does not run prettier --check, so these reflows slipped the gate. Pure formatting: line joining/wrapping to the 120-col printWidth, no semantic changes.

The wasm simulation demux already lives inside the Worker because WorkerServer wraps a DirectBackend; the only missing piece for a browser caller to select engine:'wasm' was the postMessage protocol. Add an optional engine field to the simNew request variant (reusing the exported SimEngine type), forward request.engine in the server's simNew case, and accept the engine arg in WorkerBackend.simNew. Purely additive: an absent engine produces the byte-identical message shape as before, so the VM path and every existing caller are unchanged. VALID_REQUEST_TYPES/isValidRequest are field-agnostic and untouched, and no response shape changes (the handle still returns inside result).

Drive engine:'wasm' end-to-end through the real postMessage protocol via the in-memory loopback (a real WorkerBackend wired to a real WorkerServer, which wraps a DirectBackend). A node DirectBackend is the oracle: the worker-driven wasm series equals the DirectBackend wasm series exactly and the VM series within the engine's parity tolerance (wasm is not bit-identical to the VM's libm by design). Beyond parity, the tests pin the Phase 3 contract that closes the deferred Minor #4 (a browser caller silently getting a VM sim): the enableLtm-on-wasm rejection and a wasm-unsupported model both reject across the worker boundary with no silent VM fallback, and the same unsupported model still runs on the VM through the same worker. They also assert the served simNew request carries engine:'wasm', that an absent engine reproduces the prior message shape, that no new message type appears on the wire, and that getSeries still ships its Float64Array zero-copy (exactly one one-element transfer on the response leg).

A side-effect-free statistic + warmup/measure policy shared by the node VM-vs-wasm benchmark. Both the timed body and the wall clock are injected, so the median statistic and the discard-warmup / adaptive-budget iteration loop are deterministically unit-tested without touching the clock or any model. Mirrors the Stat/bench pair of src/simlin-engine/examples/backend_bench.rs, adding the explicit warmup phase AC9.1 requires; runTimedAsync is the async twin so the benchmark's timed region is single-sourced and tested.

A gated jest benchmark that times simulation (eval) only -- the wasm blob compile/instantiate happens once in untimed setup and result extraction is excluded -- through the public Model.simulate({engine}) API, mirroring backend_bench.rs's eval-vs-eval methodology with the explicit warmup AC9.1 requires. reset() runs inside the measured body but before the clock so only runToEnd() is timed. A pre-timing cross-check runs both engines to completion and compares every variable's full series via the pure seriesClose predicate, using the engine's corpus-wide VM-vs-wasm parity tolerance (ensure_results: 2e-3 absolute / near-zero guard) rather than the teacup-only 1e-9 -- a large model run to its final time accumulates benign floating-point reassociation noise (C-LEARN ~2.5e-9 on an O(0.1) value) far inside that bound. The heavy run is RUN_BENCH-gated so the default jest stays within the few-seconds-per-test budget; only the harness is committed, with results reported in-chat per the no-stale-benchmark-data rule.

… tests The pure cross-check comparator seriesClose flagged a divergence only when Math.abs(e - a) > PARITY_ABS_TOL. A NaN on one side made that NaN > tol, which is false, so a NaN-vs-finite (or NaN-vs-NaN) disagreement was silently reported as a match. This is the opposite of the Rust oracle it ports (test_helpers.rs::ensure_results), where approx_eq!(NaN, anything) is false and NaN is never around-zero, so any NaN panics the comparison. Because the wasm backend can legitimately emit genuine NaN (out-of-bounds vector reads, empty array reducers), a NaN-on-one-engine divergence is exactly the broken run the cross-check exists to reject, and the gap was latent for any future model. Now a non-finite value on either side at an index is a mismatch, checked before the tolerance comparison. Also tightens the two budget-stop harness tests to pin the exact deterministic iteration count (4, derived from the fakeClock(20) read sequence) instead of a loose >= 2 && < 100, so an off-by-one in the budget guard would be caught; and documents defaultNow() as the module's single deliberate impurity -- a default-injection seam analogous to the FCIS logger exception, with all pure logic taking now as a parameter so the core stays deterministically testable.

The 4-phase @simlin/engine WebAssembly simulation backend added a 'wasm' alternative to the bytecode VM, selectable via Model.simulate/run({ engine }), with a per-op DirectBackend demux, two new internal modules (wasmgen.ts, canonicalize.ts), an additive worker simNew engine field, and a RUN_BENCH-gated node VM-vs-wasm eval benchmark. Capture that contract surface and its restrictions (LTM/wasm rejected, getLinks throws, no VM fallback, wasm Run carries empty links) in src/engine/CLAUDE.md, and drop the forbidden 'Last verified' line while editing the file.

…ments

simGetStepCount on a wasm sim returned the layout's nChunks unconditionally, which is the results-slab capacity, not the number of completed steps. A fresh or just-reset wasm sim therefore reported a fully-complete run though no rows had been saved, violating the documented "steps completed" contract and diverging from the VM (whose count only becomes nonzero once Results exist). The blob already tracks the live saved-row count in the internal mutable global G_SAVED (0 before any run / after reset, n_chunks after a full run). Export it additively as `saved_steps` and read it in the wasm branch of simGetStepCount. The mutable-global export is standard wasm and still validates under the DLR-FT interpreter and survives wasm-opt (--enable-mutable-globals). Parity after a full run is preserved: G_SAVED == n_chunks == the VM's count.

A disposed wasm sim entry was marked `disposed` but left in the `_handles` map with its WebAssembly.Instance, exports, and decoded layout still attached. The old comment claimed dropping the entry let the instance be GC'd, but nothing dropped it -- so each create/dispose cycle pinned a whole instance + layout indefinitely and memory grew unbounded even though simDispose had been called. The entry must stay in the map: existing tests rely on the disposed tombstone to surface a clear "has been disposed" error on use-after-dispose, and getEntry checks `disposed` before touching any wasm field, so nulling the heavy refs is safe. Add a releaseWasmSimState helper that clears the three heavy refs and call it from both simDispose and the projectDispose child-sim cascade; wasmStopTime is a plain number and is left in place.

The wasm sim's getSeries always read nChunks rows -- the full results-slab capacity -- regardless of how many steps had actually run. The slab keeps its capacity across a partial runTo(t) and across reset (reset clears the run cursor but does not zero the slab), so a partially run or just-reset wasm sim surfaced uncommitted/stale tail rows. The VM path truncates by the completed-step count (its get_series returns only saved rows mid-run and the FFI further bounds the read by the passed count), so the wasm twin violated VM parity and the getSeries().length == getStepCount() contract precisely in the interactive-scrubbing scenario the wasm backend exists for. readStridedSeries now takes a completed-step count (defaulting to and clamped at nChunks so the pure reader never strides past the results region), and simGetSeries passes the live saved_steps count on both engines.

bpowers · 2026-05-23T02:45:46Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e7cbd337fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T02:50:22Z

+    if (entry.engine === 'wasm') {
+      // Phase-1 reset: clears the run cursor, preserves constant overrides.
+      entry.wasmExports!.reset();
+      return;


Reinitialize wasm curr state when resetting a sim

simReset on the wasm engine only calls reset(), which clears cursor globals but does not repopulate the live curr chunk that simGetTime/simGetValue read. After runTo/runToEnd, a reset() followed by getTime() (or any getValue()) can therefore return stale values from the previous run instead of the model’s initial state, violating the Sim.reset() contract and diverging from VM behavior until another run_to(...) call happens.

Useful? React with 👍 / 👎.

claude · 2026-05-23T02:55:21Z

Automated review

I reviewed the introduced code paths in depth — the Rust resumable run ABI (wasmgen/module.rs), the new TS modules (internal/wasmgen.ts, internal/canonicalize.ts), the DirectBackend vm/wasm demux, the Model/Sim facade threading, and the worker protocol plumbing.

I did not find any blocking bugs. Notes from the areas I scrutinized most closely:

Resumable ABI (module.rs). The cursor migration from run's function-locals (L_SAVED/L_STEP_ACCUM) to mutable globals (G_SAVED/G_STEP_ACCUM/G_DID_INITIALS) is consistent: run_fn_index_of is used identically at emit time and assemble time so the delegation targets can't drift, L_DST stays at index 2 via the filler i32 so the shared per-step emitters keep working, and the did_initials idempotency guard plus run = reset; run_to(stop) delegation preserve the "full from-t0 run on every run" invariant. New exports are appended after the original four, keeping indices stable.
canonicalize.ts is a faithful port of common.rs::canonicalize / replace_whitespace_with_underscore / IdentifierPartIterator; the UTF-16-unit iteration reconstructs the same strings as Rust's char iteration for non-BMP input.
compareByCodePoint matches Rust's str sort (UTF-8 byte order equals code-point order for valid Unicode), and the simGetVarNames $-prefix filter matches libsimlin's is_internal_var.
DirectBackend demux. Every sim op branches on engine before touching entry.ptr, and a wasm entry's ptr === 0 is never handed to a native FFI call (including both the simDispose and project-dispose paths). copyFromWasm returns an exact-size buffer, so wasm.buffer as ArrayBuffer for WebAssembly.Module is sound; wasmStopTime mirrors Model.timeSpec()'s endTime ?? 10.
Sim.getRun. The ltmEnabled && _engine !== 'wasm' link gate is backed by a real getter; the empty-links-for-wasm/LTM-off behavior is the documented engine: LTM-off Sim.getRun() now returns Run.links=[] instead of populated-scoreless (AC1.3 deviation) #626 follow-up, not an accidental regression.
WasmLayout.var_offsets is built from the same sim.offsets the VM resolves names against, so by-name reads (including time) stay consistent across engines.

The parity test surface (the always-on Rust corpus wasm_parity_hook, the segmented run_to twins, and the TS wasm-backend/wasm-model/worker-wasm suites) covers the cases I would otherwise have wanted to verify by hand.

Overall correctness verdict: correct

No blocking issues found. The change is additive (defaults to engine: 'vm', existing callers unchanged), the wasm path is held to VM parity by tests, and the unsupported/LTM/getLinks restrictions are enforced with explicit errors and no silent fallback.

The wasm sim's reset only called the blob's reset(), which clears the run cursor and preserves constant overrides but leaves the previous run's values in the live curr chunk (linear-memory base 0) that simGetTime/simGetValue read. So a runToEnd followed by reset then getTime()/getValue() returned the stale end-of-run values instead of the fresh pre-run state, diverging from the VM until the next run_to repopulated curr. The blob's reset faithfully mirrors Vm::reset(), which likewise leaves stale values and defers re-init to the next run. The VM stack still presents a fresh state after reset because libsimlin recreates a zeroed VM; the host is the right place to present that contract. simReset on the wasm path now zeros the live curr chunk (the nSlots f64 at base 0) after reset(), so every by-name read and getTime() return 0 -- matching a freshly-created sim and the VM -- rather than leaking the prior run's tail.

bpowers · 2026-05-23T03:27:26Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2fb132d2ed

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-23T03:33:17Z

+      const rc = entry.wasmExports!.set_value(slot, value);
+      if (rc !== 0) {


Mirror constant overrides into current wasm state

After simSetValue on a wasm sim, this path only calls the blob export and returns. The blob’s set_value mutates the constants region used by future evaluation, but simGetValue reads from the live current chunk, so getValue() can return stale data (e.g., immediately after Model.simulate({overrides}, {engine:'wasm'}) or right after a manual setValue before advancing time). The VM updates current state on override, so this introduces observable VM/wasm API divergence for interactive reads.

Useful? React with 👍 / 👎.

claude · 2026-05-23T03:34:45Z

Code review

Reviewed the full diff with the bytecode VM treated as the correctness oracle (matching the PR's own design). Focus areas: the Rust wasmgen/module.rs resumable-run refactor, the new TS functional cores (internal/wasmgen.ts, internal/canonicalize.ts), the DirectBackend engine demux, and the worker threading.

Findings

No blocking issues found.

Notes from the review (all checked out as correct, not findings):

Resource lifecycle: every native-sim FFI call (run_to/reset/get_value/get_series/unref, etc.) sits behind the entry.engine === 'wasm' early-return, and both simlin_sim_unref sites (direct-backend.ts:298, :529) are guarded — so a wasm sim's ptr === 0 is never passed to libsimlin, and disposal releases the heavy WebAssembly.Instance/layout while keeping the diagnostic tombstone.
wasm.buffer as ArrayBuffer (direct-backend.ts): sound, since copyFromWasm returns a freshly-allocated, exactly-sized, byteOffset-0 Uint8Array, so the backing buffer matches the blob bytes precisely.
canonicalizeIdent: faithful port of common.rs::canonicalize — quote-aware split, literal-period sentinel (U+2024) vs module separator (U+00B7), non-overlapping \\→\ collapse, and whitespace-run handling all match; the UTF-16-unit iteration in replaceWhitespaceWithUnderscore is byte-equivalent to Rust's code-point iteration for the relevant (all-BMP) characters.
Rust cursor refactor: moving saved/step_accum/did_initials into mutable globals keeps L_DST/L_SAVED_TIME/L_RK_S at indices 2/3/4 so the shared per-step emitters are unaffected; run re-expressed as reset; run_to(stop) keeps a single stepping loop. Covered by the always-on corpus parity hook plus the new two-segment run_to parity twins on WORLD3/C-LEARN.
simReset curr-chunk zeroing: validated against the VM fresh-state per variable (wasm-backend.test.ts:557), not asserted as a bare 0.
Known VM-vs-wasm deviations (mid-run_to getValue for non-stock vars; LTM-off Run.links = []) are explicitly tracked as out-of-scope follow-ups (engine: mid-run getValue diverges between VM and wasm for non-stock vars (live curr-chunk overshoot handling) #625, engine: LTM-off Sim.getRun() now returns Run.links=[] instead of populated-scoreless (AC1.3 deviation) #626).

Overall correctness verdict: correct

The change is additive (FFI signature and WasmLayout wire format unchanged; engine defaults to 'vm'), error semantics are explicit with no silent VM fallback, and the wasm path is held to VM parity by tests at the blob, FFI, DirectBackend, Model/Sim, and Web Worker layers. I did not find a defect that would break existing code/tests.

chatgpt-codex-connector Bot reviewed May 23, 2026

View reviewed changes

bpowers added 26 commits May 22, 2026 19:43

doc: document the node VM-vs-wasm eval benchmark

57f0cfb

doc: add test plan for @simlin/engine wasm simulation backend

e1de1c4

doc: add @simlin/engine wasm sim implementation plan and test require…

adcd6cd

…ments

bpowers force-pushed the engine-wasm-sim branch from 26351d5 to e7cbd33 Compare May 23, 2026 02:45

chatgpt-codex-connector Bot reviewed May 23, 2026

View reviewed changes

		const rc = entry.wasmExports!.set_value(slot, value);
		if (rc !== 0) {

Conversation

bpowers commented May 23, 2026

Summary

Design and structure

Benchmark (AC9.1, eval-only, compile excluded)

Test plan

Follow-ups (filed, out of scope here)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

bpowers commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

bpowers commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 23, 2026

Automated review

Overall correctness verdict: correct

Uh oh!

bpowers commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 23, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 23, 2026

Code review

Findings

Overall correctness verdict: correct

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant