Merged
28 commits
5f9580a
engine: add resumable run_to/run_initials ABI to wasmgen blob
bpowers May 22, 2026
3c7476e
engine: parity-test segmented and clamped wasm run_to vs VM
bpowers May 22, 2026
53f2eec
engine: parity-test wasm reset (defaults + override-preserving) vs VM
bpowers May 22, 2026
eda3883
engine: parity-test mid-run wasm set_value and non-constant rejection
bpowers May 22, 2026
4c63026
libsimlin: exercise resumable wasm exports across the FFI compile path
bpowers May 22, 2026
6685568
engine: segmented run_to parity for WORLD3 and C-LEARN wasm twins
bpowers May 22, 2026
e6c3322
doc: note resumable wasm run ABI in simlin-engine module map
bpowers May 22, 2026
c04395f
engine: add wasmgen FFI wrapper + pure layout parser and strided read
bpowers May 22, 2026
f246456
engine: add Rust-faithful canonicalize for wasm name resolution
bpowers May 22, 2026
6cd2414
engine: wasm-engine sim creation and disposal in DirectBackend
bpowers May 22, 2026
45b31e7
engine: per-op vm/wasm demux in DirectBackend with VM parity
bpowers May 22, 2026
7e0d9af
engine: thread engine selection through Model/Sim and gate getRun links
bpowers May 22, 2026
b35173f
engine: add end-to-end wasm name-resolution and facade parity tests
bpowers May 22, 2026
c2373a6
engine: format Phase 2 wasm-backend source and test files
bpowers May 22, 2026
0388270
engine: thread optional engine field through the worker simNew message
bpowers May 22, 2026
3effbdf
engine: parity-test the wasm engine through the Web Worker path
bpowers May 22, 2026
b801c6f
engine: pure benchmark harness (median + adaptive warmup/measure)
bpowers May 22, 2026
12fd4e8
engine: node VM-vs-wasm eval benchmark for fishbanks/WORLD3/C-LEARN
bpowers May 22, 2026
57f0cfb
doc: document the node VM-vs-wasm eval benchmark
bpowers May 22, 2026
296d3ee
engine: harden benchmark seriesClose NaN handling and tighten harness…
bpowers May 22, 2026
378ec1d
doc: document wasm engine selection in @simlin/engine
bpowers May 23, 2026
e1de1c4
doc: add test plan for @simlin/engine wasm simulation backend
bpowers May 23, 2026
adcd6cd
doc: add @simlin/engine wasm sim implementation plan and test require…
bpowers May 23, 2026
6ba5ab9
engine: report completed-step count for wasm sims via saved-steps export
bpowers May 23, 2026
c69539b
engine: release wasm sim instance refs on dispose to bound memory
bpowers May 23, 2026
e7cbd33
engine: truncate wasm getSeries to completed steps
bpowers May 23, 2026
2fb132d
engine: restore fresh curr state on wasm sim reset
bpowers May 23, 2026
90aea2d
engine: reflect wasm setValue override in live curr state
bpowers May 23, 2026
1 change: 1 addition & 0 deletions docs/README.md
@@ -34,6 +34,7 @@
- [design-plans/2026-05-22-engine-wasm-sim.md](design-plans/2026-05-22-engine-wasm-sim.md) -- Integrate the wasm backend into `@simlin/engine` as a selectable engine (`Model.simulate({engine:'wasm'})`): vm-vs-wasm demux below the `Sim` facade in `DirectBackend`, a resumable blob run ABI for `runTo`, and a node VM-vs-wasm benchmark; 4 phases
- [plans/](plans/README.md) -- Implementation plans (active and completed)
- [test-plans/](test-plans/) -- Human verification plans for completed features
- [test-plans/2026-05-22-engine-wasm-sim.md](test-plans/2026-05-22-engine-wasm-sim.md) -- Manual verification for the `@simlin/engine` selectable wasm engine (`Model.simulate({engine:'wasm'})`): re-running the automated gates, driving the gated/`#[ignore]`d heavy tests, and the human-judged extras (interactive scrubbing feel, VM-vs-wasm benchmark numbers); all 25 ACs already have automated coverage
- `implementation-plans/` -- Detailed phase-by-phase implementation plans, created during plan execution

## Security
37 changes: 33 additions & 4 deletions docs/dev/benchmarks.md
@@ -22,11 +22,11 @@ Criterion saves results in `target/criterion/` and generates HTML reports in `ta

## Benchmark suites

| Suite        | File                    | What it measures                                                  |
| ------------ | ----------------------- | ----------------------------------------------------------------- |
| `compiler`   | `benches/compiler.rs`   | End-to-end compiler pipeline on real models (WRLD3, C-LEARN)      |
| `simulation` | `benches/simulation.rs` | VM execution, slider interaction, compilation of synthetic models |
| `array_ops`  | `benches/array_ops.rs`  | Array sum, element-wise add, broadcasting, multi-ref              |

### compiler benchmarks

@@ -38,11 +38,40 @@ The `compiler` suite measures each stage of the compilation pipeline independent
- **`full_pipeline`** — all stages end-to-end

Models used:

- `wrld3` — World3 model (151 KB, ~3,800 lines), a classic system dynamics model
- `clearn` — C-LEARN climate model (1.4 MB, ~53,000 lines), a stress test for the compiler

C-LEARN currently uses builtins that are not yet implemented in the bytecode compiler, so it is automatically skipped for `bytecode_compile` and `full_pipeline`. It still participates in `parse_mdl` and `project_build`, which are the most allocation-heavy stages.

## Node VM-vs-wasm eval benchmark

`@simlin/engine` can run a model on two backends: the libsimlin VM or a compiled WebAssembly blob. This benchmark compares their **simulation (eval) time** through the public `Model.simulate({ engine })` API, on fishbanks, WORLD3, and C-LEARN.

It is a [jest](https://jestjs.io/) test gated behind `RUN_BENCH` so it stays out of the default `pnpm test` (a full C-LEARN run on both engines exceeds the per-test time budget):

```bash
# Run all three models on both engines
RUN_BENCH=1 pnpm -C src/engine exec jest backend-bench

# Subset the models (comma-separated: fishbanks, wrld3, clearn)
RUN_BENCH=1 BENCH_MODELS=fishbanks,wrld3 pnpm -C src/engine exec jest backend-bench
```
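
As a rough illustration of how the two environment variables above could be interpreted, here is a minimal sketch. The helper name `benchConfig`, the `BenchConfig` shape, and the rule that any non-empty `RUN_BENCH` enables the benchmark are assumptions made for this example; the actual test may gate and parse differently.

```typescript
// Hypothetical helper: names and gating rules are illustrative assumptions,
// not taken from the actual backend-bench test.
const ALL_MODELS = ['fishbanks', 'wrld3', 'clearn'] as const;
type ModelName = (typeof ALL_MODELS)[number];

interface BenchConfig {
  enabled: boolean; // true when RUN_BENCH is set to any non-empty value (assumed rule)
  models: ModelName[]; // models to benchmark, in the order requested
}

// Takes the environment as a plain object; a caller would pass process.env.
function benchConfig(env: Record<string, string | undefined>): BenchConfig {
  const enabled = !!env.RUN_BENCH;
  const raw = env.BENCH_MODELS;
  if (raw === undefined || raw.trim() === '') {
    // No subset requested: run every known model.
    return { enabled, models: ALL_MODELS.slice() };
  }
  const models: ModelName[] = [];
  for (const piece of raw.split(',')) {
    const name = piece.trim();
    if (name === '') continue;
    let match: ModelName | undefined;
    for (const m of ALL_MODELS) {
      if (m === name) match = m;
    }
    if (match === undefined) {
      // Fail loudly rather than silently benchmarking nothing.
      throw new Error('unknown benchmark model: ' + name);
    }
    if (models.indexOf(match) === -1) models.push(match);
  }
  return { enabled, models };
}
```

In a jest file, a flag like `enabled` would typically choose between `describe` and `describe.skip`, which keeps the suite visible in reports while leaving it out of the default run.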

It prints a markdown table of the warm **median** eval time per engine plus the wasm/VM ratio.

What it measures, and what it deliberately excludes:

- **Eval only.** The `Sim` for each `(model, engine)` is built once in untimed setup; for wasm that one-time cost is the blob compile and instantiate. Each measured iteration is a `reset()` (also untimed) followed by a timed `runToEnd()`. Result extraction (`getRun`/`getSeries`) is not timed.
- **Median over an explicit warmup.** A discard-only warmup runs first, then the harness collects timings adaptively (until it reaches a maximum iteration count or exhausts a per-model wall-clock budget) and reports the median. The pure stats and harness code lives in `src/engine/tests/bench-stats.ts` and is covered by unit tests that always run (they are not gated behind `RUN_BENCH`).
- **Cross-checked before trusted.** Before timing, the benchmark runs each model on both engines and compares a representative series within the engine's tolerance, so a broken run can't masquerade as a fast one.
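
The cross-check in the last bullet amounts to an element-wise tolerance comparison of two series. The sketch below shows one way to write it. The commit list names a `seriesClose` helper, but the signature, the tolerance defaults, and the strict NaN policy here are assumptions for illustration, not the engine's actual implementation.

```typescript
// Illustrative sketch only: signature, tolerances, and NaN policy are assumptions.
function seriesClose(
  a: ArrayLike<number>,
  b: ArrayLike<number>,
  relTol: number = 1e-6,
  absTol: number = 1e-9,
): boolean {
  // Mismatched or empty series can never vouch for a run.
  if (a.length !== b.length || a.length === 0) return false;
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    // NaN is the only value not equal to itself. Treat any NaN as a failure
    // (assumed policy) so an all-NaN run cannot pass the cross-check.
    if (x !== x || y !== y) return false;
    if (x === y) continue; // exact match, including equal infinities
    // Unequal values where either side is infinite cannot be "close".
    if (!isFinite(x) || !isFinite(y)) return false;
    const diff = Math.abs(x - y);
    const scale = Math.max(Math.abs(x), Math.abs(y));
    if (diff > absTol + relTol * scale) return false;
  }
  return true;
}
```

Combining an absolute and a relative term keeps the check meaningful both near zero and for large magnitudes. Rejecting NaN outright is the conservative choice for a gate whose job is to catch broken runs.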

Absolute numbers include the async public-API overhead, so the wasm/VM ratio is the figure to compare across runs.
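
The measurement procedure described above (discard-only warmup, untimed reset, timed run, adaptive stop, median, ratio) can be sketched roughly as follows. This is not the code in `src/engine/tests/bench-stats.ts`: the names `measure`, `MeasureOptions`, and `wasmVmRatio`, plus the injectable clock, are hypothetical, and the sketch is synchronous for brevity whereas the public API is async.

```typescript
// Sketch under assumptions: option names and the injectable clock are hypothetical.
interface MeasureOptions {
  warmupIters: number; // iterations whose timings are discarded
  maxIters: number; // upper bound on measured iterations
  budgetMs: number; // wall-clock budget for the measured phase
  now?: () => number; // clock in milliseconds; defaults to Date.now
}

function median(samples: number[]): number {
  if (samples.length === 0) throw new Error('median of empty sample set');
  const sorted = samples.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function measure(setup: () => void, run: () => void, opts: MeasureOptions): number[] {
  const now = opts.now !== undefined ? opts.now : Date.now;
  // Warmup: run the same work, but keep no timings.
  for (let i = 0; i < opts.warmupIters; i++) {
    setup();
    run();
  }
  const samples: number[] = [];
  const start = now();
  while (samples.length < opts.maxIters) {
    setup(); // untimed, e.g. reset()
    const t0 = now();
    run(); // timed, e.g. runToEnd()
    samples.push(now() - t0);
    // Adaptive stop: quit once the budget is spent, keeping what we have.
    if (now() - start >= opts.budgetMs) break;
  }
  return samples;
}

// Ratio in the same direction as the printed table: below 1 means wasm is faster.
function wasmVmRatio(vmMedianMs: number, wasmMedianMs: number): number {
  return wasmMedianMs / vmMedianMs;
}
```

The median is less sensitive than the mean to a single slow iteration (for example a GC pause), and checking the budget only after each sample guarantees at least one measurement whenever `maxIters` is at least 1.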

The Rust counterpart is `src/simlin-engine/examples/backend_bench.rs`, which uses the same eval-vs-eval methodology and median statistic against the lower-level `Vm`/wasm interfaces.

Results are reported in the PR or chat, not committed: they can be regenerated from the harness at any time, whereas checked-in numbers go stale and mislead. Do not add a results file.

## Profiling

### Build a benchmark binary for profiling