
Add ensemble / scenario-sweep runner #10

Open
RyadT wants to merge 1 commit into main from ryad/ensemble-runner

RyadT commented May 29, 2026

Contributor

Summary

A parallel runner for many simulations at once, over two axes:

  • replicates — same scenario, different seed → uncertainty bands (the simulator is stochastic, so a single run is a point sample).
  • scenarios — different inputs, e.g. interventions → comparison.

The cross-product gives the headline capability: intervention comparison with confidence bands — e.g. masks-at-day-0 → 479 [3, 1025] vs no-intervention → 16,514 [2,595, 24,547] cumulative infected.

Pure orchestration around the existing run_simulator, with no core-simulator changes (so the golden still guards every underlying run). simulator/ensemble.py:

  • ProcessPoolExecutor over (scenario, replicate) tasks; each runs with randseed=True and a reproducible per-task seed (SeedSequence.spawn).
  • Workers reduce the in-memory output to a per-timestep cumulative-infected series (tiny to ship back).
  • Aggregated into per-scenario percentile bands over time + a cross-scenario comparison table.
  • python -m simulator.ensemble runs a baseline-vs-masks demo.
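
The bullets above can be illustrated with a minimal sketch of the task fan-out and band aggregation. This is not the code in simulator/ensemble.py: `make_tasks`, `run_tasks`, `percentile_bands`, and the `worker` callable are hypothetical names, the 5/95 band limits are an assumed default, and how the seed is actually passed to run_simulator is not shown in this PR. The only real library calls are `ProcessPoolExecutor`, `SeedSequence.spawn`, `SeedSequence.generate_state`, and `np.percentile`.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def make_tasks(scenario_names, n_replicates, root_seed):
    """Build (scenario, replicate, seed) tasks with one independent seed each."""
    n_tasks = len(scenario_names) * n_replicates
    # spawn() derives independent child sequences from the root, so a task's
    # seed depends only on root_seed and the task's position in the grid,
    # never on which worker happens to pick it up.
    children = np.random.SeedSequence(root_seed).spawn(n_tasks)
    tasks = []
    for i, name in enumerate(scenario_names):
        for r in range(n_replicates):
            child = children[i * n_replicates + r]
            seed = int(child.generate_state(1)[0])  # one 32-bit integer seed
            tasks.append((name, r, seed))
    return tasks


def run_tasks(tasks, worker, max_workers):
    """Run worker(task) in a process pool; results come back in task order."""
    # `worker` (hypothetical) must be a picklable top-level function that
    # returns a small 1-D cumulative-infected series for its task.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, tasks))


def percentile_bands(series_by_replicate, lo=5, hi=95):
    """Per-timestep lower band, median, and upper band across replicates."""
    stacked = np.asarray(series_by_replicate, dtype=float)  # (replicates, timesteps)
    low, mid, high = np.percentile(stacked, [lo, 50, hi], axis=0)
    return low, mid, high
```

Because each seed is fixed by the task's index rather than by scheduling, a design like this would give the same results for any max_workers, which is the property the reproducibility check in this PR relies on.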

Throughput is memory-bound, not core-bound (important)

Each worker needs the (large) patterns data. Measured (8 runs, dmp off, 96h):

workers         runs/min
1 (serial)      6.2
3 (bounded)     12.7
9 (cores−1)     1.9  ← thrash

Under spawn (macOS dev) each worker re-decompresses the 65 MB patterns file → too many workers thrash RAM and run slower than serial. So:

  • Under fork (Linux deploy) the read-only data is loaded once and shared via copy-on-write → can use all cores.
  • Under spawn max_workers is capped (default min(cores, 4)).

This is why the win is ~2× on this Mac but should scale closer to core-count on the Linux deploy. Tune max_workers to RAM.
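
A rough sketch of the policy described above, under stated assumptions: `default_max_workers`, `make_pool`, `load_patterns`, and the `_PATTERNS` global are hypothetical names, not the PR's actual code. The cap of 4 is the default quoted above. Any start method other than fork is treated as capped, since only fork lets children inherit the parent's memory copy-on-write.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor

_PATTERNS = None  # read-only data, loaded in the parent before forking


def default_max_workers(start_method=None, cores=None, spawn_cap=4):
    """All cores under fork; capped otherwise, where each worker reloads data."""
    if start_method is None:
        # Note: get_start_method() also fixes the default method for this
        # process if none was chosen yet.
        start_method = mp.get_start_method()
    if cores is None:
        cores = os.cpu_count() or 1
    if start_method == "fork":
        return cores
    return min(cores, spawn_cap)


def make_pool(load_patterns, max_workers=None):
    """Create the pool, preloading shared data first when forking."""
    global _PATTERNS
    method = mp.get_start_method()
    if method == "fork":
        # Forked children see this object through copy-on-write pages, so the
        # decompression cost and the memory are paid once, in the parent.
        _PATTERNS = load_patterns()
    workers = max_workers or default_max_workers(method)
    return ProcessPoolExecutor(
        max_workers=workers, mp_context=mp.get_context(method)
    )
```

One thing to check on the Linux deploy: from Python 3.14 the default start method on Linux is forkserver rather than fork, so the copy-on-write path may need fork to be requested explicitly (for example via `mp.get_context("fork")`).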

Validated

  • Intervention effect: masks ≪ baseline (above).
  • Stochastic variation: baseline replicates [1, 17297, 24139, 24619] — one seed fizzled, others took off.
  • Reproducible: identical results across worker counts (seeds deterministic).
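
As a quick arithmetic check on the figures quoted in this PR: the summary does not say which statistics the headline uses, but the four baseline replicates listed above reproduce 16,514 [2,595, 24,547] if it is read as the mean with 5th and 95th percentiles (linear interpolation). That reading is an inference from the numbers, not something the PR states, and the `percentile` helper below is written for illustration only.

```python
def percentile(values, q):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


baseline = [1, 17297, 24139, 24619]  # baseline replicates quoted in this PR

mean = sum(baseline) / len(baseline)
p5 = percentile(baseline, 5)
p95 = percentile(baseline, 95)

print(f"{mean:,.0f} [{p5:,.0f}, {p95:,.0f}]")  # prints: 16,514 [2,595, 24,547]
```

With only four replicates, the lower band is dominated by the single fizzled seed, so the band width here says more about the replicate count than about the scenario.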

Test plan

  • python -m simulator.ensemble ... --dmp-mode off — comparison + bands + variation
  • serial vs 3 vs 9 workers throughput characterized
  • Reviewer: confirm on the Linux deploy that fork+COW lets max_workers scale to cores
  • Follow-on: --scenarios <json> for arbitrary intervention sweeps (lib run_ensemble() already takes arbitrary scenarios)

🤖 Generated with Claude Code

simulator/ensemble.py runs many simulations in parallel over two axes:
replicates (same scenario, different seed -> uncertainty bands) and
scenarios (different inputs, e.g. interventions -> comparison). The
cross-product gives intervention comparison WITH confidence bands.

Pure orchestration around the existing run_simulator (no core changes):
a ProcessPoolExecutor runs each (scenario, replicate) with randseed=True
and a reproducible per-task seed (SeedSequence.spawn); workers reduce the
in-memory output to a per-timestep cumulative-infected series; results are
aggregated into per-scenario percentile bands + a cross-scenario table.

Throughput is memory-bound, not core-bound: under spawn (macOS dev) each
worker re-decompresses the large patterns file, so unbounded workers
thrash and run slower than serial. Mitigations: under fork (Linux deploy)
the read-only data is shared via copy-on-write (loaded once in the
parent); under spawn max_workers is capped. Measured (8 runs, dmp off,
96h): serial 6.2 runs/min, 9 workers 1.9 (thrash), 3 workers 12.7.

Validated: masks_0.7 -> 479 [3,1025] vs baseline -> 16514 [2595,24547]
cumulative infected; per-replicate variation confirmed (one seed fizzled
at 1, others reached 24k); identical across worker counts (reproducible).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
