Written using Claude Code
Falcon 0.4.1 — Posterior-predictive forward-modelling flag
A self-contained enhancement, separate from the checkpoint/resume
cluster in falcon_checkpoint_resume_issues.md.
Issue — --forward-model flag to push posterior samples through the simulator graph
Summary
After training, falcon sample posterior produces samples of the
latent parameters only (e.g. z, pw). To validate the fit
(posterior-predictive check) or generate simulated observations
consistent with the posterior, the user currently has to write a
bespoke script that loads the samples, imports model.py, and
threads the latent samples through every downstream simulator node
by hand.
Add a flag — falcon sample posterior --forward-model (or
equivalent on falcon launch) — that, after drawing posterior
samples, runs the deterministic forward simulators on those samples
and saves the resulting observable-level fields alongside the
latents. The user already declared the simulator chain in
config.yml; falcon already has the graph machinery to execute it
in forward order. This issue is about wiring those two pieces
together behind one flag.
Current behaviour (0.4.1)
- DeployedGraph.sample_posterior
  (falcon/core/deployed_graph.py:683-695) runs
  _execute_graph(num_samples, self.graph.backward_order, condition_refs, "sample_posterior").
  backward_order only visits nodes with an estimator, so the
  deterministic forward simulators (gains, stacked_uv, df, bt,
  image, …) are not executed.
- The saved NPZ in
  {paths.samples}/posterior/{timestamp}/000000.npz
  contains only the latent keys (z.value, pw.value,
  *.log_prob). It does not contain image.value, obsx.value,
  pk_obs.value, etc.
- _execute_graph already supports the forward direction: it is
  called with self.graph.forward_order during simulation
  (deployed_graph.py:828). What's missing is a path that:
  (a) draws posterior samples for latent nodes, then
  (b) forward-executes the remaining simulators using those
  samples as conditions.
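
To make the gap concrete, here is a toy illustration of the two traversal orders. It is a sketch only: the graph shape, the HAS_ESTIMATOR set, and both helper functions are invented for this example and are not falcon's implementation (in particular, the real backward_order may visit nodes in a different sequence). It only shows that a pass restricted to estimator-bearing nodes never reaches the downstream simulators.

```python
from graphlib import TopologicalSorter

# Toy graph: node -> set of parent nodes. The node names echo the issue,
# but the edges are made up for illustration.
PARENTS = {
    "z": set(),
    "pw": set(),
    "gains": {"z"},
    "image": {"gains", "pw"},
    "pk_obs": {"image"},
}
# Assumption for this sketch: only the latent nodes carry a trained estimator.
HAS_ESTIMATOR = {"z", "pw"}


def toy_forward_order(parents):
    """Every node, parents before children."""
    return list(TopologicalSorter(parents).static_order())


def toy_backward_order(parents, has_estimator):
    """Only estimator-bearing nodes, as the issue describes for backward_order."""
    return [n for n in toy_forward_order(parents) if n in has_estimator]


def missing_from_posterior(parents, has_estimator):
    """Nodes that a posterior-only pass never executes."""
    visited = set(toy_backward_order(parents, has_estimator))
    return [n for n in toy_forward_order(parents) if n not in visited]


print(missing_from_posterior(PARENTS, HAS_ESTIMATOR))
```

The nodes printed here are exactly the ones whose values are absent from today's posterior NPZ.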
Proposed behaviour
- New CLI flag on falcon sample posterior:

      falcon sample posterior -o outputs/run --forward-model
      falcon sample posterior -o outputs/run --forward-model --include-nodes image,obsx,pk_obs
      falcon sample posterior -o outputs/run --forward-model --exclude-nodes noiseimage

  Equivalent YAML knob so the end-of-launch posterior-sampling
  step in falcon launch honours it too:

      sample:
        posterior:
          n: 1500
          forward_model: true                          # default false
          forward_model_nodes: [image, obsx, pk_obs]   # default: all forward nodes
- Execution path in DeployedGraph:
  - Call existing sample_posterior(n_samples, observations) to
    get posterior_refs (latents only).
  - Extract latent values via _extract_value_refs(posterior_refs),
    the same pattern as the proposal-sampling path in
    deployed_graph.py:819-820.
  - Call
    _execute_graph(n_samples, self.graph.forward_order, latent_refs, "sample")
    to forward-simulate. Override observed nodes: they should be
    re-simulated from the posterior latents, not pinned to the
    on-disk observation (that's the whole point of a
    posterior-predictive check).
  - Merge latent and forward-modelled refs into one batch.
- Output layout:

      {paths.samples}/posterior/{timestamp}/000000.npz             # latents only (today's behaviour)
      {paths.samples}/posterior_predictive/{timestamp}/000000.npz  # NEW: latents + forward

  Use a separate subdirectory so an existing posterior-only run
  isn't silently overwritten. The NPZ contains every node's
  value (and log_prob where applicable), keyed by node name,
  matching the existing sample-file schema so
  falcon.read_samples() works without changes.
- Filtering via --include-nodes / --exclude-nodes:
  - For large simulator chains (this repo's graph has ~15
    downstream nodes), the full NPZ can be GB-scale. Default
    to all forward nodes; let users prune.
  - Always include the latent nodes (z, pw) regardless of
    filter, so the NPZ is self-describing.
- Reproducibility: forward simulators may use RNGs (this
  repo's Noise class does). Accept an optional --seed so
  posterior-predictive runs are reproducible.
- Console output:

      ✓ Drew 1500 posterior samples (latents: z, pw)
      ↻ Forward-modelling through 12 simulator nodes...
      ✓ Saved posterior + forward-modelled samples to outputs/run/samples_dir/posterior_predictive/2026-05-26T15-30/
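
The proposed flow (draw latents, forward-simulate, re-simulate observed nodes, filter, seed) can be sketched in a few lines of standalone Python. Everything below is hypothetical scaffolding: PARENTS, SIMULATORS, draw_latents, select_nodes, and sample_posterior_predictive are stand-ins invented for this sketch and do not exist in falcon, and the "posterior" is just a pair of Gaussians. The intent is only to pin down the semantics the flag should have.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical toy chain: node -> parents. Not falcon's graph.
PARENTS = {"z": set(), "pw": set(), "image": {"z", "pw"}, "pk_obs": {"image"}}
LATENTS = ("z", "pw")

SIMULATORS = {
    # Deterministic node: a pure function of its parents.
    "image": lambda values, rng: values["z"] * values["pw"],
    # Stochastic observed node: adds noise, so it uses the seeded generator.
    "pk_obs": lambda values, rng: values["image"] + rng.gauss(0.0, 0.01),
}


def draw_latents(rng):
    """Stand-in for one posterior draw of the latent nodes."""
    return {"z": rng.gauss(0.0, 1.0), "pw": rng.gauss(1.0, 0.1)}


def select_nodes(all_nodes, include=None, exclude=None):
    """Apply include/exclude filters; latent nodes are always kept."""
    keep = set(all_nodes) if include is None else set(include)
    keep -= set(exclude or ())
    keep |= set(LATENTS)
    return [n for n in all_nodes if n in keep]


def sample_posterior_predictive(n, seed=None, include=None, exclude=None):
    rng = random.Random(seed)  # a single seeded generator drives every draw
    order = list(TopologicalSorter(PARENTS).static_order())
    saved = select_nodes(order, include, exclude)
    batch = {name: [] for name in saved}
    for _ in range(n):
        values = draw_latents(rng)          # step 1: latents only
        for name in order:                  # step 2: forward pass
            if name not in values:
                # Observed nodes are re-simulated here, never pinned to data.
                values[name] = SIMULATORS[name](values, rng)
        for name in saved:                  # step 3: merge, keeping the filtered set
            batch[name].append(values[name])
    return batch


batch = sample_posterior_predictive(5, seed=0, exclude=["image"])
print(sorted(batch))
```

Note that in this sketch the filters only affect what is saved: every node is still executed, because downstream nodes may depend on an excluded one. Whether falcon should skip nodes that no saved node depends on is a design choice this issue leaves open.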
Acceptance criteria
- falcon sample posterior -o <run> --forward-model produces an
  NPZ in posterior_predictive/ containing keys for every node in
  the graph (or the filtered subset).
- Loading via falcon.read_samples(<run>, kind="posterior_predictive")
  returns a dict-like object indexed by node name.
- Observed nodes (V_ref, pk_obs in this repo) appear in the
  output with values re-simulated from posterior latents, not
  pinned to the observation NPZ. A test asserts that
  samples["pk_obs"] varies across draws.
- --include-nodes / --exclude-nodes and the YAML equivalent
  filter the saved set as specified.
- --seed makes two consecutive runs bit-identical (modulo
  ordering across Ray actors; assert mean and std match
  exactly for deterministic nodes).
- A unit test on 01_minimal with --forward-model asserts the
  NPZ contains both latent and observation-level keys.
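
A rough sketch of what the key-presence and "varies across draws" checks might look like, using plain NumPy rather than falcon's loader. The helpers save_batch, load_values, and check_predictive are hypothetical names for this example, and the '<node>.value' key convention is inferred from the key names listed under "Current behaviour"; the real test would presumably go through falcon.read_samples instead of np.load.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_batch(path, batch):
    """Write one NPZ with '<node>.value' keys (convention assumed from the issue)."""
    np.savez(path, **{f"{name}.value": arr for name, arr in batch.items()})


def load_values(path):
    """Return {node_name: array}, stripping the '.value' suffix from each key."""
    suffix = ".value"
    with np.load(path) as npz:
        return {k[: -len(suffix)]: npz[k] for k in npz.files if k.endswith(suffix)}


def check_predictive(samples, latents, observed):
    """Assert all expected keys exist and observed nodes vary across draws."""
    for name in (*latents, *observed):
        assert name in samples, f"missing node {name!r}"
    for name in observed:
        # Peak-to-peak along the sample axis is zero only if every draw is identical.
        spread = np.max(np.ptp(samples[name], axis=0))
        assert spread > 0, f"{name!r} is constant across draws"


# Fake posterior-predictive batch: 8 draws of a 2-d latent and a noisy observable.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 2))
batch = {"z": z, "pk_obs": z.sum(axis=1) + rng.normal(scale=0.1, size=8)}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "000000.npz"
    save_batch(path, batch)
    samples = load_values(path)

check_predictive(samples, latents=["z"], observed=["pk_obs"])
```

A node pinned to the on-disk observation would have zero spread along the sample axis, so check_predictive would fail for it, which is the regression the third criterion is meant to catch.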
Out of scope
- Posterior-predictive p-values / diagnostic plotting — that's
a downstream notebook concern.
- Conditional posterior-predictive (e.g. fixing one latent and
forward-modelling the others) — single flag, single mode.
- Streaming / chunked output for very large graphs — initial
implementation writes one NPZ per run; can be revisited if it
becomes a memory issue.
Why this is worth it
The current workaround is to copy the relevant model.py classes
into a notebook and re-instantiate them with the right config
constants. That's brittle: every change to model.py has to be
mirrored in the analysis notebook, and any device/dtype subtlety
(e.g. the complex64 casting in createVreffromgains) is easy
to get wrong on the analysis side. Putting forward modelling
behind a flag means the same simulator code that trained the
estimator also validates it — no parallel implementation.