engine: bundle marginal branch/dispatch reductions to test for a CPU predictor threshold #604

## Context

The C-LEARN VM run is branch-mispredict-bound: ~68% of branch-misses come from the `eval_bytecode` dispatch indirect branch, at an IPC of ~3.3. During the PR #599 campaign, a change that reduced instruction count (the `to_runtime_view` memcpy, −4.2% retired instructions) produced **no wall-clock movement**: the out-of-order core absorbed the freed instructions in spare IPC. The same was true of bounds-check elimination when investigated alone (`docs/design/engine-performance.md` R1: sub-noise at `opt-level=3`).
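To make the "absorbed in spare IPC" reading concrete, here is a minimal sketch of the derived metrics computed from raw `perf stat` counters. The counter values are hypothetical, chosen only so that the baseline IPC is 3.3 and the instruction drop is 4.2% as quoted above; the cycle count is assumed unchanged as a stand-in for "no wall-clock movement", and the branch counts are purely illustrative.

```rust
/// Raw counters for one run, as one would read them off `perf stat`.
#[derive(Debug, Clone, Copy)]
struct Counters {
    instructions: u64,
    cycles: u64,
    branches: u64,
    branch_misses: u64,
}

impl Counters {
    /// Retired instructions per cycle.
    fn ipc(&self) -> f64 {
        self.instructions as f64 / self.cycles as f64
    }

    /// Fraction of branches that were mispredicted.
    fn branch_miss_rate(&self) -> f64 {
        self.branch_misses as f64 / self.branches as f64
    }
}

/// Relative change from `before` to `after` (e.g. -0.042 means -4.2%).
fn relative_change(before: f64, after: f64) -> f64 {
    (after - before) / before
}

fn main() {
    // Hypothetical baseline: IPC = 3.3 (illustrative counts, not measured).
    let before = Counters {
        instructions: 3_300_000_000,
        cycles: 1_000_000_000,
        branches: 500_000_000,
        branch_misses: 5_000_000,
    };
    // Hypothetical "after": 4.2% fewer instructions, cycles assumed unchanged.
    let after = Counters {
        instructions: 3_161_400_000,
        ..before
    };

    println!("IPC before: {:.3}, after: {:.3}", before.ipc(), after.ipc());
    println!(
        "IPC change: {:+.1}%, cycle change: {:+.1}%",
        100.0 * relative_change(before.ipc(), after.ipc()),
        100.0 * relative_change(before.cycles as f64, after.cycles as f64),
    );
    println!("branch-miss rate: {:.2}%", 100.0 * after.branch_miss_rate());
}
```

Under these assumptions the instruction reduction shows up only as a lower IPC, not as fewer cycles, which is the pattern described above.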

## Idea

Treat individually-sub-noise branch/dispatch reductions as **synergy candidates**, not discards. The dispatch indirect branch's target-history working set likely sits just above the BTB/predictor capacity, so cumulatively shrinking the program's distinct branch sites / dispatches may cross a threshold for a **non-linear** wall-clock win that none of the changes show alone.

Concretely:
- Maintain a set of marginal reductions (more superinstructions, removing conditional arms from the hot loop, view-validity branch hoisting, etc.).
- Measure them as a **bundle** with `perf stat -e instructions,cycles,branches,branch-misses,L1-icache-load-misses` — watch the branch-miss rate and IPC, not just wall-clock.
- **Re-evaluate bounds-check elimination in the bundled context.** It is sub-noise alone (R1), but it removes ~127 `panic_bounds_check` branch sites from `eval_bytecode`; it may contribute to a threshold tip. (Would require `unsafe get_unchecked` + a static validation pass — see R1 — so it is gated on that decision; not pursued solo.)
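For the bounds-check item, the shape of "`get_unchecked` + a static validation pass" can be sketched as follows. This is a toy under stated assumptions, not the engine's code: the `Op` enum, `Validated` wrapper, `validate`, and `eval` are hypothetical names, and the real `eval_bytecode` instruction set is not shown in this issue. The point is only that a single up-front pass proves every operand index is in range, so the hot loop can index without emitting `panic_bounds_check` branch sites.

```rust
/// Hypothetical register-machine instructions (illustrative only).
#[derive(Clone, Copy)]
enum Op {
    LoadConst { dst: u8, value: f64 },
    Add { dst: u8, a: u8, b: u8 },
    Mul { dst: u8, a: u8, b: u8 },
}

/// A program whose register operands have all been checked against `num_regs`.
/// In a real crate the fields would be private to a module so that `validate`
/// is the only way to construct this type.
struct Validated<'a> {
    ops: &'a [Op],
    num_regs: usize,
}

/// Static validation pass: reject any instruction with an out-of-range operand.
fn validate(ops: &[Op], num_regs: usize) -> Option<Validated<'_>> {
    let in_range = |r: u8| (r as usize) < num_regs;
    for op in ops {
        let ok = match *op {
            Op::LoadConst { dst, .. } => in_range(dst),
            Op::Add { dst, a, b } | Op::Mul { dst, a, b } => {
                in_range(dst) && in_range(a) && in_range(b)
            }
        };
        if !ok {
            return None;
        }
    }
    Some(Validated { ops, num_regs })
}

/// Hot loop with no per-access bounds checks.
fn eval(program: &Validated<'_>, regs: &mut [f64]) {
    // One length check outside the loop replaces the per-access checks.
    assert!(regs.len() >= program.num_regs);
    for op in program.ops {
        match *op {
            // SAFETY (all arms): `validate` proved every operand < num_regs,
            // and the assert above proved num_regs <= regs.len().
            Op::LoadConst { dst, value } => unsafe {
                *regs.get_unchecked_mut(dst as usize) = value;
            },
            Op::Add { dst, a, b } => unsafe {
                let v = *regs.get_unchecked(a as usize) + *regs.get_unchecked(b as usize);
                *regs.get_unchecked_mut(dst as usize) = v;
            },
            Op::Mul { dst, a, b } => unsafe {
                let v = *regs.get_unchecked(a as usize) * *regs.get_unchecked(b as usize);
                *regs.get_unchecked_mut(dst as usize) = v;
            },
        }
    }
}

fn main() {
    let ops = [
        Op::LoadConst { dst: 0, value: 2.0 },
        Op::LoadConst { dst: 1, value: 3.0 },
        Op::Add { dst: 2, a: 0, b: 1 },
        Op::Mul { dst: 2, a: 2, b: 1 },
    ];
    let program = validate(&ops, 3).expect("operands in range");
    let mut regs = [0.0_f64; 3];
    eval(&program, &mut regs);
    println!("r2 = {}", regs[2]); // (2 + 3) * 3 = 15
}
```

The trade-off is the one R1 already flags: soundness now rests on the validation pass and on nothing constructing a `Validated` program any other way, which is why this stays gated on that decision.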

## Refs
- `docs/design/engine-performance.md` — R1 (bounds-check elimination, sub-noise alone), R3 (dispatch).
- PR #599 — the fusion wins reduced branch-misses (−0.95%, −3.8%), evidence that dispatch-count reduction does move the branch-miss needle (so a bundle plausibly compounds).

