Context
A post-#599 perf profile of the C-LEARN run (160 graphical functions / ~36k points; ~780 Lookup opcodes × 1000 steps ≈ 780k lookups/run) shows the graphical-function lookup path is ~4% of run time, split across vm::lookup (~2.3%), float_cmp::approx_eq (~1.2%), and round/pow (~1%).
Idea
Optimize the per-lookup interpolation:
- Investigate why approx_eq (ULP-based f64 compare) and round sit on the hot lookup path; a plain < or clamp comparison may suffice for the in-range/segment selection.
- For large tables, binary-search the x-breakpoints instead of a linear scan (C-LEARN's historical tables are year-indexed and large).
- Hoist invariant per-table work out of the per-step lookup.
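As a rough illustration of the first two bullets, here is a minimal sketch of segment selection with plain comparisons and a binary search. The `(x, y)` slice layout and the name `lookup_binary` are assumptions made for this example; the actual table representation and the signature of `lookup()` in vm.rs may differ.

```rust
/// Hypothetical table layout: (x, y) pairs sorted by strictly increasing x.
/// This is an assumption for the sketch, not the engine's real data structure.
fn lookup_binary(table: &[(f64, f64)], x: f64) -> f64 {
    let n = table.len();
    if n == 0 || x.is_nan() {
        return f64::NAN;
    }
    // Clamp outside the table range with plain comparisons (no ULP compare).
    if x <= table[0].0 {
        return table[0].1;
    }
    if x >= table[n - 1].0 {
        return table[n - 1].1;
    }
    // Index of the first breakpoint whose x is strictly greater than the input.
    // After the clamps above this is always in 1..n.
    let hi = table.partition_point(|&(bx, _)| bx <= x);
    let (x0, y0) = table[hi - 1];
    let (x1, y1) = table[hi];
    // Linear interpolation within the selected segment.
    y0 + (y1 - y0) * (x - x0) / (x1 - x0)
}
```

`partition_point` is the standard-library binary search over a slice, so selection costs O(log n) comparisons per lookup instead of O(n). For small tables a linear scan may still be faster, so a size threshold would likely need to be measured rather than assumed.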
Changes should be bit-preserving where possible; otherwise they are gated by the simulate suite's ~1% cross-simulator tolerance.
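One way the bit-preserving property could be checked is to compare a linear-scan reference against the binary-search variant by raw bit pattern. The sketch below assumes the same hypothetical `(x, y)` slice layout; all function names are illustrative and do not come from the engine.

```rust
/// Shared interpolation step, so both variants perform identical arithmetic.
fn interpolate(table: &[(f64, f64)], hi: usize, x: f64) -> f64 {
    let (x0, y0) = table[hi - 1];
    let (x1, y1) = table[hi];
    y0 + (y1 - y0) * (x - x0) / (x1 - x0)
}

/// Handles empty tables, NaN input, and out-of-range clamping.
/// Returns None when x lies strictly inside the table range.
fn clamp_or_none(table: &[(f64, f64)], x: f64) -> Option<f64> {
    let n = table.len();
    if n == 0 || x.is_nan() {
        return Some(f64::NAN);
    }
    if x <= table[0].0 {
        return Some(table[0].1);
    }
    if x >= table[n - 1].0 {
        return Some(table[n - 1].1);
    }
    None
}

/// Reference: linear scan for the first breakpoint strictly greater than x.
fn lookup_linear(table: &[(f64, f64)], x: f64) -> f64 {
    if let Some(y) = clamp_or_none(table, x) {
        return y;
    }
    let hi = table
        .iter()
        .position(|&(bx, _)| bx > x)
        .expect("x is below the last breakpoint after clamping");
    interpolate(table, hi, x)
}

/// Candidate: binary search for the same breakpoint.
fn lookup_bsearch(table: &[(f64, f64)], x: f64) -> f64 {
    if let Some(y) = clamp_or_none(table, x) {
        return y;
    }
    let hi = table.partition_point(|&(bx, _)| bx <= x);
    interpolate(table, hi, x)
}

/// Returns the first probe where the two variants differ bitwise, if any.
fn first_bit_mismatch(table: &[(f64, f64)], probes: &[f64]) -> Option<f64> {
    probes
        .iter()
        .copied()
        .find(|&x| lookup_linear(table, x).to_bits() != lookup_bsearch(table, x).to_bits())
}
```

Because both variants select the same segment and share one interpolation expression, changing only the search strategy should leave results bit-identical. Hoisting work such as precomputed slopes changes the order of floating-point operations, so that kind of change is more likely to fall under the tolerance gate instead.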
Expected impact
~2-4% of the C-LEARN run. Incremental, but clean and localized to the array/lookup machinery.
Refs
- src/simlin-engine/src/vm.rs: lookup() and the Opcode::Lookup / Opcode::LookupArray handlers.
- src/simlin-engine/src/float.rs: approx_eq.
- docs/design/engine-performance.md for the profiling methodology.