perf: fuse fvar substitution into instantiateMVars by nomeata · Pull Request #12233 · leanprover/lean4

nomeata · 2026-01-29T15:17:32Z

This PR adds instantiateAllMVars and instantiateAllMVarsSharing, two-pass variants of instantiateMVars for call sites where all metavariables are known to be assigned. They fail fast (returning none) on unassigned metavariables instead of leaving them in place or calling the expensive elimMVarDeps machinery.

Two-pass architecture

Pass 1 (instantiate_direct_fn): Standard instantiateMVars-like traversal that resolves direct mvar assignments with write-back and a single persistent cache. For delayed assignments, it pre-normalizes the pending value (resolving its direct chain) but leaves the delayed mvar application in the expression.
Pass 2 (instantiate_delayed_fn): Fused traversal that resolves delayed assignments by carrying an fvar substitution map. When encountering ?m [x₁,…,xₖ] := ?pending, it extends the substitution with [xᵢ ↦ argᵢ] and continues into the pending value. Since pass 1 has pre-normalized all direct chains, each pending value is compact and visited once.

Sharing preservation (`instantiateAllMVarsSharing`)

The fused fvar-substitution approach can lose sharing across delayed-mvar boundaries: when a shared subexpression (like @Eq Nat s s) appears both inside a delayed mvar's value and in a type argument, the cache scope swap at visit_delayed entry causes each occurrence to independently produce the same result, yielding distinct objects.

instantiateAllMVarsSharing solves this with a scope-tracked cache stack:

Each visit_delayed call pushes a new cache scope, and records the scope level in each fvar substitution entry
m_result_scope tracks the maximum fvar-scope used by each subtree (via save/reset/restore in visit())
Results are cached at the outermost valid scope level (determined by m_result_scope), so entries that only depend on outer-scope fvars persist when inner scopes are popped
Lookup scans the stack from outermost to innermost (typically 1-3 levels deep)

This achieves O(n²) output sharing matching instantiateMVars, vs O(n³) without sharing preservation.

Benchmark results (pathological nested-delayed-mvar case)

Method	n=50	n=100	n=200
allMVarsSharing	0ms / 2,030 objs	0ms / 6,555 objs	1ms / 23,105 objs
instantiateMVars	1ms / 1,981 objs	5ms / 6,456 objs	37ms / 22,906 objs
allMVars (no sharing)	1ms / 50,932 objs	12ms / 369,357 objs	92ms / 2,808,707 objs
noUpdateLean	3ms / 50,932 objs	33ms / 369,357 objs	257ms / 2,808,707 objs
noUpdate (C++)	25ms / 50,932 objs	420ms / 369,357 objs	7,336ms / 2,808,707 objs

Files

src/kernel/instantiate_mvars_all.cpp — C++ implementation of both passes
src/Lean/Meta/InstMVarsAll.lean — Lean wrappers for instantiateAllMVars and instantiateAllMVarsSharing
tests/lean/run/instantiateAllMVarsSharing.lean — Minimal test verifying sharing behavior of both variants

🤖 Generated with Claude Code

(TODO)

nomeata · 2026-01-29T15:17:49Z

!radar

leanprover-radar · 2026-01-29T15:18:04Z

Benchmark results for c83f527 against 9e18eea are in! @nomeata

🟥 build//instructions: +4.7G (+0.0%)

Large changes (1✅, 2🟥)

🟥 big_struct//instructions: +3.0G (+15.0%)
✅ big_struct_dep//instructions: -3.8G (-18.4%)
🟥 big_struct_dep1//instructions: +9.3G (+22.7%)

Small changes (7🟥)

🟥 build/module/Init.Grind.Config//instructions: +53.5M (+2.8%) (reduced significance based on absolute threshold)
🟥 build/module/Init.MetaTypes//instructions: +35.6M (+1.6%)
🟥 build/module/Lake.Config.Package//instructions: +12.8M (+0.5%)
🟥 build/module/Lake.Config.PackageConfig//instructions: +36.0M (+0.6%)
🟥 build/module/Lean.Meta.Tactic.Grind.Arith.CommRing.Types//instructions: +25.2M (+0.9%)
🟥 build/module/Lean.Meta.Tactic.Grind.Arith.Cutsat.ToIntInfo//instructions: +16.3M (+0.9%)
🟥 build/module/Lean.Meta.Tactic.Grind.Arith.Linear.Types//instructions: +80.9M (+1.5%)

leanprover-bot · 2026-01-29T17:27:27Z

Reference manual CI status:

❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-01-28 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-01-29 17:27:26)
❗ Reference manual CI can not be attempted yet, as the nightly-testing-2026-02-25 tag does not exist there yet. We will retry when you push more commits. If you rebase your branch onto nightly-with-manual, reference manual CI should run now. You can force reference manual CI using the force-manual-ci label. (2026-02-28 00:29:59)

ghost · 2026-01-29T18:29:39Z

Mathlib CI status (docs):

✅ Mathlib branch lean-pr-testing-12233 has successfully built against this PR. (2026-01-29 18:29:37) View Log

…date This adds an experimental `instantiateMVarsNoUpdate2` variant (inline in the benchmark) that keeps its expression cache across binder depths using bvar-blind keying, rather than clearing the cache at each depth. Bench2 shows it avoids the O(depth * bodySize) scaling of the old approach when the same closed sub-expression appears at many depths (crossover ~depth=50). However, the ~6ms constant per-node overhead makes it 3-5x slower on the bench1 workload (the realistic delayed-assignment case), where both variants are already sub-millisecond. Not worth adopting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This PR adds a C++ implementation of `instantiateMVarsNoUpdate`, following the pattern of the existing `instantiate_mvars.cpp`. Key features: - FVar substitution via hash map instead of `replace_fvars`, avoiding a separate traversal pass for delayed assignments - Depth tracking with lazy lifting of substitution values - Two-level cache: a depth-dependent cache (cleared under binders) and a global cache for fvar-free expressions that persists across binder depths - When the fvar substitution is empty, updates mvar assignments with their normalized values so subsequent `instantiateMVars` calls are free Benchmark (delayed-assignment workload): - C++ NoUpdate is ~2x faster than the Lean NoUpdate implementation - C++ NoUpdate is 10-26x faster than standard instantiateMVars - The global cache eliminates O(depth) scaling for fvar-free expressions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This PR makes instantiateMVarsNoUpdate the default implementation behind lean_instantiate_expr_mvars. The NoUpdate variant carries a free variable substitution during traversal instead of doing repeated replace_fvars, avoiding quadratic behavior in terms with many delayed-assigned metavariables. Key fixes for drop-in compatibility: - Use name_hash_map for fvar substitution (structural name equality, not pointer comparison) - Always call apply_beta with preserve_data=false for regular mvar assignments to strip mdata wrappers like _recApp - Clear depth-dependent cache when extending fvar substitution in visit_delayed - Handle unassigned mvars with active fvar substitution by creating delayed assignments via elimMVarDeps - Match standard instantiateMVars guards for delayed assignment resolution Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nomeata · 2026-02-27T23:08:50Z

!radar

leanprover-radar · 2026-02-27T23:08:56Z

Benchmark results for c5d6267 against 9e18eea are in! @nomeata

Warning

These warnings may indicate that the benchmark results are not directly comparable, for example due to changes in the runner configuration or hardware.

Bench repo commit hashes for run build differ between commits.
Bench repo commit hashes for run other differ between commits.

🟥 build//instructions: +87.5G (+0.67%)

Large changes (7🟥)

🟥 build/module/Std.Data.Iterators.Lemmas.Combinators.Monadic.FilterMap//instructions: +5.1G (+88.93%)
🟥 build/module/Std.Tactic.BVDecide.Bitblast.BVExpr.Circuit.Lemmas.Expr//instructions: +8.6G (+6.71%)
🟥 elab/big_struct//instructions: +10.7G (+53.17%)
🟥 elab/big_struct_dep//instructions: +4.1G (+19.94%)
🟥 elab/big_struct_dep1//instructions: +20.1G (+48.89%)
🟥 elab/grind_bitvec2//instructions: +3.4G (+1.54%)
🟥 elab/omega_stress//instructions: +259.9M (+5.04%)

Medium changes (19🟥)

🟥 build/module/Init.Data.Iterators.Lemmas.Combinators.Monadic.FilterMap//instructions: +4.6G (+9.62%) (reduced significance based on absolute threshold)
🟥 build/module/Init.Data.Range.Polymorphic.RangeIterator//instructions: +3.2G (+13.85%) (reduced significance based on absolute threshold)
🟥 build/module/Init.Data.String.Decode//instructions: +1.1G (+3.20%) (reduced significance based on absolute threshold)
🟥 build/module/Std.Data.DTreeMap.Internal.Balancing//instructions: +2.1G (+2.61%)
🟥 build/module/Std.Data.DTreeMap.Internal.Model//instructions: +1.6G (+2.24%)
🟥 build/module/Std.Data.DTreeMap.Internal.Operations//instructions: +1.5G (+2.96%) (reduced significance based on absolute threshold)
🟥 build/module/Std.Data.DTreeMap.Internal.WF.Lemmas//instructions: +1.4G (+2.93%)
🟥 build/module/Std.Data.Internal.List.Associative//instructions: +1.1G (+1.21%)
🟥 build/module/Std.Data.Iterators.Lemmas.Combinators.Zip//instructions: +4.2G (+38.48%) (reduced significance based on absolute threshold)
🟥 build/module/Std.Tactic.BVDecide.Bitblast.BVExpr.Circuit.Impl.Expr//instructions: +1.8G (+2.43%) (reduced significance based on absolute threshold)
🟥 build/module/Std.Tactic.BVDecide.LRAT.Internal.Formula.RupAddResult//instructions: +2.1G (+3.77%) (reduced significance based on absolute threshold)
🟥 build/module/Std.Tactic.BVDecide.LRAT.Internal.Formula.RupAddSound//instructions: +1.3G (+6.14%) (reduced significance based on absolute threshold)
🟥 elab/big_match_partial//instructions: +137.2M (+0.76%)
🟥 elab/bv_decide_rewriter//instructions: +91.0M (+0.77%)
🟥 elab/grind_list2//instructions: +563.6M (+0.92%)
🟥 elab/mut_rec_wf//instructions: +418.6M (+1.36%)
🟥 misc/import Init.Data.BitVec.Lemmas//instructions: +819.4M (+0.56%)
🟥 misc/import Init.Data.List.Sublist//instructions: +191.2M (+1.66%)
🟥 misc/import Std.Data.Internal.List.Associative//instructions: +1.1G (+1.38%)

Small changes (1✅, 212🟥)

Too many entries to display here. View the full report on radar instead.

nomeata · 2026-02-27T23:14:17Z

!radar

leanprover-radar · 2026-02-27T23:14:23Z

Benchmark results for ee068d5 against 87ec768 are in! @nomeata

🟥 build exited with code -1
🟥 other exited with code -1

No significant changes detected.

…lean4 into joachim/instantiateMVarsNoUpdate

nomeata · 2026-02-27T23:17:15Z

!radar

leanprover-radar · 2026-02-27T23:17:21Z

Benchmark results for ae9230f against 87ec768 are in! @nomeata

🟥 build//instructions: +95.9G (+0.76%)

Large changes (8🟥)

🟥 build/module/Init.Data.Iterators.Lemmas.Combinators.Monadic.FilterMap//instructions: +5.0G (+10.94%)
🟥 build/module/Init.Data.String.Lemmas.Pattern.String.ForwardSearcher//instructions: +7.5G (+38.64%)
🟥 build/module/Std.Data.Iterators.Lemmas.Combinators.Monadic.FilterMap//instructions: +5.2G (+95.78%)
🟥 build/module/Std.Tactic.BVDecide.Bitblast.BVExpr.Circuit.Lemmas.Expr//instructions: +8.5G (+7.46%)
🟥 elab/big_struct_dep//instructions: +4.1G (+19.96%)
🟥 elab/big_struct_dep1//instructions: +872.4M (+13.71%)
🟥 elab/grind_bitvec2//instructions: +3.5G (+1.72%)
🟥 elab/omega_stress//instructions: +258.0M (+5.69%)

Medium changes (21🟥)

Too many entries to display here. View the full report on radar instead.

Small changes (1✅, 207🟥)

Too many entries to display here. View the full report on radar instead.

mathlib-lean-pr-testing · 2026-02-28T01:22:02Z

Mathlib CI status (docs):

💥 Mathlib branch lean-pr-testing-12233 build failed against this PR. (2026-02-28 01:22:00) View Log

…gned This PR adds `instantiateAllMVars`, a variant of `instantiateMVars` that fails fast when encountering an unassigned mvar under an active fvar substitution, instead of calling the expensive `elimMVarDeps` machinery. Use it at call sites where all mvars are known to be assigned. Reverts the drop-in replacement of `instantiateMVars` with `instantiateMVarsNoUpdate` and instead targets `instantiateMVarsAtPreDecls` as the primary call site. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

nomeata · 2026-02-28T09:55:04Z

!bench

nomeata · 2026-02-28T12:04:17Z

!radar

leanprover-radar · 2026-02-28T12:04:27Z

Benchmark results for 2c6ee17 against 87ec768 are in! @nomeata

🟥 build//instructions: +5.3G (+0.04%)

Small changes (1🟥)

🟥 build/module/Lean.MetavarContext//instructions: +76.4M (+0.70%)

…AllMVars Rewrite instantiateAllMVars as a two-pass algorithm: - Pass 1 resolves direct mvar assignments with write-back caching - Pass 2 resolves delayed assignments with fused fvar substitution Add instantiateAllMVarsSharing variant that uses a stack of (ptr, depth) caches with scope tracking to preserve sharing across visit_delayed boundaries. When lookup_fvar substitutes a fvar, it records which scope added it. Results are cached at the outermost valid scope level, so they persist when inner scopes are popped. This achieves O(n²) output sharing (matching standard instantiateMVars) vs O(n³) without sharing preservation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…riginal Make the two-pass instantiateAllMVars/instantiateAllMVarsSharing handle unassigned mvars (via abstract_mvar_fvars/elimMVarDeps) instead of failing. This makes them usable as a drop-in replacement at call sites where mvars may not all be assigned, though the unassigned-mvar behavior differs from the standard implementation (lambda-wrapping vs beta-reduction). Also expose instantiateMVarsOriginal for benchmarking independently of which implementation is the default. Note: NOT yet wired as default instantiateMVars — the behavioral difference for unassigned mvars causes test failures (e.g. meta7.lean). The redirect is reverted; these remain as separate API functions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mathlib-lean-pr-testing · 2026-03-02T12:21:01Z

Mathlib CI status (docs):

✅ Mathlib branch lean-pr-testing-12233 has successfully built against this PR. (2026-03-02 12:20:58) View Log

mathlib-lean-pr-testing · 2026-03-02T13:18:54Z

Mathlib CI status (docs):

✅ Mathlib branch lean-pr-testing-12233 has successfully built against this PR. (2026-03-02 13:18:52) View Log

perf: a non-updating single-subst instantiateMVar

c83f527

(TODO)

nomeata added the changelog-language Language features and metaprograms label Jan 29, 2026

nomeata added 4 commits January 29, 2026 15:28

Please CI

946773f

Docstring

3bbb693

File from wrong branch

b133b76

Empty lines

293e980

github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label Jan 29, 2026

ghost pushed a commit to leanprover-community/batteries that referenced this pull request Jan 29, 2026

Update lean-toolchain for testing leanprover/lean4#12233

69e3ffc

github-actions bot added the mathlib4-nightly-available A branch for this PR exists at leanprover-community/mathlib4-nightly-testing:lean-pr-testing-NNNN label Jan 29, 2026

ghost pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Jan 29, 2026

Update lean-toolchain for testing leanprover/lean4#12233

411f145

ghost added the builds-mathlib CI has verified that Mathlib builds against this PR label Jan 29, 2026

nomeata and others added 2 commits February 26, 2026 23:23

nomeata force-pushed the joachim/instantiateMVarsNoUpdate branch from 267f6b4 to 860b8fd Compare February 27, 2026 10:50

Undo changes to Injective

0cc92df

Merge branch 'nightly-with-mathlib' of https://github.com/leanprover/…

ae9230f

…lean4 into joachim/instantiateMVarsNoUpdate

nomeata force-pushed the joachim/instantiateMVarsNoUpdate branch from ee068d5 to ae9230f Compare February 27, 2026 23:17

nomeata changed the title ~~perf: a non-updating single-subst instantiateMVar~~ perf: fuse fvar substitution into instantiateMVars Feb 27, 2026

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Feb 28, 2026

Update lean-toolchain for leanprover/lean4#12233

70fe269

mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Feb 28, 2026

Update lean-toolchain for leanprover/lean4#12233

1bed893

mathlib-lean-pr-testing bot removed the builds-mathlib CI has verified that Mathlib builds against this PR label Feb 28, 2026

mathlib-lean-pr-testing bot added the breaks-mathlib This is not necessarily a blocker for merging: but there needs to be a plan label Feb 28, 2026

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Mar 2, 2026

Update lean-toolchain for leanprover/lean4#12233

b785b9f

mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Mar 2, 2026

Update lean-toolchain for leanprover/lean4#12233

9229319

mathlib-lean-pr-testing bot added builds-mathlib CI has verified that Mathlib builds against this PR and removed breaks-mathlib This is not necessarily a blocker for merging: but there needs to be a plan labels Mar 2, 2026

mathlib-nightly-testing bot pushed a commit to leanprover-community/batteries that referenced this pull request Mar 2, 2026

Update lean-toolchain for leanprover/lean4#12233

b39fb62

mathlib-nightly-testing bot pushed a commit to leanprover-community/mathlib4-nightly-testing that referenced this pull request Mar 2, 2026

Update lean-toolchain for leanprover/lean4#12233

a8283ec

Conversation

nomeata commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Two-pass architecture

Sharing preservation (instantiateAllMVarsSharing)

Benchmark results (pathological nested-delayed-mvar case)

Files

Uh oh!

nomeata commented Jan 29, 2026

Uh oh!

leanprover-radar commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

leanprover-bot commented Jan 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost commented Jan 29, 2026

Uh oh!

nomeata commented Feb 27, 2026

Uh oh!

leanprover-radar commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nomeata commented Feb 27, 2026

Uh oh!

leanprover-radar commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nomeata commented Feb 27, 2026

Uh oh!

leanprover-radar commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mathlib-lean-pr-testing bot commented Feb 28, 2026

Uh oh!

nomeata commented Feb 28, 2026

Uh oh!

nomeata commented Feb 28, 2026

Uh oh!

leanprover-radar commented Feb 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mathlib-lean-pr-testing bot commented Mar 2, 2026

Uh oh!

mathlib-lean-pr-testing bot commented Mar 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nomeata commented Jan 29, 2026 •

edited

Loading

Sharing preservation (`instantiateAllMVarsSharing`)

leanprover-radar commented Jan 29, 2026 •

edited

Loading

leanprover-bot commented Jan 29, 2026 •

edited

Loading

leanprover-radar commented Feb 27, 2026 •

edited

Loading

leanprover-radar commented Feb 27, 2026 •

edited

Loading

leanprover-radar commented Feb 27, 2026 •

edited

Loading

leanprover-radar commented Feb 28, 2026 •

edited

Loading