Skip to content

Add DAG-based kernel typechecker#300

Open
johnchandlerburnham wants to merge 15 commits intomainfrom
jcb/rust-kernel
Open

Add DAG-based kernel typechecker#300
johnchandlerburnham wants to merge 15 commits intomainfrom
jcb/rust-kernel

Conversation

@johnchandlerburnham
Copy link
Member

Implement a Lean 4 kernel typechecker using a DAG representation with BUBS (Bottom-Up Beta Substitution) for efficient reduction. The kernel operates on a mutable DAG rather than tree-based expressions, enabling in-place substitution and shared subterm reduction.

12 modules: doubly-linked list, DAG nodes with 10 pointer variants, BUBS upcopy with 12 parent cases, Expr/DAG conversion, universe level operations, WHNF via trail algorithm, definitional equality with lazy delta/proof irrelevance/eta, type inference, and checking for quotients and inductives.

Implement a Lean 4 kernel typechecker using a DAG representation with
BUBS (Bottom-Up Beta Substitution) for efficient reduction. The kernel
operates on a mutable DAG rather than tree-based expressions, enabling
in-place substitution and shared subterm reduction.

12 modules: doubly-linked list, DAG nodes with 10 pointer variants,
BUBS upcopy with 12 parent cases, Expr/DAG conversion, universe level
operations, WHNF via trail algorithm, definitional equality with lazy
delta/proof irrelevance/eta, type inference, and checking for
quotients and inductives.
  Replace the closure-based NbE (Normalization by Evaluation) kernel with
  a direct environment-based approach where types are Exprs throughout.

  - Remove Value/Neutral/ValEnv/SusValue semantic domain (Datatypes.lean)
  - Replace Eval.lean with Whnf.lean (WHNF via structural + delta reduction)
  - Replace Equal.lean with DefEq.lean (staged definitional equality with
    lazy delta reduction guided by ReducibilityHints)
  - Rewrite Infer.lean to operate on Expr types instead of Values
  - Simplify TypecheckM: remove NbE-specific state (evalCacheRef, equalCacheRef),
    add whnf/defEq/infer caches as pure state
  - Add proof irrelevance, eta expansion, structure eta, nat/string literal
    expansion to isDefEq
  - Flatten app spines and binder chains in infer/isDefEq to avoid deep recursion
  Replace HashMap-based eqvCache with union-find EquivManager (ported from
  lean4lean) for congruence-aware structural equality caching. Add inferOnly
  mode that skips argument/let type-checking during inference, used for
  theorem value checking to handle sub-term type mismatches.

  Additional isDefEq improvements:
  - isDefEqUnitLike for non-recursive single-ctor zero-field types
  - isDefEqOffset for Nat.succ chain short-circuiting
  - tryUnfoldProjApp in lazy delta for projection-headed stuck terms
  - cheapProj=true in stage 1 defers full projection reduction to stage 2
  - Failure cache on same-head optimization in lazyDeltaReduction
  - Fix ReducibilityHints.lt' to handle all cases correctly
Move checkStrictPositivity/checkCtorPositivity into the mutual block as
monadic checkPositivity/checkCtorFields/checkNestedCtorFields, enabling
whnf calls during positivity analysis. This matches lean4lean's
checkPositivity and correctly handles nested inductives (e.g. an
inductive appearing as a param of a previously-defined inductive).

Split KernelTests.lean into Helpers, Unit, and Soundness submodules.
Add targeted soundness tests for nested positivity: positive nesting via
Wrap, double nesting, multi-field, multi-param, contravariant rejection,
index-position rejection, non-inductive head, and unsafe outer.

Add Lean.Elab.Term.Do.Code.action as an integration test case requiring
whnf-based nested positivity.
- Rewrite lam/forallE/letE inference to iterate binder chains instead of
  recursing, preventing stack overflow on deeply nested terms
- Add inductBlock/inductNames to RecursorVal to track the major inductive
  separately from rec.all, which can be empty in recursor-only Ixon blocks
- Build InductiveBlockIndex to extract the correct major inductive from
  Ixon recursor types at conversion time
- Fix validateRecursorRules to look up ctors from the major inductive
  directly instead of iterating rec.all
- Fix isDefEq call in lazyDeltaReduction (was calling isDefEqCore)
- Add regression tests for UInt64 isDefEq, recursor-only blocks, and
  deeply nested let chains
Switch all kernel caches from Std.HashMap to Std.TreeMap, replacing
hash-based lookups with structural comparison (Expr.compare, Level.compare).
Expr.compare is fully iterative using an explicit worklist stack, and
Expr.beq/liftLooseBVars/instantiate/substLevels/hasLooseBVarsAbove now
loop over binder chains to avoid stack overflow on deeply nested terms.
Add pointer equality fast paths (ptrEq) for Level and Expr, and a
pointer-address comparator (ptrCompare) for the def-eq failure cache.
…terativize isDefEq

- Replace per-function whnfDepth with unified withRecDepthCheck (limit 2000)
  across isDefEq, infer, and whnf — simpler and more predictable stack
  overflow prevention
- Move cache lookups (inferCache, whnfCache, whnfCoreCache) before
  withFuelCheck/withRecDepthCheck so cache hits incur zero fuel or stack cost
- Iterativize isDefEq main loop: steps 1-5 now loop via continue instead of
  recursing back into isDefEq when whnfCore(cheapProj=false) changes terms
- Iterativize quickIsDefEq lam/forallE binder chains to avoid deep recursion
  on nested binders
- Add pointer equality fast path to Expr.beq; move Expr.ptrEq decl earlier
- Skip context check for closed expressions (const/sort/lit) in inferCache
- Add Expr.nodeCount, trace parameter to typecheckConst
- Add Std.Time.* dependency chain test constants for _sunfold regression
  - Track letValues/numLetBindings in TypecheckCtx so whnfCore can
    zeta-reduce let-bound bvars by looking up stored values
  - Thread let context through iterativized binder chains in infer
    (lam, forallE, letE) and isDefEqCore (lam/pi flattening, eta)
  - Add isProp that checks type_of(type_of(t)) == Sort 0 and rewrite
    isDefEqProofIrrel to use it with withInferOnly
  - Fix K-reduction to apply extra args after major premise
  - Add cheapBetaReduce for let body result types
  - Whnf nat primitive args when they aren't already literals
  - Skip whnf/whnfCore caches when let bindings are in scope
  - Increase maxRecursionDepth to 10000
match b' with
| .zero => .zero
| .succ _ => reduceMax (reduce a) b'
| _ => .imax (reduce a) b'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be | _ => reduceIMax (reduce a) b'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's a bug

- Move tryReduceNat inside mutual block so it can whnf arguments before
  reducing (matching lean4lean's reduceNat), replacing the separate
  whnf-args-and-retry loop in whnf
- Use isDefEqCore (not isDefEq) for nat literal expansion to avoid
  cycle where Nat.succ(lit n) gets reduced back to lit(n+1)
- Return nat reduction results directly in lazyDeltaReduction instead
  of looping back through whnfCore
- Switch def-eq cache keys from pointer-based to content-based
  comparison so cache hits work across pointer-distinct copies
- Consolidate imax reduction: reuse reduceIMax in reduce, instReduce,
  and instBulkReduce; add imax(0,b)=b and imax(1,b)=b rules
- Simplify K-reduction to only fire when major premise is a constructor
- Remove unused depth variable and debug traces
…f into mutual block

  Move whnf/whnfCore/unfoldDefinition from Whnf.lean into the Infer.lean
  mutual block so they can call infer/isDefEq (needed for toCtorWhenK,
  isProp in struct-eta, and checkRecursorRuleType). Add new Primitive.lean
  module that validates Bool/Nat inductives and primitive definitions
  (add/sub/mul/pow/beq/ble/shiftLeft/shiftRight/land/lor/xor/pred/charMk/
  stringOfList) against their expected types and reduction rules. Validate
  Eq and Quot type signatures at quotient init time.

  Key changes:
  - checkRecursorRuleType: builds expected type from recursor + ctor types,
    handles nested inductives (cnp > np) with level/bvar substitution
  - checkElimLevel: validates large elimination for Prop inductives
  - toCtorWhenK: infers major's type and constructs nullary ctor (was stub)
  - tryEtaStruct: now symmetric (tries both directions) with type check
  - isDefEq: add Bool.true proof-by-reflection, fix bvar quick check to
    return none (not false) on mismatch, add eta-struct fallback for apps
  - Safety/universe validation in infer .const, withSafety in checkConst
  - Constructor param domain matching and return type validation
  - Hardcode ~30 new primitive addresses in buildPrimitives
  - Add unit tests for toCtorIfLit/strLitToConstructor/isPrimOp/foldLiterals
  - Add soundness tests for mutual recursors, parametric/nested recursors
  - Previously failing RCasesPatt.rec_1 now passes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants