Skip to content

design: no-unreachable-symbol — Stop-boundary reachability sibling to no-vibes (prompted by @ianymu sketch on anthropics/claude-code#60451) #23

@waitdeadai

Description

@waitdeadai

Motivation

rkpandey's anthropics/claude-code#60451 ("Claude claims implementation is complete while leaving dead code with no callers") describes a failure mode this suite does not yet cover directly: the model writes a new public function, claims completion in the closing message, but the function has zero callers anywhere in the codebase. The integration step that would make the function reachable is silently skipped.

@beq00000's clean-state evidence on #60226"seven instances of recognition-without-arrest in a non-drifted session, all caught externally; the pattern is the default mode, not the drift mode" — frames this as Stage 3 (non-gating) at the code boundary, not just at the text boundary. no-vibes (this suite's flagship) gates Stage 3 at the closing-message text level; it may catch this failure mode incidentally if the closeout uses positive verbs without evidence, but it is not the right tool — text vocabulary parsing cannot detect code reachability.

Attribution

The detection mechanism was sketched by @ianymu in anthropics/claude-code#60451#issuecomment-4495901564 as an extension to his verify-before-stop (MIT). Quoted bash sketch:

# After files changed, find new public functions
NEW_FNS=$(git diff --unified=0 | grep -E '^\+(def |export function |public )' | ...)
for fn in $NEW_FNS; do
  if ! grep -r "$fn" --include='*.py' --include='*.ts' --exclude='*test*' . | grep -v "$fn *("; then
    echo "⛔ New function $fn has zero callers — claim of 'complete' is unverified" >&2
    exit 2
  fi
done

This issue tracks designing a hardened sibling hook (no-unreachable-symbol) for this suite that absorbs the idea with explicit credit, applies the scientific framework documented in docs/methodology/fixture-driven-iteration.md, and addresses the false-positive surface a naive port would leave open.

Failure mode (precise definition)

Stop event boundary. Files changed in this turn introduce one or more new publicly-accessible symbols in any of: Python def , TypeScript/JavaScript export function / export class, Rust pub fn / pub struct, Go capital-letter exports, Ruby def , Java public . For each newly-introduced symbol, the codebase (excluding test files) contains zero invocations or references that resolve to that symbol.

The model's closing message may or may not claim completion. The hook fires on the codebase state regardless of closeout text — this is the independence-of-signal property that distinguishes it from no-vibes.

False-positive surface (must be addressed before strict mode)

A naive grep-for-callers approach fires false positives on:

  1. Entry-point functions called by frameworks: web routes (@app.route("/foo"), FastAPI handlers, Django views), CLI commands (@click.command, argparse subcommands, typer apps), event handlers, signal handlers, callback registrations (atexit.register(...), addEventListener), React/Vue/Svelte component exports, Next.js page exports. These have zero in-repo callers by design — the framework invokes them.

  2. Plugin/registry patterns: functions added to a module-level registry via HANDLERS["foo"] = my_foo_handler or register("foo", my_handler). Grep finds the function definition but not a my_handler( invocation site.

  3. Dynamic dispatch: getattr(module, name), globals()[name], JavaScript bracket access obj["method"], Python's __getattr__. The invocation is hidden behind a string lookup.

  4. Decorator-based dispatch: any @some_decorator that wires the function into a framework — the function never appears at a literal call site, only its decorated form is invoked by the framework.

  5. Public library API: a function intended to be called by external library consumers. No in-repo callers exist by design.

  6. Subclass overrides: a new method on a subclass invoked via base-class polymorphism. Grep may not find a direct cls.method( call if dispatch is instance.method() on a polymorphic variable.

  7. Test scaffolding / migration code: a function written in preparation for a follow-up PR's wiring. Strict reading is a false positive; intent reading is partially true.

  8. Async/handler delegation patterns: callbacks passed as values rather than invoked at a call site (run_in_executor(my_fn), setTimeout(my_fn, 0)).

The honest read: the false-positive surface is broader than the true-positive surface in unbounded grep mode. Strict default would be unusable. Hook must default to advisory mode (advisory stderr + exit 0) with strict mode (exit 2) opt-in only.

Design choices

  • Advisory by default: stderr warning, exit 0. Opt-in to strict mode via LDP_UNREACHABLE_SYMBOL_BLOCK=1 (matches no-vibes / sql-bulk-delete-warn env-var convention).
  • Per-language frontmatter: separate symbol-extraction regex and exclusion patterns per *.py, *.ts / *.tsx / *.js, *.rs, *.go. Hook config picks language by file extension of the new symbol's defining file.
  • Built-in decorator exclusion list: @app.route, @router.get, @router.post, @click.command, @pytest.fixture, @app.get, @app.post, @cli.command, @router.*, @FastAPI(...), plus framework-specific patterns. Configurable via .no-unreachable-symbol.toml or similar.
  • Built-in path exclusion globs: **/routes/**, **/handlers/**, **/cli/**, **/views/**, **/pages/**, **/api/** (configurable).
  • Registry-pattern detection: scan for register\(["'<symbol>["'] and HANDLERS\[["'<symbol>["'] patterns; treat as evidence the symbol is wired.
  • Subclass-override heuristic: if the symbol name matches a method in any base class in the same module, treat as polymorphic and skip the gate.
  • Public-API marker: respect __all__ in Python modules and export annotations in tsconfig; if the symbol is in the public API surface, downgrade to "no in-repo caller; library API assumed" message rather than block.

Composition with existing tooling

Three independent signals at the Stop boundary:

Hook Signal source Operator effort Failure shape it catches
no-vibes (this suite) Closing-message text vocabulary passive positive closeout verb + no proximate evidence in text
claude-verify-before-stop (@ianymu) External VERIFIED log file active write model fabricates verification narrative without log entry
no-unreachable-symbol (proposed) Git diff + codebase grep passive new public symbol with zero callers

Each catches a different sub-failure of Stage 3 (non-gating). Stack composes without overlap.

Empirical baseline (honest limit)

The MAST/MAD dataset is multi-agent text trajectories. It has no analogue for "git diff + codebase reachability" ground truth. This hook ships without an F1 baseline.

What it ships with instead, per docs/methodology/fixture-driven-iteration.md:

  • Per-language positive fixtures: new symbol with zero callers (hook should fire)
  • Per-language negative fixtures: properly wired (hook should not fire)
  • Per-language edge fixtures: decorator-wired, registry-wired, dynamic dispatch, subclass override, public API marker, framework callback (hook should not fire on any of these)

The fixture suite is the contract the regex is responsible to. Strict mode opt-in is gated on the fixture suite passing across all configured languages.

A future direction: build an empirical baseline by mining anthropics/claude-code issues + PR bodies for "dead code with no callers" reports and labelling diffs. Out of scope for this issue.

Implementation plan (sliced)

  • Slice 0 — Python only, advisory mode, built-in decorator exclusion, __all__ respect. Fixture suite: 8-12 positives, 8-12 negatives, 6-8 edges. Ship when all pass.
  • Slice 1 — TypeScript / JavaScript. Same shape; framework patterns for Express, FastAPI-equivalents (Next.js handlers), React components.
  • Slice 2 — Rust + Go. pub fn / capital exports. Trait dispatch and impl blocks add complexity.
  • Slice 3 — Strict mode opt-in. Configurable .no-unreachable-symbol.toml. Document the operator-side cost of strict mode.

Acceptance criteria for the first slice

  • Hook file at hooks/no-unreachable-symbol.sh per repo convention
  • Test fixtures at tests/stress/no-unreachable-symbol/{positive,negative,edge}/
  • All fixtures pass under bash tests/stress/run.sh --hook no-unreachable-symbol
  • README entry under "What's shipped" with honest scope (advisory; per-language coverage; false-positive limitations documented)
  • Attribution to @ianymu's sketch and verify-before-stop repo in the hook file's comment header
  • Composition note linking no-vibes and verify-before-stop as siblings in the same Stage 3 family

Open questions

  • Should the hook also consider new public classes with no instantiation sites, not just functions? Same false-positive surface (data classes, framework-instantiated handlers); same advisory framing applies.
  • Cross-language project handling: how does the hook behave when the new symbol is in Python but a TypeScript file might call it via cross-language IPC? Treat as out-of-scope for v1.
  • Should we attempt an AST-level reachability pass (ast.parse in Python, ts-morph or similar) for higher precision? Tradeoff: precision vs runtime cost vs language coverage. v1 stays grep-based; AST-level is a future slice.

cc: @ianymu for visibility on the design absorption with credit; happy to coordinate if the sketch direction has more nuance than what we captured here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions