A canonical synthesis of the constellation work on MAST mode 3.3 ("No or Incorrect Verification") and adjacent agent closeout failure modes — composing runtime gates (
verify-before-stop), text-vocabulary gates (no-vibes), and static-AST gates (no-unreachable-symbol) as defense-in-depth.
Co-authors: Ian Mu (verify-before-stop) · Fernando Lazzarin (llm-dark-patterns, agent-closeout-bench)
License: Apache-2.0 — matches no-vibes and agent-closeout-bench. Vendor-neutral host for upstream MAST-team referencing.
Status: 🚧 Drafting. Chapters 1, 3, 5 first-draft by @ianymu. Chapters 2, 4, 6 stubs for @waitdeadai to fill.
- The fragmented-conversation diagnosis
- MAST 2.6 / 3.1 / 3.2 / 3.3 quick-ref
← TODO Fernando - The Three-Gate Pareto
- Quantitative results
← TODO Fernando - When-to-compose decision tree
- Open problems
← TODO Fernando
The work on MAST mode 3.3 — agent claims completion without performing verification, or fabricates a verification narrative retroactively — is real, well-instrumented, and replicable. The problem is that it lives across six unconnected surfaces, and a new operator hitting the failure mode in production has no canonical entry point.
The six surfaces, as they exist today:
- yurukusa's 10-patterns gist + 130-case handbook — operator-side empirical taxonomy: which closeout phrases co-occur with which downstream failures
- @beq00000's clean-state nav memo gist + 8 authored claude-code issues — heterogeneous failure inventory with per-issue minimal repros
- @suwayama's #60226 anchor — the "recognition-without-arrest" framework name itself, with concrete examples of fabricated comparison tables
- Cemri et al. NeurIPS 2025 MAST paper + Fernando's empirical baseline at
evaluation/MAST-RESULTS.md— the F1 0.815 / Fleiss κ=1.000 measurement on mode 3.3 specifically - Ian's runtime gate:
verify-before-stop— operator-side state machine: filenames-touched ×VERIFIED-log-entry presence as ground truth - Operator-side discussion threads: #45502, #46957, #60451 — running discourse, no synthesis
These six surfaces all describe the same failure mode and propose composable countermeasures, but no document points at all of them at once. A developer who hits MAST 3.3 in production today rediscovers the constellation through pain — they hit the failure, search the issue tracker, find one of the six surfaces, and only after weeks of accumulated context realize the other five exist and connect.
This repo is the synthesis. Not a new framework. Not a new product. A vendor-neutral canonical artifact the upstream MAST team can cite, that both repos can point AT (not OWN), and that new operators land on as the entry point.
(Stub — Fernando to fill. See evaluation/MAST-RESULTS.md for current per-mode kappa table.)
Three independent gates exist today, each catching a different MAST-3.3-adjacent failure shape on a different signal channel. They are not redundant — they triangulate. The composition argument is the load-bearing claim of this repo.
| Gate | Canonical impl | Signal channel | Where it fires |
|---|---|---|---|
verify-before-stop |
ianymu/claude-verify-before-stop |
Operator state — git diff × VERIFIED log entries |
Stop event |
no-vibes |
waitdeadai/llm-dark-patterns/no-vibes |
Closeout text vocabulary — vibe-coding phrase regex with evidence-binary allowlist | Stop event |
no-unreachable-symbol |
waitdeadai/llm-dark-patterns |
Static AST — touched-file symbol-reachability scan | PostToolUse(Edit) |
verify-before-stop catches the case where the agent modifies files but never logs verification — regardless of what the closing message says. The closeout text can be perfectly written ("I implemented the function and confirmed the tests pass"); the gate doesn't care, because the gate reads operator-side ground truth. This catches the quiet competent fabrication — the model that learned that confident terse closeouts evade text gates.
no-vibes catches the case where the closeout uses vibe vocabulary — "looks good", "should work", "I believe", "in theory" — without an evidence binary (a test result, a build log, a literal artifact reference). This catches the fabricated reasoning chain — the model that fills the verification slot with prose that sounds verifying but contains no verifying primitives.
no-unreachable-symbol catches the case where the agent writes code that has no callers — a function declared, exported, but unreferenced. This catches implementation theatre — code written to satisfy the appearance of work, not the spec.
verify-before-stop misses cases where the operator legitimately verified out-of-band (mental verification, external test runner) and forgot to write the VERIFIED log entry. False-positive rate measured at ~4% across ~200 sessions. The gate's strict-contract design intentionally errs toward blocking — recovery is one bash line (echo VERIFIED ... >> .claude/state/stop-verify.log), failure cost of not blocking is high.
no-vibes misses cases where the model doesn't use vibe vocabulary — the terse confident fabrication ("Implementation complete. All tests pass.") that uses no flagged phrases. Per Fernando's empirical baseline: F1 0.815 (95% CI [0.615, 0.941]) on the human-labelled n=19 subset means roughly 18% of MAST 3.3 instances escape the text gate alone.
no-unreachable-symbol misses cases where the agent writes code that IS called but does the wrong thing — a stub that returns hardcoded values, a wrapper that ignores its arguments. AST reachability is a necessary-not-sufficient signal.
The three gates triangulate on signal source, not on MAST mode. Two of them target the same MAST 3.3 surface (verify-before-stop and no-vibes), and a third targets adjacent failure shapes (no-unreachable-symbol against implementation-theatre). They compose because each gate can fail open without the others failing open:
- A session where the model writes good code, logs
VERIFIED, uses vibe vocabulary in the closeout, and leaves no unreachable symbols → all three gates pass (correct: the work was done well). - A session where the model writes good code, logs
VERIFIED, uses no vibe vocabulary, but leaves a dead helper function →no-unreachable-symbolfires (correct: caught the leftover stub). - A session where the model writes good code, forgets to log
VERIFIED, uses no vibe vocabulary, no dead code →verify-before-stopfires (correct: operator-side ground truth absent). - A session where the model writes nothing, logs nothing, but writes a confident fabricated closeout with vibe vocabulary →
no-vibesfires andverify-before-stopfires (correct: both signals agree).
The disagreement cases — where one gate fires and another passes — are the interesting empirical surface. Those are where the parity test in the synthetic-3.1 corpus PR (waitdeadai/agent-closeout-bench#12) produces the per-fixture disagreement table that exposes which evidence stream catches what the other misses.
(Stub — Fernando to fill. Current numbers live at waitdeadai/llm-dark-patterns/evaluation/MAST-RESULTS.md: F1 0.815 [0.615, 0.941] on n=19, Fleiss κ=1.000 on mode 3.3. Parity baseline from PR #12: verify-before-stop F1=0.77, no-vibes F1=0.89, Cohen κ=0.49 on 20 synthetic 3.1 fixtures.)
The synthesis is only useful if a developer hitting one of these failure shapes can land here, identify which gate(s) apply, and wire them in 10 minutes. This section is the diagnostic walkthrough.
Start with the failure you observed. Match it to the row below.
| Observed symptom | Primary gate | Add if also seeing |
|---|---|---|
| Agent says "all tests pass" / "tests added" / "implementation complete" → you check, no tests ran | verify-before-stop |
no-vibes for the linguistic surface |
| Agent's closeout reads confident but contains "looks good", "should work", "I believe" | no-vibes |
verify-before-stop for operator-side ground truth |
| Agent leaves declared functions with no callers, exports that nothing imports | no-unreachable-symbol |
Run on PostToolUse(Edit); blocks dead-code accrual |
| Agent claims to have "verified" something but you can't find the verification artifact | verify-before-stop |
no-vibes if the claim used vibe vocabulary |
| Multi-file refactor where some files are dirty post-claim | verify-before-stop (operator state) + no-unreachable-symbol (AST reachability) |
— |
| Closeout uses wrap-up vocabulary ("to summarize", "in conclusion", "hope this helps") while files are dirty | verify-before-stop + no-vibes |
This is MAST mode 3.1 territory — see PR #12 synthetic corpus for fixtures |
For users wiring multiple gates: cheapest-first is correct.
PreToolUse(Bash) → cheap regex on the command itself (catches `rm -rf /` etc)
PostToolUse(Edit) → no-unreachable-symbol (AST scan on touched files only)
Stop → verify-before-stop (operator-side state machine)
Stop → no-vibes (text vocabulary regex with evidence-binary allowlist)
The two Stop gates can be wired in either order — Claude Code runs them sequentially. If verify-before-stop fires first (exit 2), the session ends there; if it passes, no-vibes runs against the closeout text.
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write",
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/hooks/no-unreachable-symbol.sh"
}
]
}
],
"Stop": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": "bash ~/.claude/hooks/verify-before-stop.sh"
},
{
"type": "command",
"command": "bash ~/.claude/hooks/no-vibes.sh"
}
]
}
]
}
}Install all three:
# verify-before-stop (Ian)
curl -fsSL https://raw.githubusercontent.com/ianymu/claude-verify-before-stop/main/install.sh | bash
# no-vibes (Fernando)
curl -fsSL https://raw.githubusercontent.com/waitdeadai/no-vibes/main/install.sh | bash
# no-unreachable-symbol (Fernando)
curl -fsSL https://raw.githubusercontent.com/waitdeadai/llm-dark-patterns/main/install/no-unreachable-symbol.sh | bashThe gates fail closed (return exit 2). If a gate misfires on a legitimately-verified session:
verify-before-stop→echo "VERIFIED: <files-list> <timestamp>" >> .claude/state/stop-verify.logand re-run the agentno-vibes→ either reword the closeout (drop the vibe vocabulary) or setLDP_NO_VIBES_OFF=1for known-safe contexts (don't habituate)no-unreachable-symbol→ either delete the unreachable symbol or setLDP_UNREACHABLE_SYMBOL_BLOCK=0for advisory-only mode
(Stub — Fernando to fill. Synthetic 3.1 corpus partial fix landing in agent-closeout-bench#12; 2.6 measurement gap and agent-side ground truth still open.)
Two co-maintainers: @ianymu and @waitdeadai. PRs welcome; issues welcome. The aim is canonical synthesis — additions should either fill a stub section, add an empirical data point, or correct an error. Marketing-shaped contributions will be politely declined.
For substantive disagreements: open an issue with the empirical case (closeout text + operator state + which gates fired). Empirics > opinions.
- This repo's structure proposal: waitdeadai's reply on anthropics/claude-code#46957 (Fernando), 2026-05-21
- Repo name proposal: same comment, framework-anchor matching @suwayama #60226
- Chapters 1 / 3 / 5: first-draft by @ianymu, 2026-05-21
- Chapters 2 / 4 / 6: stubs for @waitdeadai to fill