feat(hooks): no-count-drift — count-vs-enumeration self-consistency gate (MAST FM-3.2)#27
…cy gate

A deterministic Stop/SubagentStop hook that blocks a count stated in a message when it contradicts the message's own enumeration or arithmetic (e.g. "six findings:" + a 5-item list; "9/10 = 80%"). Self-consistency / MAST FM-3.2 axis, orthogonal to no-fake-stats (citation presence).

Counting lives in pure-stdlib Python (lib/count_drift.py) because counting is a symbolic strength and an LLM weakness. Three detectors (fraction/percent recompute, "N of M" bound, headline-vs-single-enumeration), each abstaining on ambiguous scope. High-precision blocking gate; fail-open without jq/python3.

Verified: precision 1.000 / 0 false positives on 15 adversarial negatives; harness 9/9. F1=1.0 on the hand-authored corpus is a co-evolved-corpus number (caveated in RESULTS.md), not a field-generalization claim.

Proposed by @beq00000 (Brendan Quinn) on recognition-without-arrest-corpus#9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ndent MAD eval
Testing count_drift over corpora it was NOT authored against — 660 real LLM
responses in evaluation/raw_results.jsonl + 328 stress fixtures for the other
hooks — surfaced 17 false positives the hand-authored (co-evolved) fixtures
could not see:
1. R3 lead-in was too loose: "...favor one side. Instead:" / "one of four
quadrants:" matched because a number+noun merely co-occurred with a
sentence-colon on the line. Fix: colon must be adjacent to the noun phrase
(no intervening punctuation/number), count >= 2, lists only (a 2x2 table
has 4 cells but 2 rows).
2. Number words lacked a leading \b, so "of-ten"/"writ-ten" parsed as "ten".
Fix: \b before the number in the lead-in.
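
The two fixes can be illustrated with a minimal sketch. This is not the code in lib/count_drift.py: the names NUMBER_WORDS, LEAD_IN, BULLET and headline_mismatch are hypothetical, and the four-word cap on the noun phrase is an assumption made only to keep the pattern short.

```python
import re

# Hypothetical number-word table; the real detector may cover more words.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
_NUM = r"(?:\d+|" + "|".join(NUMBER_WORDS) + r")"

# Leading \b: "often" / "written" can no longer be read as "ten".
# Noun phrase: 1-4 plain words, none of them a number, then the colon.
# Any punctuation or second number between count and colon breaks the match.
LEAD_IN = re.compile(
    r"\b(" + _NUM + r")\b"
    r"((?:[ \t]+(?!" + _NUM + r"\b)[A-Za-z][A-Za-z-]*){1,4})"
    r"[ \t]*:[ \t]*$",
    re.IGNORECASE,
)
BULLET = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+\S")


def headline_mismatch(text):
    """Return (claimed, listed) for the first lead-in whose list length
    differs from the stated count, or None when everything is consistent
    or the scope is ambiguous."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        m = LEAD_IN.search(line)
        if not m:
            continue
        token = m.group(1).lower()
        claimed = int(token) if token.isdigit() else NUMBER_WORDS[token]
        if claimed < 2:
            continue  # counts below 2 are too often idiomatic ("one side")
        listed = 0
        for nxt in lines[i + 1:]:
            if BULLET.match(nxt):
                listed += 1
            elif nxt.strip() or listed:
                break  # non-list content, or a blank line after the list
        if listed == 0:
            continue  # lists only: a table or prose below means abstain
        if listed != claimed:
            return claimed, listed
    return None


print(headline_mismatch("Six findings:\n- a\n- b\n- c\n- d\n- e"))  # (6, 5)
print(headline_mismatch("Often cited reasons:\n- a\n- b"))          # None
```

Without the leading \b, the second call would read "ten cited reasons:" out of "Often cited reasons:" and report a 10-vs-2 mismatch, which is the second false-positive class described above.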
Independent false-positive rate now 0 / 988 texts. The two FP classes are locked
in as regression negatives in fixtures.jsonl. Adds evaluation/v6/independent_eval.py
(reproducible non-circular check) and folds its result into RESULTS.md, so the
load-bearing precision number is the independent one, not the hand-authored F1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up before merge: ran an independent, non-circular precision check. Tested
Both fixed in …
no-count-drift: count-vs-enumeration self-consistency gate
A deterministic Stop/SubagentStop hook that blocks a count stated in a message when it contradicts the message's own enumeration or arithmetic. Proposed by @beq00000 (Brendan Quinn) on recognition-without-arrest-corpus#9: "a final-pass diff between every count-claim in prose and its enumeration or table source," a verification gate that lives outside the writing agent's recall.

Why it's distinct from no-fake-stats
Orthogonal axes (the factuality-vs-faithfulness split, HalluLens ACL 2025):
- no-fake-stats = factuality: a precise number lacks a citation. Ignores small integers by design.
- no-count-drift = self-consistency / faithfulness (MAST FM-3.2 "no or incomplete verification"): a stated count contradicts the artifact's own content. A citation cannot repair an internal mismatch, and the common case ("six findings:" then five bullets; "all 5 tests pass" then four listed) uses the small integers no-fake-stats skips.

Landscape check (deepresearch): no existing hook in llm-dark-patterns/agent-closeout-bench/cc-safe-setup does in-message count-vs-enumeration. This fills a real gap, not a duplicate.

Design: deterministic, abstain-on-ambiguity
Counting lives in pure-stdlib Python (lib/count_drift.py, no deps) because counting is a rule-based-symbolic strength and an LLM weakness whose errors are self-consistent on resample ("Sequential Enumeration in LLMs"; "Too Consistent to Detect"). Three detectors, each abstaining when scope is ambiguous:
- Fraction/percent recompute: 9/10 = 80% → blocked (it's 90%); 2/35 = 5.7% passes (correct rounding).
- "N of M" bound: 5 of 3 → blocked.
- Headline vs. single enumeration: "six findings:" followed by a five-item list → blocked.

It is a blocking gate, so it fires only on unambiguous self-contained mismatches and otherwise passes (fail-open without jq/python3).
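
The fraction/percent recompute and the "N of M" bound are simple enough to sketch in a few lines. The following is an illustration under stated assumptions, not the contents of lib/count_drift.py: the function names and regexes are hypothetical, and it assumes a stated percent is correct when it equals the true value rounded half-up to the number of decimals the author wrote.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical patterns; the real detector's scope rules are stricter.
FRACTION_PCT = re.compile(r"(\d+)\s*/\s*(\d+)\s*=\s*(\d+(?:\.\d+)?)\s*%")
N_OF_M = re.compile(r"\b(\d+)\s+of\s+(\d+)\b")


def percent_mismatches(text):
    """Return (claim, recomputed) pairs where 'a/b = p%' does not add up."""
    out = []
    for m in FRACTION_PCT.finditer(text):
        num, den = int(m.group(1)), int(m.group(2))
        if den == 0:
            continue  # abstain: the ratio is undefined
        stated = Decimal(m.group(3))
        # Round the true value to the same number of decimals as the claim.
        places = -stated.as_tuple().exponent
        quantum = Decimal(1).scaleb(-places)
        actual = (Decimal(num) * 100 / Decimal(den)).quantize(
            quantum, rounding=ROUND_HALF_UP
        )
        if actual != stated:
            out.append((m.group(0), str(actual)))
    return out


def bound_violations(text):
    """Return every 'N of M' phrase in which N exceeds M."""
    return [
        m.group(0)
        for m in N_OF_M.finditer(text)
        if int(m.group(1)) > int(m.group(2))
    ]


print(percent_mismatches("9/10 = 80%"))    # [('9/10 = 80%', '90')]
print(percent_mismatches("2/35 = 5.7%"))   # []
print(bound_violations("5 of 3 tests"))    # ['5 of 3']
```

Decimal arithmetic is used here so that the rounding comparison is exact; a float-based check could misjudge values that sit on a rounding boundary.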
Verification
- tests/test-count-drift.sh → 9/9 PASS (block/pass/abstain, fail-open, re-entrancy, determinism).
- hooks/hooks.json valid; hook fires end-to-end (exit 2) via CLAUDE_PLUGIN_ROOT.

Honesty caveat (in evaluation/v6/RESULTS.md)
F1 = 1.000 here is a co-evolved-corpus number (the same author wrote the detector and the fixtures), not a field-generalization claim; citing it as one would overstate the detector's performance. The load-bearing, generalizable metric is precision / zero-false-positives on the adversarial negatives. Per the statcheck precedent (deterministic internal-consistency check: ~96–100% specificity, ~61% recall in the wild), real-world recall will be far below 1.0, bounded by structural-extraction coverage. That trade is intentional: abstain rather than false-fire.
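
To make the distinction between these metrics concrete, here is a small scoring sketch. It is not evaluation/v6/score_count_drift.py: the score function and the (should_block, did_block) pair format are assumptions, and the sample data is invented purely to show how an abstaining detector can have perfect precision and specificity while recall stays well below 1.0.

```python
def score(pairs):
    """pairs: iterable of (should_block, did_block) booleans, one per text."""
    tp = fp = fn = tn = 0
    for should_block, did_block in pairs:
        if did_block and should_block:
            tp += 1
        elif did_block:
            fp += 1
        elif should_block:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # None when the metric is undefined (empty denominator).
        return num / den if den else None

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "specificity": ratio(tn, tn + fp),
        "false_positives": fp,
    }


# Invented illustration, not a measured result: three true mismatches
# caught, two missed by abstention, five clean texts passed.
sample = [(True, True)] * 3 + [(True, False)] * 2 + [(False, False)] * 5
print(score(sample))
```

On this invented sample the detector never false-fires, so precision and specificity are both 1.0, while the two abstentions pull recall down to 0.6 and F1 to 0.75.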
Files
lib/count_drift.py, hooks/no-count-drift.sh, evaluation/v6/{SPEC.md,RESULTS.md,fixtures.jsonl,score_count_drift.py}, tests/test-count-drift.sh; wired into hooks/hooks.json (Stop + SubagentStop); README.md MAST table + catalog updated.

🤖 Generated with Claude Code