diff --git a/.gitignore b/.gitignore
index dd2f776..f97094a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -8,3 +8,5 @@
 # Stress runner output — regenerate locally with `bash tests/stress/run.sh`.
 # CI uploads it as a workflow artifact; no need to commit.
 tests/stress/STRESS-REPORT.md
+__pycache__/
+*.pyc
diff --git a/README.md b/README.md
index 857eaa3..be8fc53 100644
--- a/README.md
+++ b/README.md
@@ -94,6 +94,7 @@ The following 9 hooks conceptually target their MAST mode but did not produce me
 | `no-cliffhanger` | 1.5 Unaware of Termination Conditions, 3.1 Premature Termination | `zone: tail` (last 520 chars) is the trajectory tail, not a closeout sentence |
 | `no-aggregator-hallucination`, `no-fake-stats` | 2.6 Action-Reasoning Mismatch | Tuned for supervisor closeouts; synthesis claim buried in trajectory chatter |
 | `no-cherry-pick-rollup`, `no-silent-worker-success`, `no-sandbagging-disguise` | 3.1 / 3.2 Verification failures | Calibrated for supervisor reports, not multi-turn collaboration text |
+| `no-count-drift` | 3.2 No or Incomplete Verification (self-consistency) | Stated count vs the message's own enumeration/arithmetic; deterministic, abstain-on-ambiguity. Proposed by @beq00000 on `recognition-without-arrest-corpus#9` |
 
 The methodology gap is structural: hooks are tuned for individual Claude Code closeout messages; MAD's text is full multi-agent trajectory. Per-message scanning is the planned next experiment ([`MAST-RESULTS.md` §"Next steps"](evaluation/MAST-RESULTS.md)).
 
@@ -114,7 +115,7 @@ Also outside MAD's text-only scope but conceptually a Stage 3 (non-gating) failu
 The active catalog is organized in six branches by mechanism:
 
 - **Interaction-style** (8): catch *how* the model talks. `no-vibes`, `time-anchor`, `no-curfew`, `no-sycophancy`, `no-cliffhanger`, `no-wrap-up`, `no-tldr-bait`, `honest-eta`.
-- **Fact-fabrication** (5): catch *what* the model claims. `no-fake-recall`, `no-fake-stats`, `no-fake-cite`, `no-phantom-tool-call`, `no-rollback-claim-without-evidence`.
+- **Fact-fabrication** (6): catch *what* the model claims. `no-fake-recall`, `no-fake-stats`, `no-fake-cite`, `no-phantom-tool-call`, `no-rollback-claim-without-evidence`, `no-count-drift` (self-consistency: a stated count vs the message's own enumeration — orthogonal to `no-fake-stats`, which is citation-presence).
 - **Continuity** (1): counter context loss rather than block dishonest output. `no-amnesia`.
 - **Multi-agent orchestration** (5): catch supervisor / +N-parallel-instance failure modes. `no-aggregator-hallucination`, `no-silent-worker-success`, `no-cherry-pick-rollup`, `no-ownership-violation`, `no-handoff-loop`.
 - **Agentic safety** (3): catch credential leak, sandbagging disguise, approval-sneak surfaces. `no-credential-leak-in-handoff`, `no-sandbagging-disguise`, `no-approval-sneak`.
diff --git a/evaluation/v6/RESULTS.md b/evaluation/v6/RESULTS.md
new file mode 100644
index 0000000..7d5e5e4
--- /dev/null
+++ b/evaluation/v6/RESULTS.md
@@ -0,0 +1,33 @@
+# v6 count-drift — RESULTS
+
+Scorer: `evaluation/v6/score_count_drift.py` over `fixtures.jsonl` (28 fixtures: 9 positive / 19 adversarial negative).
+
+| metric | value |
+|---|---|
+| precision | 1.000 |
+| recall | 1.000 |
+| F1 | 1.000 |
+| F1 95% CI (bootstrap, n=1000, seed=42) | [1.000, 1.000] |
+| true positives | 9 |
+| **false positives** | **0** |
+| misses | 0 |
+
+SC1 (zero false positives on the adversarial negative set): PASS
+
+## Independent evaluation (non-circular)
+
+Detector run over corpora it was NOT authored against — real LLM `model_response`/`prompt_text` from `evaluation/raw_results.jsonl` and the stress fixtures authored for the *other* hooks. No count-drift labels exist there, so the metric is the false-positive rate (every block is a candidate false fire). Reproduce: `python3 evaluation/v6/independent_eval.py`.
+
+| corpus | texts | blocks |
+|---|---|---|
+| MAD raw_results | 660 | 0 |
+| stress fixtures (other hooks) | 328 | 0 |
+| **total** | **988** | **0** |
+
+False-positive rate on independent text: **0.0000**. This is the load-bearing, non-circular precision evidence — distinct from the hand-authored F1 above. (Two real false positives found during development — a too-loose lead-in and a missing word-boundary on number words — were fixed and locked in as regression negatives.)
+
+## Honesty caveat (read before citing F1)
+
+This corpus is **hand-authored** — the same author wrote the detector and the fixtures — so an F1 of 1.0 here is **not** a wild-generalization claim; it is a co-evolved-corpus number and would inflate if cited as field performance. What the number legitimately shows: the detector behaves to spec on the designed cases, **including the adversarial negatives authored to break it** (nested-colon lead-ins, section-index numbers, label words, approximation markers, ambiguous multi-list scope, nested-list depth). The load-bearing, generalizable metric is **precision / zero-false-positives on those adversarial negatives** — the property a blocking gate must hold.
+
+Recall is reported, not gated. Per the statcheck precedent (deterministic internal-consistency check: ~96-100% specificity but only ~61% recall in the wild), real-world recall here will be far below 1.0, bounded by structural extraction coverage. That trade is intentional: abstain rather than false-fire.
diff --git a/evaluation/v6/SPEC.md b/evaluation/v6/SPEC.md
new file mode 100644
index 0000000..5669b0f
--- /dev/null
+++ b/evaluation/v6/SPEC.md
@@ -0,0 +1,196 @@
+# SPEC — v6: `no-count-drift` Stop hook (count-vs-enumeration self-consistency gate)
+
+Status: ACTIVE (pre-implementation). Author: waitdeadai. Date: 2026-05-25.
+Origin: proposed by @beq00000 (Brendan Quinn) on `recognition-without-arrest-corpus#9` —
+"a final-pass diff between every count-claim in prose and its enumeration or table
+source," a verification gate that lives *outside* the writing agent's recall.
+
+## 1. Problem Statement
+
+LLM agents state a count in prose ("fourteen instances", "six instances", "5 of 7",
+"2/35 = 5.7%") that contradicts the artifact's own enumeration or arithmetic — a
+self-consistency (faithfulness) failure, not a missing-citation (factuality) failure.
+No existing hook in the suite catches it: `no-fake-stats` checks citation presence and
+deliberately ignores small integers.
+
+## 2. Success Criteria (measurable)
+
+- **SC1 (precision floor, hard gate):** On the v6 fixture set, the deterministic core
+  produces **zero false positives on the negative set** (precision = 1.000). A blocking
+  gate must not false-fire. The negative set MUST be **adversarial and authored
+  independently of the detector patterns** (correct counts that resemble the positives;
+  ambiguous/multi-enumeration scope; nested lists a naive counter would miscount; vague
+  cardinality) — precision claimed only against that adversarial set, to avoid the
+  co-evolved-corpus fake-precision trap. Verified by `tests/test-count-drift.sh` exiting 0.
+- **SC2 (seeded true positives caught):** Each present as a fixture, each yields
+  `decision=block`: (a) "fourteen" prose vs 15 enumerated bullets (R3); (b) "six
+  instances" headline vs five enumerated (R3); (c) a genuinely wrong fraction-to-percent,
+  e.g. "9/10 = 80%" (R1; 9/10 is 90%). NOTE: the #9 cross-section case ("2/35=5.7%" in §2
+  vs "2/42=4.8%" in §3) is the same-quantity-different-denominator linking case and is
+  explicitly OUT of scope (§3, advisory only); "2/35 = 5.7%" is itself arithmetically
+  CORRECT (5.71% rounds to 5.7%) and MUST pass — it belongs in the adversarial negatives.
+- **SC3 (abstention on ambiguity):** Inputs with 0 or ≥2 candidate enumerations in
+  scope, vague cardinality ("a few"), or prose-only counts yield `decision=pass`
+  (abstain). Verified by negative fixtures.
+- **SC4 (fail-open):** Missing `jq` OR missing `python3` → hook exits 0 (never breaks a
+  session). Verified by `tests/test-count-drift.sh` env-stubbed cases.
+- **SC5 (determinism):** Identical input → identical verdict across two runs, zero delta.
+  Verified by running the fixture scorer twice and diffing.
+- **SC6 (no regression):** `bash scripts/*hook-smoke*` (or the repo's hook smoke) still
+  passes with `no-count-drift` wired into `hooks/hooks.json`.
+- **SC7 (reported, not gated):** F1 + bootstrap 95% CI on the full fixture set, reported
+  in `evaluation/v6/RESULTS.md`. Recall is reported, NOT required high (bounded by
+  structural extraction, per the statcheck precedent: ~61% recall at 96–100% specificity).
+
+Non-criteria (explicitly): high recall is NOT a success criterion. LLM-judge accuracy is
+out of scope for v1.
+
+## 3. Scope
+
+**In scope:**
+- `hooks/no-count-drift.sh` — bash wrapper (reads `.last_assistant_message`, fail-open,
+  calls the Python core, exit 2 on block) matching the suite's hook conventions.
+- `lib/count_drift.py` — pure-stdlib Python core (no pip deps). Three TIER-1 deterministic
+  detectors with strict abstention.
+- `evaluation/v6/fixtures.jsonl` — hand-authored positives + negatives (incl. the SC2
+  seeds and SC3 abstention cases).
+- `evaluation/v6/score_count_drift.py` — scorer (precision/recall/F1 + bootstrap CI).
+- `tests/test-count-drift.sh` — bash harness (assert pattern from `test-pack-loader.sh`).
+- `hooks/hooks.json` — wire `no-count-drift.sh` into `Stop` and `SubagentStop`.
+- `evaluation/v6/RESULTS.md` + this `SPEC.md`.
+
+**Out of scope (this PR):**
+- LLM-judge advisory tier (optional follow-up `no-count-drift-warn.sh`, env-gated, never
+  exits 2 — mirrors existing `no-sycophancy-warn.sh`). Reason: keep v1 deterministic and
+  high-precision; the literature ("Too Consistent to Detect", 2025-05) shows LLM judges
+  miss self-consistent count errors.
+- Cross-section semantic same-quantity linking (prose number vs a table cell for the
+  "same" labeled metric) — the KPI-Check problem, ~73% F1; too fuzzy to block on.
+- A Rust YAML rule pack in `agent-closeout-bench` — the engine is regex-match-only and
+  cannot count items or recompute fractions, so the computation must live in Python. N/A.
+
+## 4. Detector design (TIER-1 deterministic, abstain-on-ambiguity)
+
+Input: the assistant message text. Output JSON: `{decision: block|pass, rule, evidence}`.
+
+- **R1 — fraction/percentage arithmetic self-check (safest, ~100% precision, no linking):**
+  Find `A/B = P%` / `A/B (P%)` patterns; recompute `A/B*100`; `block` if
+  `|computed − P| > tol` (tol = max(0.5pp, one-ulp of stated precision)). No enumeration
+  needed; pure arithmetic.
+- **R2 — "N of M" bound check:** parse `N of M <noun>` (word or digit); `block` if `N > M`
+  (impossible). If exactly one enumeration of the noun is in scope with a counted size,
+  also check N against it; else just the N>M bound.
+- **R3 — headline count vs single enumeration:** a count-claim (`<num> <noun>` as a
+  heading/lead-in, **count ≥ 2**, colon **adjacent to the noun phrase** with no
+  intervening punctuation/number) immediately followed, before the next heading, by
+  **exactly one markdown LIST** whose **top-level** item count ≠ the claimed number.
+  Tables are excluded (a 2×2 matrix has 4 cells but 2 rows — a poor count proxy). Number
+  words require a leading word boundary (so "of-**ten**" / "writ-**ten**" do not parse as
+  "ten"). **Abstain (pass)** on count < 2, 0 or ≥2 candidate lists, depth ambiguity,
+  label/section indices, second-number lead-ins, or vague cardinality. (R3's loose
+  lead-in and word-boundary gaps were caught by the independent MAD eval — §7 — not the
+  hand-authored fixtures.)
+
+Number parsing: built-in word→int lexicon (one..nineteen, tens, hundred, thousand,
+ordinals, "a dozen"=12); no `text2num`/pip dep. Markdown counting: count top-level list
+items (min-indent of the contiguous block) and table data rows (exclude header+separator)
+via stdlib line scanning. Every detector **abstains** rather than guesses.
+
+## 5. Agent-Native Estimate
+
+- Estimate type: agent-native wall-clock.
+- Execution topology: local (the precision-critical parser/abstention logic is one tightly
+  coupled reasoning loop; do not split). Fixtures are a small optional sidecar.
+- Capacity evidence: capacity is NOT the binding constraint — this is local single-loop
+  implementation, not parallelizable dense work; `parallel-capacity.sh` not consulted
+  because lanes don't reduce the critical path here.
+- Effective lanes: 1 (optionally 2 if fixtures are delegated).
+- Critical path: SPEC → /specqa → /introspect → `count_drift.py` core → fixtures →
+  scorer/tests → /verify.
+- Agent wall-clock: optimistic ~6 build/verify cycles, likely ~10, pessimistic ~16
+  (pessimistic if hitting SC1 zero-false-positive needs a precision-tuning iteration on R3).
+- Agent-hours: low (single-file core + fixtures + harness).
+- Human touch time: design direction (given), review of the diff, merge/PR decision.
+  No external credentials.
+- Calendar blockers: none — isolated worktree, feature branch, no deploy. (Pushing the
+  branch later needs the `workflow` scope only if hooks.json wiring counted as a workflow
+  file — it is NOT under `.github/workflows`, so no scope blocker.)
+- Confidence: medium — downgrade reason: R3 referent-linking abstention may need one
+  tuning pass to guarantee zero false positives (SC1) without collapsing R3 recall to zero.
+- Human-equivalent baseline (secondary only): ~half a day for a developer to write the
+  parser, fixtures, and tests carefully.
+
+## 6. Implementation Plan
+
+### Task 1: Python core `lib/count_drift.py`
+Definition of Done:
+- [ ] Reads message text from stdin or `--text`/`--file`; emits decision JSON.
+- [ ] R1 fraction/percentage recompute with tolerance.
+- [ ] R2 "N of M" bound + optional in-scope list check.
+- [ ] R3 headline-count vs single-enumeration with strict abstention.
+- [ ] Word→int lexicon; top-level markdown list + table-row counting (stdlib only).
+- [ ] Abstains (pass) on every ambiguous case enumerated in §4.
+
+### Task 2: Bash hook `hooks/no-count-drift.sh`
+Definition of Done:
+- [ ] Reads stdin JSON, extracts `.last_assistant_message` via jq.
+- [ ] Fail-open (exit 0) if jq or python3 absent, or input not JSON.
+- [ ] Calls `lib/count_drift.py`; on `decision=block` exit 2 with
+      `BLOCKED:` + `Matched rule:` + `Evidence:` + `Repair guidance:` (suite format).
+- [ ] `stop_hook_active=true` → exit 0 (no re-entrancy), matching siblings.
+
+### Task 3: Fixtures + scorer + tests
+Definition of Done:
+- [ ] `evaluation/v6/fixtures.jsonl` with the SC2 positives, SC3 abstentions, and a
+      balanced negative set (correct counts that must pass).
+- [ ] `evaluation/v6/score_count_drift.py` → precision/recall/F1 + bootstrap 95% CI (seed
+      fixed, samples=1000) writing `evaluation/v6/RESULTS.md`.
+- [ ] `tests/test-count-drift.sh` asserts SC1 (0 FP), SC2 (seeds blocked), SC3 (abstain),
+      SC4 (fail-open via PATH stub), SC5 (determinism).
+
+### Task 4: Wiring + docs
+Definition of Done:
+- [ ] `hooks/hooks.json` adds `no-count-drift.sh` to `Stop` and `SubagentStop`.
+- [ ] `evaluation/v6/RESULTS.md` populated from a real scorer run.
+- [ ] README hook table / METHODOLOGY updated to list the hook + its MAST FM-3.2 mapping.
+
+## 7. Verification
+
+- SC1/SC2/SC3/SC4/SC5 → `bash tests/test-count-drift.sh` exits 0 (each assertion).
+- SC6 → repo hook-smoke passes with the new wiring.
+- SC7 → `python3 evaluation/v6/score_count_drift.py` writes RESULTS.md with F1 + CI;
+  run twice, diff = empty (also covers SC5).
+- Hook-level smoke: pipe a positive `{"hook_event_name":"Stop","last_assistant_message":...}`
+  → exit 2; negative → exit 0; `PATH` without jq → exit 0.
+- **Independent (non-circular) precision** → `python3 evaluation/v6/independent_eval.py`
+  runs the detector over corpora it was NOT authored against (real LLM responses in
+  `evaluation/raw_results.jsonl` + the other hooks' stress fixtures) and must report
+  **0 blocks** (zero false positives). Result: 0 / 988 texts. This is the precision
+  evidence that the hand-authored F1 cannot give (co-evolved-corpus); it caught two real
+  R3 false-positive classes that the fixtures missed.
+
+## 8. Rollback Plan
+
+1. The work is isolated on branch `feature/v6-count-drift` in a worktree; nothing is on
+   `main` until an explicit merge.
+2. To unwire without removing files: delete the `no-count-drift.sh` entries from
+   `hooks/hooks.json` (each hook is independent by design).
+3. Full revert: `git branch -D feature/v6-count-drift` (pre-merge) or `git revert <sha>`
+   (post-merge); remove `hooks/no-count-drift.sh`, `lib/count_drift.py`, `evaluation/v6/`.
+4. Verify rollback: repo hook-smoke passes and `grep -c no-count-drift hooks/hooks.json` = 0.
+
+## Source ledger (deepresearch, accessed 2026-05-25)
+
+- statcheck (CRAN; Nuijten validity study 2017): deterministic recompute-and-compare,
+  specificity 96–100%, recall ~61% — high-precision/narrow-recall precedent.
+- ContraDoc (NAACL 2024, arXiv:2311.09182): "Numeric" is a named and EASIEST
+  intra-document self-contradiction type.
+- "Sequential Enumeration in LLMs" (arXiv:2512.04727) + "Too Consistent to Detect"
+  (arXiv:2505.17656): counting is a rule-based-symbolic strength; LLM/self-consistency
+  judges miss self-consistent count errors → counting belongs in deterministic code.
+- HalluLens (ACL 2025, arXiv:2504.17550): this is intrinsic/faithfulness hallucination,
+  orthogonal to factuality.
+- MAST (NeurIPS 2025 D&B, arXiv:2503.13657): maps to FM-3.2 "No or incomplete
+  verification" (+ FM-2.6). DarkBench (ICLR 2025, arXiv:2503.10728) does not cover it.
+- Landscape check: no existing hook in `llm-dark-patterns` / `agent-closeout-bench` /
+  `cc-safe-setup` does in-message count-vs-enumeration — confirmed gap, not duplication.
diff --git a/evaluation/v6/fixtures.jsonl b/evaluation/v6/fixtures.jsonl
new file mode 100644
index 0000000..3abd817
--- /dev/null
+++ b/evaluation/v6/fixtures.jsonl
@@ -0,0 +1,28 @@
+{"id": "pos_r3_14_vs_15", "text": "This case has fourteen instances:\n- item 1\n- item 2\n- item 3\n- item 4\n- item 5\n- item 6\n- item 7\n- item 8\n- item 9\n- item 10\n- item 11\n- item 12\n- item 13\n- item 14\n- item 15", "expect": "block", "kind": "pos", "note": "real #9"}
+{"id": "pos_r3_six_vs_five", "text": "Six findings:\n- item 1\n- item 2\n- item 3\n- item 4\n- item 5", "expect": "block", "kind": "pos", "note": "real #10"}
+{"id": "pos_r3_all5_4listed", "text": "All 5 tests pass:\n- item 1\n- item 2\n- item 3\n- item 4", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r3_heading_3_vs_4", "text": "## 3 Key Findings\n- item 1\n- item 2\n- item 3\n- item 4", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r3_numbered_two_three", "text": "Two steps remain:\n1. first\n2. second\n3. third", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r1_9_10_80", "text": "Coverage is 9/10 = 80% across the suite.", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r1_3_4_90", "text": "We resolved 3/4 (90%) of the blockers.", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r2_5_of_3", "text": "5 of 3 tests passed in the run.", "expect": "block", "kind": "pos", "note": ""}
+{"id": "pos_r2_words_seven_three", "text": "seven of three reviewers approved", "expect": "block", "kind": "pos", "note": ""}
+{"id": "neg_r3_correct_five", "text": "Five findings:\n- item 1\n- item 2\n- item 3\n- item 4\n- item 5", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_r1_2_35_correct", "text": "True-RUSE rate was 2/35 = 5.7% at first pass.", "expect": "pass", "kind": "neg", "note": "real #9 correct"}
+{"id": "neg_r1_1_4_correct", "text": "1/4 = 25% of the run flaked.", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_vague_several", "text": "Several findings:\n- item 1\n- item 2", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_no_enumeration", "text": "I made three changes to the parser and shipped them.", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_trap_nested_colon", "text": "3 reasons: the top 2 are:\n- x\n- y", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_trap_section_index", "text": "## Section 3 notes\n- a\n- b", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_trap_label_step", "text": "Step 3 tasks:\n- a\n- b", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_trap_version_label", "text": "v4 features:\n- a\n- b\n- c", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_correct_n_of_m", "text": "4 of 7 lanes succeeded cleanly.", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_ratio_no_pct", "text": "see line 12/25 of the file for context", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_nested_3_top", "text": "Three steps:\n- a\n  - a1\n  - a2\n- b\n- c", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_approx_marker", "text": "roughly 2/35 \u2248 6% of cases were diagnostic", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_two_lists_ambiguous", "text": "Two groups follow below.\n\n- a\n- b\n\n- x\n- y\n- z", "expect": "pass", "kind": "neg", "note": ""}
+{"id": "neg_table_correct", "text": "Three rows:\n| h |\n| --- |\n| 1 |\n| 2 |\n| 3 |", "expect": "pass", "kind": "neg", "note": "tables not counted by R3"}
+{"id": "neg_reg_often_overstated", "text": "Benefits are often overstated:\n- a\n- b\n- c", "expect": "pass", "kind": "neg", "note": "MAD FP: 'of-ten' word-boundary"}
+{"id": "neg_reg_one_of_four_quadrants", "text": "Categorize each task into one of four quadrants:\n| | U | NU |\n|---|---|---|\n| I | a | b |", "expect": "pass", "kind": "neg", "note": "MAD FP: mid-line num + table"}
+{"id": "neg_reg_one_side_instead", "text": "I would not favor one side. Instead:\n- a\n- b", "expect": "pass", "kind": "neg", "note": "MAD FP: sentence-colon not count"}
+{"id": "neg_reg_one_study", "text": "One study showed the following:\n- a\n- b", "expect": "pass", "kind": "neg", "note": "MAD FP: count<2 prose"}
diff --git a/evaluation/v6/independent_eval.py b/evaluation/v6/independent_eval.py
new file mode 100644
index 0000000..03bf759
--- /dev/null
+++ b/evaluation/v6/independent_eval.py
@@ -0,0 +1,94 @@
+#!/usr/bin/env python3
+"""Independent (non-circular) precision check for lib/count_drift.py.
+
+Runs the detector over corpora it was NOT authored against:
+  1. evaluation/raw_results.jsonl  — real LLM `model_response` + `prompt_text`
+     (the DarkBench/MAD eval inputs used by the MAST work).
+  2. tests/stress/**/*.json        — stress fixtures authored for the OTHER hooks.
+
+Because these texts have no count-drift ground-truth labels, the meaningful
+metric is the FALSE-POSITIVE RATE: every `block` is printed for inspection. A
+blocking gate is only safe if it (near-)never fires on text that was not written
+to contain a count contradiction.
+
+Usage: python3 evaluation/v6/independent_eval.py
+Exit: 0 if zero blocks, else 1 (so it can gate CI against precision regressions).
+"""
+import glob
+import importlib.util
+import json
+import os
+import sys
+
+HERE = os.path.dirname(os.path.abspath(__file__))
+ROOT = os.path.abspath(os.path.join(HERE, "..", ".."))
+
+spec = importlib.util.spec_from_file_location("count_drift", os.path.join(ROOT, "lib", "count_drift.py"))
+cd = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(cd)
+
+
+def scan(texts, label):
+    n = 0
+    blocks = []
+    for tid, t in texts:
+        if not t or not str(t).strip():
+            continue
+        n += 1
+        r = cd.analyze(str(t))
+        if r["decision"] == "block":
+            blocks.append((tid, r["rule"], r["evidence"]))
+    print("=== %s: %d texts -> %d block ===" % (label, n, len(blocks)))
+    for tid, rule, ev in blocks:
+        print("  BLOCK [%s] %s: %s" % (tid, rule, ev))
+    return n, len(blocks)
+
+
+def mad_texts():
+    path = os.path.join(ROOT, "evaluation", "raw_results.jsonl")
+    out = []
+    if not os.path.exists(path):
+        return out
+    with open(path, encoding="utf-8") as f:
+        for line in f:
+            line = line.strip()
+            if not line:
+                continue
+            d = json.loads(line)
+            pid = str(d.get("prompt_id", "?"))  # ids may be non-string; the suffix concat below needs str
+            out.append((pid + "/resp", d.get("model_response", "")))
+            out.append((pid + "/prompt", d.get("prompt_text", "")))
+    return out
+
+
+def stress_texts():
+    out = []
+    for p in glob.glob(os.path.join(ROOT, "tests", "stress", "**", "*.json"), recursive=True):
+        try:
+            d = json.load(open(p, encoding="utf-8"))
+        except Exception:
+            continue
+        msg = ""
+        if isinstance(d, dict):
+            msg = d.get("last_assistant_message") or d.get("message") or ""
+            if not msg:
+                strs = [v for v in d.values() if isinstance(v, str)]
+                msg = max(strs, key=len) if strs else ""
+        elif isinstance(d, list):
+            strs = [x for x in d if isinstance(x, str)]
+            msg = max(strs, key=len) if strs else ""
+        out.append((os.path.relpath(p, os.path.join(ROOT, "tests", "stress")), msg))
+    return out
+
+
+def main():
+    n1, b1 = scan(mad_texts(), "MAD raw_results (model_response + prompt_text)")
+    n2, b2 = scan(stress_texts(), "stress fixtures (other hooks)")
+    total, blocks = n1 + n2, b1 + b2
+    print("\nTOTAL independent texts: %d | blocks: %d | false-positive rate: %.4f"
+          % (total, blocks, (blocks / total) if total else 0.0))
+    return 1 if blocks > 0 else 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/evaluation/v6/score_count_drift.py b/evaluation/v6/score_count_drift.py
new file mode 100755
index 0000000..5a823f5
--- /dev/null
+++ b/evaluation/v6/score_count_drift.py
@@ -0,0 +1,157 @@
+#!/usr/bin/env python3
+"""Score lib/count_drift.py against evaluation/v6/fixtures.jsonl.
+
+Computes precision / recall / F1 with a bootstrap 95% CI, prints the report (writing
+RESULTS.md only with --write), and exits non-zero on any false positive (SC1).
+
+Usage: python3 evaluation/v6/score_count_drift.py [--write]
+"""
+import importlib.util
+import json
+import os
+import random
+import sys
+
+HERE = os.path.dirname(os.path.abspath(__file__))
+ROOT = os.path.abspath(os.path.join(HERE, "..", ".."))
+
+spec = importlib.util.spec_from_file_location("count_drift", os.path.join(ROOT, "lib", "count_drift.py"))
+cd = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(cd)
+
+
+def load(path):
+    rows = []
+    with open(path, encoding="utf-8") as f:
+        for line in f:
+            line = line.strip()
+            if line:
+                rows.append(json.loads(line))
+    return rows
+
+
+def prf1(items):
+    """items: list of (predicted_block: bool, gold_block: bool)."""
+    tp = sum(1 for p, g in items if p and g)
+    fp = sum(1 for p, g in items if p and not g)
+    fn = sum(1 for p, g in items if not p and g)
+    prec = tp / (tp + fp) if (tp + fp) else 1.0
+    rec = tp / (tp + fn) if (tp + fn) else 1.0
+    f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
+    return tp, fp, fn, prec, rec, f1
+
+
+def bootstrap_f1(items, n=1000, seed=42):
+    rng = random.Random(seed)
+    f1s = []
+    m = len(items)
+    for _ in range(n):
+        sample = [items[rng.randrange(m)] for _ in range(m)]
+        f1s.append(prf1(sample)[5])
+    f1s.sort()
+    lo = f1s[int(0.025 * n)]
+    hi = f1s[int(0.975 * n) - 1]
+    return lo, hi
+
+
+def main():
+    rows = load(os.path.join(HERE, "fixtures.jsonl"))
+    items = []
+    failures = []
+    for r in rows:
+        verdict = cd.analyze(r["text"])
+        pred_block = verdict["decision"] == "block"
+        gold_block = r["expect"] == "block"
+        items.append((pred_block, gold_block))
+        if pred_block != gold_block:
+            kind = "FALSE_POSITIVE" if pred_block else "MISS"
+            failures.append((r["id"], kind, verdict.get("rule", "")))
+    tp, fp, fn, prec, rec, f1 = prf1(items)
+    lo, hi = bootstrap_f1(items)
+    n_pos = sum(1 for _, g in items if g)
+    n_neg = len(items) - n_pos
+
+    summary = (
+        "# v6 count-drift — RESULTS\n\n"
+        "Scorer: `evaluation/v6/score_count_drift.py` over `fixtures.jsonl` "
+        "(%d fixtures: %d positive / %d adversarial negative).\n\n"
+        "| metric | value |\n|---|---|\n"
+        "| precision | %.3f |\n| recall | %.3f |\n| F1 | %.3f |\n"
+        "| F1 95%% CI (bootstrap, n=1000, seed=42) | [%.3f, %.3f] |\n"
+        "| true positives | %d |\n| **false positives** | **%d** |\n| misses | %d |\n\n"
+        "SC1 (zero false positives on the adversarial negative set): %s\n"
+        % (len(items), n_pos, n_neg, prec, rec, f1, lo, hi, tp, fp, fn,
+           "PASS" if fp == 0 else "FAIL")
+    )
+    if failures:
+        summary += "\nFailures:\n" + "\n".join(
+            "- %s: %s (%s)" % (fid, kind, rule) for fid, kind, rule in failures) + "\n"
+    # Independent (non-circular) evaluation over corpora the detector was not authored against.
+    try:
+        import importlib.util as _il
+        _spec = _il.spec_from_file_location("independent_eval", os.path.join(HERE, "independent_eval.py"))
+        _ind = _il.module_from_spec(_spec)
+        _spec.loader.exec_module(_ind)
+
+        def _count(texts):
+            tot = blk = 0
+            for _tid, _t in texts:
+                if not _t or not str(_t).strip():
+                    continue
+                tot += 1
+                if cd.analyze(str(_t))["decision"] == "block":
+                    blk += 1
+            return tot, blk
+
+        _mt, _mb = _count(_ind.mad_texts())
+        _st, _sb = _count(_ind.stress_texts())
+        _tot, _blk = _mt + _st, _mb + _sb
+        if _tot:
+            summary += (
+                "\n## Independent evaluation (non-circular)\n\n"
+                "Detector run over corpora it was NOT authored against — real LLM "
+                "`model_response`/`prompt_text` from `evaluation/raw_results.jsonl` and the "
+                "stress fixtures authored for the *other* hooks. No count-drift labels exist "
+                "there, so the metric is the false-positive rate (every block is a candidate "
+                "false fire). Reproduce: `python3 evaluation/v6/independent_eval.py`.\n\n"
+                "| corpus | texts | blocks |\n|---|---|---|\n"
+                "| MAD raw_results | %d | %d |\n"
+                "| stress fixtures (other hooks) | %d | %d |\n"
+                "| **total** | **%d** | **%d** |\n\n"
+                "False-positive rate on independent text: **%.4f**. This is the load-bearing, "
+                "non-circular precision evidence — distinct from the hand-authored F1 above. "
+                "(Two real false positives found during development — a too-loose lead-in and "
+                "a missing word-boundary on number words — were fixed and locked in as "
+                "regression negatives.)\n"
+                % (_mt, _mb, _st, _sb, _tot, _blk, (_blk / _tot) if _tot else 0.0)
+            )
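+    except Exception as exc:  # the fixture gate below still runs if the independent corpora fail to load
+        print("independent evaluation skipped: %s" % exc, file=sys.stderr)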
+    summary += (
+        "\n## Honesty caveat (read before citing F1)\n\n"
+        "This corpus is **hand-authored** — the same author wrote the detector and the "
+        "fixtures — so an F1 of 1.0 here is **not** a wild-generalization claim; it is a "
+        "co-evolved-corpus number and would inflate if cited as field performance. What "
+        "the number legitimately shows: the detector behaves to spec on the designed "
+        "cases, **including the adversarial negatives authored to break it** (nested-colon "
+        "lead-ins, section-index numbers, label words, approximation markers, "
+        "ambiguous multi-list scope, nested-list depth). The load-bearing, "
+        "generalizable metric is **precision / zero-false-positives on those adversarial "
+        "negatives** — the property a blocking gate must hold.\n\n"
+        "Recall is reported, not gated. Per the statcheck precedent (deterministic "
+        "internal-consistency check: ~96-100% specificity but only ~61% recall in the "
+        "wild), real-world recall here will be far below 1.0, bounded by structural "
+        "extraction coverage. That trade is intentional: abstain rather than false-fire.\n"
+    )
+
+    print(summary)
+    if "--write" in sys.argv:
+        with open(os.path.join(HERE, "RESULTS.md"), "w", encoding="utf-8") as f:
+            f.write(summary)
+
+    # Gate: a blocking detector must not false-fire.
+    return 1 if fp > 0 else 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
diff --git a/hooks/hooks.json b/hooks/hooks.json
index 0a8695d..0e904e0 100644
--- a/hooks/hooks.json
+++ b/hooks/hooks.json
@@ -1,7 +1,7 @@
 {
   "_comment": [
-    "LLM Dark Patterns Hooks — bundled plugin wiring for all 29 hooks.",
-    "Stop / SubagentStop fan out to 24 closeout-language hooks plus the state-stop continuity refresh.",
+    "LLM Dark Patterns Hooks — bundled plugin wiring for all 30 hooks.",
+    "Stop / SubagentStop fan out to 25 closeout-language hooks plus the state-stop continuity refresh.",
     "TaskCreated wires no-handoff-loop + no-credential-leak. TaskCompleted wires no-ownership-violation.",
     "PreToolUse / PostToolUse wire no-vibes side checks plus no-approval-sneak.",
     "PreCompact / PostCompact / SessionStart wire the no-amnesia continuity branch.",
@@ -19,6 +19,7 @@
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/honest-eta.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-recall.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-stats.sh\"", "timeout": 5 },
+          { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-count-drift.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-cite.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-wrap-up.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-aggregator-hallucination.sh\"", "timeout": 5 },
@@ -50,6 +51,7 @@
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/honest-eta.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-recall.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-stats.sh\"", "timeout": 5 },
+          { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-count-drift.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-fake-cite.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-wrap-up.sh\"", "timeout": 5 },
           { "type": "command", "command": "bash \"${CLAUDE_PLUGIN_ROOT}/hooks/no-aggregator-hallucination.sh\"", "timeout": 5 },
diff --git a/hooks/no-count-drift.sh b/hooks/no-count-drift.sh
new file mode 100755
index 0000000..de12d12
--- /dev/null
+++ b/hooks/no-count-drift.sh
@@ -0,0 +1,53 @@
+#!/bin/bash
+# Claude Code hook: block a count stated in the message that contradicts the
+# message's OWN enumeration or arithmetic (count-vs-enumeration self-consistency).
+#
+# This is a FAITHFULNESS / self-consistency gate (MAST FM-3.2 "no/incomplete
+# verification"), distinct from no-fake-stats, which is a FACTUALITY / citation
+# gate. A citation does not resolve an internal mismatch, and small integers that
+# no-fake-stats ignores are exactly where count drift hides.
+#
+# Deterministic, high-precision, abstain-on-ambiguity: it fires only on
+# unambiguous self-contained mismatches and otherwise passes. The counting logic
+# lives in lib/count_drift.py because counting is a strength of rule-based
+# symbolic code and a weakness of LLMs (their errors are self-consistent on resample).
+
+set -euo pipefail
+
+INPUT="$(cat)"
+
+# Fail-open if the toolchain is missing — never break a session.
+command -v jq >/dev/null 2>&1 || exit 0
+command -v python3 >/dev/null 2>&1 || exit 0
+printf '%s' "$INPUT" | jq -e . >/dev/null 2>&1 || exit 0
+
+# Re-entrancy guard, matching sibling Stop hooks.
+if [ "$(printf '%s' "$INPUT" | jq -r '.stop_hook_active // empty' 2>/dev/null)" = "true" ]; then
+  exit 0
+fi
+
+message="$(printf '%s' "$INPUT" | jq -r '.last_assistant_message // empty' 2>/dev/null || true)"
+[ -z "$message" ] && exit 0
+
+CORE="$(cd "$(dirname "$0")" && pwd)/../lib/count_drift.py"
+[ -f "$CORE" ] || exit 0
+
+VERDICT="$(printf '%s' "$message" | python3 "$CORE" 2>/dev/null || true)"
+[ -z "$VERDICT" ] && exit 0
+
+DECISION="$(printf '%s' "$VERDICT" | jq -r '.decision // empty' 2>/dev/null || true)"
+if [ "$DECISION" = "block" ]; then
+  RULE="$(printf '%s' "$VERDICT" | jq -r '.rule // "count_drift"' 2>/dev/null)"
+  EVID="$(printf '%s' "$VERDICT" | jq -r '.evidence // ""' 2>/dev/null)"
+  echo "BLOCKED: a stated count contradicts the message's own enumeration or arithmetic." >&2
+  echo "Matched rule: $RULE" >&2
+  [ -n "$EVID" ] && echo "Evidence: $EVID" >&2
+  echo "" >&2
+  echo "Repair guidance:" >&2
+  echo "- Re-count the enumerated items (or re-check the fraction), then make the stated number match." >&2
+  echo "- If the number and the list intentionally differ, say so explicitly (e.g. exclude a contrast item from the tally)." >&2
+  echo "- This is a self-consistency check, not a citation check — adding a source does not fix an internal mismatch." >&2
+  exit 2
+fi
+
+exit 0
diff --git a/lib/count_drift.py b/lib/count_drift.py
new file mode 100755
index 0000000..c9794ce
--- /dev/null
+++ b/lib/count_drift.py
@@ -0,0 +1,376 @@
+#!/usr/bin/env python3
+"""count_drift.py — deterministic count-vs-enumeration self-consistency gate.
+
+Detects when a count stated in prose contradicts the artifact's OWN content:
+  R1  fraction/percentage arithmetic self-check ("9/10 = 80%" -> wrong, 90%)
+  R2  "N of M" bound check (N > M is impossible)
+  R3  headline count vs a single immediately-following enumeration
+      ("six findings:" then a 5-item list)
+
+Design: high precision, abstain-on-ambiguity. This is a BLOCKING gate, so it
+fires only on unambiguous, self-contained mismatches and otherwise passes.
+Counting lives in deterministic code (LLMs are unreliable at counting and their
+errors are self-consistent; see evaluation/v6/SPEC.md source ledger).
+
+Pure standard library — no third-party dependencies.
+
+Usage:
+  echo "<message text>" | python3 lib/count_drift.py
+  python3 lib/count_drift.py --text "..."   |   --file path
+Output: a single JSON object on stdout:
+  {"decision": "block"|"pass", "rule": "<id>", "evidence": "<short>"}
+Always exits 0; the bash hook maps decision=block -> exit 2.
+"""
+
+import json
+import re
+import sys
+
+# ---------------------------------------------------------------------------
+# Number parsing (digits + spelled-out words, stdlib only).
+# ---------------------------------------------------------------------------
+_UNITS = {
+    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
+    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
+    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
+    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
+}
+_TENS = {
+    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50, "sixty": 60,
+    "seventy": 70, "eighty": 80, "ninety": 90,
+}
+_ORDINALS = {
+    "first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
+    "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
+    "eleventh": 11, "twelfth": 12, "thirteenth": 13, "fourteenth": 14,
+    "fifteenth": 15,
+}
+# Deliberately small special-case lexicon. "a dozen" is unambiguous; the vague
+# words map to None so callers ABSTAIN rather than guess a cardinality.
+_SPECIAL = {"dozen": 12}
+_VAGUE = {"a few", "several", "a couple", "a handful", "some", "many", "various"}
+
+
+def word_to_int(phrase):
+    """Return an int for a spelled-out cardinal/ordinal phrase, else None.
+
+    Handles 0-99, "N hundred", "N thousand", hyphenated tens ("twenty-five"),
+    "a dozen", and ordinals. Returns None for anything ambiguous/unsupported so
+    the caller abstains.
+    """
+    s = phrase.strip().lower().replace("-", " ")
+    if s in _VAGUE:
+        return None
+    if s in _ORDINALS:
+        return _ORDINALS[s]
+    if s in ("a dozen", "one dozen", "dozen"):
+        return 12
+    tokens = [t for t in re.split(r"\s+", s) if t and t not in ("and", "a", "an")]
+    if not tokens:
+        return None
+    total = 0
+    current = 0
+    saw = False
+    for tok in tokens:
+        if tok in _UNITS:
+            current += _UNITS[tok]
+            saw = True
+        elif tok in _TENS:
+            current += _TENS[tok]
+            saw = True
+        elif tok == "hundred":
+            current = (current or 1) * 100
+            saw = True
+        elif tok == "thousand":
+            total += (current or 1) * 1000
+            current = 0
+            saw = True
+        elif tok in _SPECIAL:
+            current += _SPECIAL[tok]
+            saw = True
+        else:
+            return None  # unknown token -> abstain
+    if not saw:
+        return None
+    return total + current
+
+
+def parse_count_token(tok):
+    """Parse a single count token (digits or one spelled word/phrase) -> int|None."""
+    tok = tok.strip()
+    if re.fullmatch(r"\d{1,4}", tok):
+        return int(tok)
+    return word_to_int(tok)
+
+
+# A regex alternation matching a single number word (incl. hyphenated tens) or digits.
+_NUMWORD = (
+    r"(?:\d{1,4}|"
+    r"(?:twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)(?:[ -](?:one|two|three|four|five|six|seven|eight|nine))?|"
+    r"zero|one|two|three|four|five|six|seven|eight|nine|ten|eleven|twelve|thirteen|"
+    r"fourteen|fifteen|sixteen|seventeen|eighteen|nineteen|"
+    r"a dozen|dozen)"
+)
+
+_APPROX = re.compile(r"(~|≈|≅|\bapprox(?:imately)?\b|\babout\b|\broughly\b|\bor so\b)", re.I)
+
+
+# ---------------------------------------------------------------------------
+# R1 — fraction / percentage arithmetic self-check.
+# ---------------------------------------------------------------------------
+_FRAC_PCT = re.compile(
+    r"(?P<num>\d{1,6})\s*/\s*(?P<den>\d{1,6})\s*"
+    r"(?:=|\(|\bis\b|,|\s)\s*~?\s*(?P<pct>\d{1,3}(?:\.\d+)?)\s*%"
+)
+
+
+def check_r1(text):
+    """Flag 'A/B = P%' when P does not match A/B within rounding tolerance."""
+    for m in _FRAC_PCT.finditer(text):
+        num = int(m.group("num"))
+        den = int(m.group("den"))
+        if den == 0:
+            continue
+        pct_str = m.group("pct")
+        stated = float(pct_str)
+        # Abstain on an explicit approximation marker between the fraction and the percent.
+        head = text[max(0, m.start()): m.start("pct")]
+        if _APPROX.search(head):
+            continue
+        computed = num / den * 100.0
+        # Rounding tolerance: half a unit in the last stated decimal place, +epsilon.
+        decimals = len(pct_str.split(".")[1]) if "." in pct_str else 0
+        tol = 0.5 * (10 ** (-decimals)) + 1e-9
+        if abs(computed - stated) > tol:
+            return {
+                "decision": "block",
+                "rule": "count_drift.fraction_percent_mismatch",
+                "evidence": "%s = %s%% but %d/%d = %.2f%%" % (
+                    m.group("num") + "/" + m.group("den"), pct_str, num, den, computed),
+            }
+    return None
+
+
+# ---------------------------------------------------------------------------
+# R2 — "N of M" bound check.
+# ---------------------------------------------------------------------------
+_N_OF_M = re.compile(
+    r"\b(?P<n>%s)\s+of\s+(?:the\s+|those\s+|these\s+|all\s+)?(?P<m>%s)\b" % (_NUMWORD, _NUMWORD),
+    re.I,
+)
+
+
+def check_r2(text):
+    """Flag 'N of M' where N > M (impossible)."""
+    for m in _N_OF_M.finditer(text):
+        n = parse_count_token(m.group("n"))
+        mm = parse_count_token(m.group("m"))
+        if n is None or mm is None:
+            continue
+        if n > mm:
+            return {
+                "decision": "block",
+                "rule": "count_drift.n_of_m_exceeds",
+                "evidence": "'%s of %s' — %d exceeds %d" % (
+                    m.group("n"), m.group("m"), n, mm),
+            }
+    return None
+
+
+# ---------------------------------------------------------------------------
+# Enumeration parsing (markdown lists + tables), depth-aware, stdlib only.
+# ---------------------------------------------------------------------------
+_LIST_RE = re.compile(r"^(?P<indent>[ \t]*)(?:[-*+]|\d{1,3}[.)])\s+\S")
+_HEADING_RE = re.compile(r"^\s{0,3}#{1,6}\s")
+
+# Label words: when a number directly follows one of these it is an index/ID,
+# not a count (e.g. "Section 3", "Step 2", "v4", "Figure 1"). Abstain.
+_LABEL_BEFORE = re.compile(
+    r"(?:section|step|part|phase|chapter|figure|fig|table|appendix|item|version|"
+    r"v|level|tier|round|pass|day|group|page|line|note|task|issue|pr|#)\s*$",
+    re.I,
+)
+
+
+def _is_table_sep(line):
+    """A markdown table separator row, for any column count (1+)."""
+    cells = [c.strip() for c in line.strip().strip("|").split("|")]
+    cells = [c for c in cells if c != ""]
+    return bool(cells) and all(re.fullmatch(r":?-{2,}:?", c) for c in cells)
+
+
+def find_enumerations(lines):
+    """Return contiguous enumeration blocks as dicts:
+    {kind, count, start, end}  (count = TOP-LEVEL items / table data rows).
+    Conservative: a blank line or a heading ends a block.
+    """
+    blocks = []
+    i = 0
+    n = len(lines)
+    while i < n:
+        line = lines[i]
+        m = _LIST_RE.match(line)
+        if m:
+            start = i
+            indents = []
+            j = i
+            while j < n:
+                lm = _LIST_RE.match(lines[j])
+                if lm:
+                    indents.append(len(lm.group("indent").replace("\t", "    ")))
+                    j += 1
+                elif lines[j].strip() == "":
+                    break
+                elif _HEADING_RE.match(lines[j]):
+                    break
+                else:
+                    # continuation / lazy line within the list: keep going
+                    j += 1
+            base = min(indents) if indents else 0
+            top = 0
+            k = start
+            while k < j:
+                lm = _LIST_RE.match(lines[k])
+                if lm and len(lm.group("indent").replace("\t", "    ")) == base:
+                    top += 1
+                k += 1
+            blocks.append({"kind": "list", "count": top, "start": start, "end": j})
+            i = j
+            continue
+        # table: contiguous lines containing a pipe, with a separator row
+        if "|" in line and line.strip():
+            start = i
+            j = i
+            while j < n and "|" in lines[j] and lines[j].strip():
+                j += 1
+            tbl = lines[start:j]
+            sep_idx = next((idx for idx, l in enumerate(tbl) if _is_table_sep(l)), None)
+            if sep_idx is not None and sep_idx >= 1:
+                data_rows = len(tbl) - (sep_idx + 1)
+                if data_rows >= 1:
+                    blocks.append({"kind": "table", "count": data_rows,
+                                   "start": start, "end": j})
+            i = max(j, i + 1)
+            continue
+        i += 1
+    return blocks
+
+
+# A count lead-in: "<num> <noun>[ up to 3 plain words]:" where the colon is
+# adjacent to the noun phrase with NO intervening punctuation, number, or
+# sentence break. This rejects prose where a number and a sentence-colon merely
+# co-occur on a line ("...favor one side. Instead:" / "...one of four quadrants:").
+_LEADIN_RE = re.compile(
+    r"\b(?P<num>%s)\s+(?P<noun>[A-Za-z][A-Za-z-]{2,30})"
+    r"(?:[ \t]+[A-Za-z][A-Za-z-]+){0,3}[ \t]*:\s*$" % _NUMWORD,
+    re.I,
+)
+# Number must be the FIRST token of the heading content ("## 3 Key Findings"),
+# not buried after a label ("## Section 3 notes" -> abstain).
+_HEADING_COUNT_RE = re.compile(
+    r"^\s{0,3}#{1,6}\s+(?P<num>%s)\s+(?P<noun>[A-Za-z][A-Za-z-]{2,30})\b" % _NUMWORD,
+    re.I,
+)
+
+
+def check_r3(text, lines, enumerations):
+    """Flag a lead-in count claim immediately followed by exactly one enumeration
+    whose top-level count differs. Abstain on any ambiguity."""
+    for idx, line in enumerate(lines):
+        claim = None
+        is_heading = False
+        m = _LEADIN_RE.search(line)
+        if m:
+            claim = m
+        else:
+            hm = _HEADING_COUNT_RE.match(line)
+            if hm:
+                claim = hm
+                is_heading = True
+        if not claim:
+            continue
+        # Lead-in-only guards (a heading already requires the number to be the
+        # first content token, so no label or second-number can precede it).
+        if not is_heading:
+            # Abstain if the number is an index/ID after a label word ("Step 3 tasks:").
+            if _LABEL_BEFORE.search(line[:claim.start("num")]):
+                continue
+            # Abstain if a SECOND number word sits between the noun and the colon
+            # ("3 reasons where two apply:"); \b keeps "one" in "components" from matching.
+            if re.search(r"\b%s\b" % _NUMWORD, line[claim.end("noun"):claim.end()], re.I):
+                continue
+        stated = parse_count_token(claim.group("num"))
+        if stated is None or stated < 2:
+            continue  # abstain on vague / count < 2 (a "one X:" lead-in is almost always prose)
+        # Scope: from just after this line to the next heading (or end of text).
+        scope_end = len(lines)
+        for k in range(idx + 1, len(lines)):
+            if _HEADING_RE.match(lines[k]):
+                scope_end = k
+                break
+        # Candidate enumerations: LISTS only (table rows are a poor proxy for a
+        # claimed count — e.g. a 2x2 matrix has 4 cells but 2 rows), that START
+        # within (idx, scope_end) and within a small adjacency gap (<=2 non-empty
+        # lines before the block).
+        cands = []
+        for b in enumerations:
+            if b["kind"] == "list" and idx < b["start"] < scope_end:
+                gap_lines = [l for l in lines[idx + 1:b["start"]] if l.strip()]
+                if len(gap_lines) <= 2:
+                    cands.append(b)
+        # ABSTAIN unless exactly one adjacent candidate enumeration.
+        if len(cands) != 1:
+            continue
+        actual = cands[0]["count"]
+        if actual >= 1 and actual != stated:
+            return {
+                "decision": "block",
+                "rule": "count_drift.headline_enumeration_mismatch",
+                "evidence": "claim '%s %s' but the %s lists %d top-level item(s)" % (
+                    claim.group("num"), claim.group("noun"),
+                    cands[0]["kind"], actual),
+            }
+    return None
+
+
+def analyze(text):
+    lines = text.splitlines()
+    enums = find_enumerations(lines)
+    for check in (lambda: check_r1(text),
+                  lambda: check_r2(text),
+                  lambda: check_r3(text, lines, enums)):
+        res = check()
+        if res:
+            return res
+    return {"decision": "pass", "rule": "", "evidence": ""}
+
+
+def _read_input(argv):
+    if "--text" in argv:
+        return argv[argv.index("--text") + 1]
+    if "--file" in argv:
+        with open(argv[argv.index("--file") + 1], "r", encoding="utf-8", errors="replace") as f:
+            return f.read()
+    return sys.stdin.read()
+
+
+def main():
+    try:
+        text = _read_input(sys.argv[1:])
+    except Exception:
+        print(json.dumps({"decision": "pass", "rule": "", "evidence": ""}))
+        return 0
+    if not text or not text.strip():
+        print(json.dumps({"decision": "pass", "rule": "", "evidence": ""}))
+        return 0
+    try:
+        result = analyze(text)
+    except Exception:
+        # Fail-open: never break a session on a parser bug.
+        result = {"decision": "pass", "rule": "", "evidence": ""}
+    print(json.dumps(result))
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
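
The R1 rounding tolerance in `check_r1` can be illustrated in isolation. The sketch below is not part of the patch: `percent_matches` is a hypothetical helper name, and it assumes the percentage arrives as the literal string from the message so the number of stated decimals can be read off it.

```python
def percent_matches(num, den, pct_str):
    """Return True when pct_str is an acceptable rounding of num/den as a percent.

    Hypothetical helper for illustration; it mirrors the tolerance arithmetic
    in check_r1 but is not a function defined by the patch.
    """
    if den == 0:
        raise ValueError("denominator must be non-zero")
    stated = float(pct_str)
    computed = num / den * 100.0
    # The digits after the decimal point decide how coarse the stated value is.
    decimals = len(pct_str.split(".")[1]) if "." in pct_str else 0
    # Half of one unit in the last stated place, plus a small float epsilon
    # so an exact tie (12.5 stated as "12" or "13") is accepted either way.
    tol = 0.5 * 10 ** (-decimals) + 1e-9
    return abs(computed - stated) <= tol
```

Under these assumptions `percent_matches(9, 10, "80")` is False (the mismatch R1 blocks), while `percent_matches(1, 3, "33.3")` is True because 33.33 is within 0.05 of the stated value. Deriving the tolerance from the stated precision lets a coarse "33%" and a finer "33.3%" both pass without a fixed slack that would hide real drift.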
diff --git a/tests/test-count-drift.sh b/tests/test-count-drift.sh
new file mode 100755
index 0000000..18356c3
--- /dev/null
+++ b/tests/test-count-drift.sh
@@ -0,0 +1,69 @@
+#!/usr/bin/env bash
+# Tests for the no-count-drift hook + lib/count_drift.py core.
+# Run: bash tests/test-count-drift.sh   Exit: 0 on success, 1 on any failure.
+set -uo pipefail
+
+ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+HOOK="$ROOT/hooks/no-count-drift.sh"
+SCORER="$ROOT/evaluation/v6/score_count_drift.py"
+
+PASS=0; FAIL=0; FAILS=()
+assert_exit() { # desc expected actual
+  if [ "$2" = "$3" ]; then
+    PASS=$((PASS + 1)); printf '  PASS  %s\n' "$1"
+  else
+    FAIL=$((FAIL + 1)); FAILS+=("$1")
+    printf '  FAIL  %s (want exit %s, got %s)\n' "$1" "$2" "$3"
+  fi
+}
+run_hook() { # message -> sets RC
+  local msg="$1"
+  printf '%s' "$(jq -n --arg m "$msg" \
+    '{hook_event_name:"Stop",stop_hook_active:false,last_assistant_message:$m}')" \
+    | bash "$HOOK" >/dev/null 2>&1
+  RC=$?
+}
+
+# SC1 + SC2 + SC3: scorer exits 0 only when precision == 1.0 (zero FP) over the
+# adversarial fixture set, with the seeded positives blocked.
+python3 "$SCORER" >/dev/null 2>&1
+assert_exit "scorer: 0 false positives (SC1), seeds blocked (SC2), abstain (SC3)" 0 "$?"
+
+# Hook end-to-end.
+run_hook "$(printf 'Six findings:\n- a\n- b\n- c\n- d\n- e')"
+assert_exit "hook blocks headline-vs-list mismatch (exit 2)" 2 "$RC"
+
+run_hook "$(printf 'Five findings:\n- a\n- b\n- c\n- d\n- e')"
+assert_exit "hook passes a correct count (exit 0)" 0 "$RC"
+
+run_hook "Coverage is 9/10 = 80% overall."
+assert_exit "hook blocks wrong fraction-percent (exit 2)" 2 "$RC"
+
+run_hook "$(printf '3 reasons: the top 2 are:\n- x\n- y')"
+assert_exit "hook abstains on nested-colon trap (exit 0)" 0 "$RC"
+
+# SC4 fail-open paths.
+printf 'not json at all' | bash "$HOOK" >/dev/null 2>&1
+assert_exit "fail-open on non-JSON input (SC4)" 0 "$?"
+
+printf '%s' "$(jq -n '{hook_event_name:"Stop",last_assistant_message:""}')" | bash "$HOOK" >/dev/null 2>&1
+assert_exit "fail-open on empty message (SC4)" 0 "$?"
+
+printf '%s' "$(jq -n --arg m "$(printf 'Six findings:\n- a\n- b')" \
+  '{hook_event_name:"Stop",stop_hook_active:true,last_assistant_message:$m}')" \
+  | bash "$HOOK" >/dev/null 2>&1
+assert_exit "re-entrancy guard: stop_hook_active=true never blocks" 0 "$?"
+
+# SC5 determinism: identical scorer output across two runs.
+A="$(python3 "$SCORER" 2>/dev/null)"
+Bb="$(python3 "$SCORER" 2>/dev/null)"
+[ "$A" = "$Bb" ]
+assert_exit "determinism: identical scorer output twice (SC5)" 0 "$?"
+
+echo ""
+echo "PASS=$PASS FAIL=$FAIL"
+if [ "$FAIL" -ne 0 ]; then
+  printf 'FAILURES: %s\n' "${FAILS[*]}"
+  exit 1
+fi
+echo "ALL TESTS PASSED"