blundergoat · mattyhansen · Jun 8, 2026 · Jun 2, 2026 · Jun 2, 2026 · Jun 3, 2026
diff --git a/.agents/hooks.json b/.agents/hooks.json
@@ -0,0 +1,32 @@
+{
+  "deny-dangerous": {
+    "enabled": true,
+    "PreToolUse": [
+      {
+        "matcher": "run_command|view_file|write_to_file|replace_file_content|multi_replace_file_content",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "bash -c 'gcd=\"$(git rev-parse --git-common-dir 2>/dev/null)\"; root=\"\"; case \"$gcd\" in */.git/modules/*|.git/modules/*) root=\"$(git rev-parse --show-toplevel 2>/dev/null || true)\" ;; /*|[A-Za-z]:/*|[A-Za-z]:\\\\*) gcd=\"${gcd//\\\\//}\"; root=\"$(dirname \"$gcd\")\" ;; *) root=\"$(git rev-parse --show-toplevel 2>/dev/null || true)\" ;; esac; [ -f \"$root/.goat-flow/hooks/deny-dangerous.sh\" ] || root=\"${CLAUDE_PROJECT_DIR:-}\"; [ -f \"$root/.goat-flow/hooks/deny-dangerous.sh\" ] || { printf '\\''{\"decision\":\"deny\",\"reason\":\"Policy hook unavailable: git repository root unavailable.\"}\\n'\\''; exit 0; }; cd \"$root\" || { printf '\\''{\"decision\":\"deny\",\"reason\":\"Policy hook unavailable: git repository root unavailable.\"}\\n'\\''; exit 0; }; bash \"$root/.goat-flow/hooks/deny-dangerous.sh\"'",
+            "timeout": 30
+          }
+        ]
+      }
+    ]
+  },
+  "gruff-code-quality": {
+    "enabled": true,
+    "PostToolUse": [
+      {
+        "matcher": "write_to_file|replace_file_content|multi_replace_file_content",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "bash -c 'gcd=\"$(git rev-parse --git-common-dir 2>/dev/null)\"; root=\"\"; case \"$gcd\" in */.git/modules/*|.git/modules/*) root=\"$(git rev-parse --show-toplevel 2>/dev/null || true)\" ;; /*|[A-Za-z]:/*|[A-Za-z]:\\\\*) gcd=\"${gcd//\\\\//}\"; root=\"$(dirname \"$gcd\")\" ;; *) root=\"$(git rev-parse --show-toplevel 2>/dev/null || true)\" ;; esac; [ -f \"$root/.goat-flow/hooks/gruff-code-quality.sh\" ] || root=\"${CLAUDE_PROJECT_DIR:-}\"; [ -f \"$root/.goat-flow/hooks/gruff-code-quality.sh\" ] || { printf '\\''{\"decision\":\"deny\",\"reason\":\"Policy hook unavailable: git repository root unavailable.\"}\\n'\\''; exit 0; }; cd \"$root\" || { printf '\\''{\"decision\":\"deny\",\"reason\":\"Policy hook unavailable: git repository root unavailable.\"}\\n'\\''; exit 0; }; bash \"$root/.goat-flow/hooks/gruff-code-quality.sh\"'",
+            "timeout": 30
+          }
+        ]
+      }
+    ]
+  }
+}
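
Reviewer note: both hook commands above inline the same root-resolution `case` statement, which is hard to read through two layers of JSON and shell quoting. The sketch below unrolls that branch logic into a standalone function so it can be checked in isolation. `resolve_root` and its two arguments are hypothetical names, not part of this PR; the function takes the output of `git rev-parse --git-common-dir` and `git rev-parse --show-toplevel` as plain strings rather than calling git. The PR uses bash's `${gcd//\\//}` to normalise backslashes; `tr` is swapped in here only so the sketch also runs under plain `sh`.

```sh
#!/bin/sh
# Hypothetical helper (not in the PR): mirrors the inline case statement.
#   $1 = output of `git rev-parse --git-common-dir`
#   $2 = output of `git rev-parse --show-toplevel` (used as the fallback)
resolve_root() {
  gcd="$1"
  toplevel="$2"
  root=""
  case "$gcd" in
    */.git/modules/*|.git/modules/*)
      # Submodule: the common dir sits under the superproject's .git/modules,
      # so its parent is not the work tree; use the top level instead.
      root="$toplevel" ;;
    /*|[A-Za-z]:/*|[A-Za-z]:\\*)
      # Absolute POSIX or Windows-drive path: normalise backslashes to
      # forward slashes, then take the directory that contains .git.
      gcd="$(printf '%s' "$gcd" | tr '\\' '/')"
      root="$(dirname "$gcd")" ;;
    *)
      # Relative common dir (for example ".git"): use the top level.
      root="$toplevel" ;;
  esac
  printf '%s\n' "$root"
}
```

Because the absolute-path branch takes the parent of the common git dir, a linked worktree would resolve to the main checkout rather than to the worktree itself. That appears to be the intent (the hook scripts live under the main checkout's `.goat-flow/hooks/`), but this is a reading of the command, not something the PR states.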
diff --git a/.agents/skills/goat-critique/SKILL.md b/.agents/skills/goat-critique/SKILL.md
@@ -1,13 +1,13 @@
 ---
 name: goat-critique
 description: "Use when a decision or analysis needs multi-lens critique to surface blind spots before shipping."
-goat-flow-skill-version: "1.9.0"
+goat-flow-skill-version: "1.10.1"
 ---
 # /goat-critique
 
 ## Shared Conventions
 
-Read `.goat-flow/skill-reference/skill-preamble.md` and `.goat-flow/skill-reference/skill-conventions.md` for shared conventions before proceeding.
+Read `.goat-flow/skill-docs/skill-preamble.md` and `.goat-flow/skill-docs/skill-conventions.md` for shared conventions before proceeding.
 
 ## When to Use
 
@@ -22,7 +22,7 @@ Use when a concrete artifact deserves multi-perspective critique before shipping
 **NOT this skill (pre-invocation routing):** Use when deciding which skill to invoke, not after explicit invocation.
 - No artifact exists yet → create one first (goat-review, goat-debug, etc.)
 - Simple factual question → answer directly
-- Trivial artifact (hotfix, single-file change) → consider goat-review instead
+- Trivial artifact (hotfix, single-file change) → consider goat-review instead *(pre-invocation only; once `/goat-critique` is invoked, it runs the full protocol regardless of size — see "Direct invocation is binding" below)*
 
 | Excuse | Reality |
 |--------|---------|
@@ -42,7 +42,7 @@ goat-critique runs in one mode: full delegated, Phases 1-5 plus 5.5 meta-audit a
 **Intake checklist:**
 - Confirm the artifact exists and is concrete (a file, a plan document, a specific set of findings - not a vague idea).
 - Select the critique rubric for the artifact type (see Critique Rubrics below). If unclear, ask the user.
-- Use the preamble's grep-first learning-loop retrieval on relevant `.goat-flow/footguns/` and `.goat-flow/lessons/`; record explicit misses instead of broad-loading buckets.
+- Use the preamble's grep-first learning-loop retrieval on relevant `.goat-flow/learning-loop/footguns/` and `.goat-flow/learning-loop/lessons/`; record explicit misses instead of broad-loading buckets.
 - Delegation consent: proceed directly to Phase 1. Skill-chained entry: skip intake confirmation, use caller context; still run retrieval + rubric selection. All phases (1-5 + 5.5 + 5.6) always run.
 - **Differential mode detection:** Check `.goat-flow/logs/critiques/` for prior critiques of the same artifact slug within 30 days. If found, offer differential mode: A/B receive prior log + artifact diff; C stays cold. Phase 5 adds delta counts and `[diff-of: <prior-uuid>]`.
 - **Read context map:** Read the selected rubric's context map (see `references/rubric-examples.md`) and pass to each sub-agent's spawn directive.

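Reviewer note: the intake checklist above asks for grep-first retrieval with explicit misses recorded, but the preamble that defines the procedure is not part of this diff. The sketch below is one possible reading of that step, assuming a case-insensitive fixed-string search over the two learning-loop buckets. `grep_first` and the miss message format are hypothetical, not taken from the preamble.

```sh
#!/bin/sh
# Hypothetical helper (not in the PR): list files that mention a term,
# or print an explicit miss instead of loading the whole bucket.
#   $1 = search term, remaining args = directories to search
grep_first() {
  term="$1"
  shift
  # -r recurse, -i ignore case, -l list matching files only,
  # -F treat the term as a fixed string rather than a regex
  hits="$(grep -rilF -- "$term" "$@" 2>/dev/null || true)"
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
  else
    printf 'retrieval miss: "%s"\n' "$term"
  fi
}
```

A call such as `grep_first "worktree" .goat-flow/learning-loop/footguns .goat-flow/learning-loop/lessons` would then yield either a short list of files to read or a single miss line to record.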
diff --git a/.agents/skills/goat-critique/references/rubric-examples.md b/.agents/skills/goat-critique/references/rubric-examples.md
@@ -1,5 +1,5 @@
 ---
-goat-flow-reference-version: "1.9.0"
+goat-flow-reference-version: "1.10.1"
 ---
 # Critique Rubric Examples (Reference Pack)
 
@@ -10,12 +10,12 @@ goat-flow-reference-version: "1.9.0"
 Each rubric has a context map that Step 0 reads and passes to sub-agent spawn directives. Footgun/lesson entries mean targeted grep-first hits from those buckets, not whole-directory reads. Agent C's isolation enforcement (Phase 2 step 1 grep check) is unchanged regardless of context map. Generic fallback uses the default split.
 
 ### Plan
-- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/decisions/`
-- **B:** `.goat-flow/tasks/.active`, `git log --oneline -20`, milestone logs
+- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/learning-loop/decisions/`
+- **B:** `.goat-flow/plans/.active`, `git log --oneline -20`, milestone logs
 - **C:** [] (isolation enforced)
 
 ### Security assessment
-- **A:** targeted grep-first footgun/lesson hits, threat-model docs, `.goat-flow/decisions/`
+- **A:** targeted grep-first footgun/lesson hits, threat-model docs, `.goat-flow/learning-loop/decisions/`
 - **B:** `git log --oneline -20`, config.yaml, dependency manifests
 - **C:** [] (isolation enforced)
 
@@ -25,17 +25,17 @@ Each rubric has a context map that Step 0 reads and passes to sub-agent spawn di
 - **C:** [] (isolation enforced)
 
 ### Review findings
-- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/decisions/`
+- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/learning-loop/decisions/`
 - **B:** `git log --oneline -20`, config.yaml, CI logs
 - **C:** [] (isolation enforced)
 
 ### Test strategy
-- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/decisions/`
+- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/learning-loop/decisions/`
 - **B:** `git log --oneline -20`, config.yaml, test manifests
 - **C:** [] (isolation enforced)
 
 ### Architecture/refactor
-- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/decisions/`, dependency maps
+- **A:** targeted grep-first footgun/lesson hits, `.goat-flow/learning-loop/decisions/`, dependency maps
 - **B:** `git log --oneline -20`, config.yaml, module boundaries
 - **C:** [] (isolation enforced)
 

diff --git a/.agents/skills/goat-critique/references/sub-agent-directives.md b/.agents/skills/goat-critique/references/sub-agent-directives.md
@@ -1,5 +1,5 @@
 ---
-goat-flow-reference-version: "1.9.0"
+goat-flow-reference-version: "1.10.1"
 ---
 # Critique Sub-Agent Directives (Reference Pack)
 

diff --git a/.agents/skills/goat-debug/SKILL.md b/.agents/skills/goat-debug/SKILL.md
@@ -1,14 +1,14 @@
 ---
 name: goat-debug
-description: "Use when diagnosing a bug, unexpected behaviour, or system failure that needs structured investigation."
-goat-flow-skill-version: "1.9.0"
+description: "Use when diagnosing a bug, unexpected behaviour, system failure, or unfamiliar code that needs structured investigation."
+goat-flow-skill-version: "1.10.1"
 ---
 # /goat-debug
 
 ## Shared Conventions
 
-Read `.goat-flow/skill-reference/skill-preamble.md` for shared conventions.
-On full-depth, also read `.goat-flow/skill-reference/skill-conventions.md`.
+Read `.goat-flow/skill-docs/skill-preamble.md` for shared conventions.
+On full-depth, also read `.goat-flow/skill-docs/skill-conventions.md`.
 
 ## When to Use
 
@@ -33,10 +33,10 @@ Use when diagnosing a bug or understanding unfamiliar code. For onboarding, use
 If depth is pre-decided, proceed. Otherwise confirm quick vs full, or auto-detect from available input.
 If vague, ask about: goal, symptom/error message, area involved.
 
-**Quick path:** diagnose and report; **full path:** run D1–D1.5–D2–D3–D4.
-**Footgun check:** Use the preamble's grep-first learning-loop retrieval on `.goat-flow/footguns/` and `.goat-flow/lessons/` for the target area. Surface matches or an explicit retrieval miss; do not broad-load either bucket.
+**Quick path:** diagnose and report; minimum evidence is a primary file read, 2 hypothesis categories tested, and either a reproduction attempt or a stated no-repro gap. **Full path:** run D1–D1.5–D2–D3–D4.
+**Footgun check:** Use the preamble's grep-first learning-loop retrieval on `.goat-flow/learning-loop/footguns/` and `.goat-flow/learning-loop/lessons/` for the target area. Surface matches or an explicit retrieval miss; do not broad-load either bucket.
 
-**Browser evidence detection:** Does the request reference a URL, local HTML page, localhost route, screenshot, UI element, visual rendering issue, browser DevTools output, or browser console/network symptom? If yes, read `.goat-flow/skill-playbooks/browser-use.md` for browser evidence tools. Check with `command -v browser-use || command -v browser-use-python`. If not installed, offer to install it (`pip install browser-use` or `scripts/install-browser-tools.sh`) and wait for the user's response - never install it without approval or silently fall back. If the user declines or installation fails, use the manual fallback in the reference.
+**Browser evidence detection:** Does the request reference a URL, local HTML page, localhost route, screenshot, UI element, visual rendering issue, browser DevTools output, or browser console/network symptom? If yes, read `.goat-flow/skill-docs/playbooks/browser-use.md` for browser evidence tools. Check with `command -v browser-use || command -v browser-use-python`. If not installed, offer to install it (`pip install browser-use`) and wait for the user's response - never install it without approval or silently fall back. If the user declines or installation fails, use the manual fallback in the reference.
 
 
 ## Diagnose Mode
@@ -49,7 +49,7 @@ Write 2-3 hypotheses spanning at least 2 of: Data, Logic, Timing, Environment, C
 
 **Multi-component failures** (CI → build → deploy, request → middleware → handler → DB, etc.): instrument each boundary before proposing any fix. For each component boundary, log what data enters and what exits, run once to gather evidence showing WHERE the chain breaks, THEN investigate the specific failing component. Do not guess the failing layer.
 
-**UI-visible bugs:** After writing hypotheses, use browser evidence to confirm or eliminate UI-related hypotheses. Follow the workflow in `.goat-flow/skill-playbooks/browser-use.md`. Browser output is OBSERVED; interpretations remain INFERRED until mapped to `file + semantic anchor`.
+**UI-visible bugs:** After writing hypotheses, use browser evidence to confirm or eliminate UI-related hypotheses. Follow the workflow in `.goat-flow/skill-docs/playbooks/browser-use.md`. Browser output is OBSERVED; interpretations remain INFERRED until mapped to `file + semantic anchor`.
 
 **Can't reproduce after 5 file reads?** Log what you checked, suggest logging additions, ask for more context.
 
@@ -98,7 +98,7 @@ Rerun the **original reproduction** from D2 - a code change is not a fix until t
 
 **3-fix abort rule:** If three independent fixes have failed to resolve the symptom, STOP and reconsider whether the architecture or the root-cause hypothesis is wrong. Do not attempt a fourth patch without first re-entering D1 with a fresh hypothesis set.
 
-**UI bugs:** Rerun the original browser reproduction post-fix. Capture screenshot/state showing the symptom is gone. Follow `.goat-flow/skill-playbooks/browser-use.md`.
+**UI bugs:** Rerun the original browser reproduction post-fix. Capture screenshot/state showing the symptom is gone. Follow `.goat-flow/skill-docs/playbooks/browser-use.md`.
 
 **Proof Gate:** Apply the Proof Gate from `skill-preamble.md` to the "fixed" claim - rerun the original repro, cite the literal output, and downgrade to **UNVERIFIED** if the session cannot execute the proof.