diff --git a/.github/gstack-review/compile-instructions.md b/.github/gstack-review/compile-instructions.md new file mode 100644 index 00000000..903a0959 --- /dev/null +++ b/.github/gstack-review/compile-instructions.md @@ -0,0 +1,109 @@ +# Prompt Compilation Instructions + +You are a **prompt compiler**. Your job is to read two gstack skill files and a +triage classification, then produce a single self-contained system prompt for a +headless CI code reviewer. + +## Inputs You Will Receive + +1. **`review/SKILL.md`** — gstack's interactive staff engineer review skill. + Contains the review philosophy, checklists, finding classifications, Greptile + integration, Codex integration, telemetry hooks, and interactive conversation + patterns. + +2. **`plan-eng-review/SKILL.md`** — gstack's engineering review skill. + Contains architecture heuristics, data flow analysis patterns, test review + methodology, failure mode thinking, and engineering principles. + +3. **Triage JSON** — The classification output from Step 1, containing: + `pr_type`, `risk_level`, `risk_areas`, `review_context`, `suggested_review_depth`, + `conversation_summary`, `needs_architecture_review`, `needs_security_review`, + `key_files`, and PR metadata. + +4. **Review output schema** — The JSON schema that the final review must conform to. + +## What to Extract from the Skill Files + +### From `review/SKILL.md`, extract and adapt: +- The **reviewer persona** and mindset (paranoid staff engineer, structural audit) +- The **review checklist categories** (what to look for in each dimension) +- The **finding severity classification** rules (critical, major, minor, nit) +- The **auto-fix vs flag** decision criteria (adapt to: flag everything, fix nothing — this is CI) +- Any **security-specific checks** mentioned (OWASP patterns, auth, injection, etc.) +- The **completeness audit** patterns (forgotten enum handlers, missing consumers, etc.) 
+ +### From `plan-eng-review/SKILL.md`, extract and adapt: +- The **architecture heuristics** (boring by default, two-week smell test, etc.) +- The **data flow tracing** methodology +- The **state machine / state transition** analysis approach +- The **failure mode thinking** (what happens when dependencies are down) +- The **test review criteria** (systems over heroes, coverage philosophy) +- The **engineering principles** (error budgets, glue work awareness, etc.) + +### Ignore / strip out from both files: +- All `bash` preamble blocks (session management, telemetry, update checks) +- All `AskUserQuestion` / interactive conversation patterns +- All Greptile integration logic +- All Codex / OpenAI integration logic +- All `gstack-config` / `gstack-review-log` commands +- All proactive skill suggestion logic +- All references to `~/.gstack/` directories +- All `STOP` / `WAIT` / conversation flow control +- All telemetry event logging +- Browser / screenshot / QA related sections +- Version check / upgrade logic + +## How to Compile the Prompt + +### 1. Set the persona +Based on the triage `suggested_review_depth`: +- **`quick`**: Concise reviewer. Focus on correctness and obvious bugs only. + Skip deep architecture analysis. Use principles from `review/SKILL.md` only. +- **`standard`**: Full 5-dimension review. Use both skill files. +- **`deep`**: Thorough review with edge case analysis. Emphasize failure modes + and data flow tracing from `plan-eng-review/SKILL.md`. +- **`adversarial`**: Everything above plus attacker mindset. Add explicit + instructions to think like a malicious user, a chaos engineer, and a + tired on-call engineer at 3 AM. + +### 2. 
Emphasize relevant dimensions +Use the triage `risk_areas` to weight the review: +- If `security` is in risk_areas → expand the security checklist, add OWASP specifics +- If `database` → emphasize migration safety, query performance, data integrity +- If `api_contract` → focus on breaking changes, versioning, consumer impact +- If `performance` → add N+1 detection, pagination checks, resource leak patterns +- If `breaking_change` → require rollback analysis + +### 3. Handle re-review context +If `review_context` is `re_review` or `follow_up`: +- Include the `conversation_summary` from triage +- Instruct the reviewer to specifically check whether prior feedback was addressed +- Weight completeness dimension higher + +### 4. Scope the file focus +Use the triage `key_files` list to instruct the reviewer which files deserve +the closest attention, while still reviewing the full diff. + +### 5. Include architecture review conditionally +Only include the `plan-eng-review` architecture analysis section if +`needs_architecture_review` is `true` in the triage. + +### 6. Embed the output schema +Include the COMPLETE JSON schema in the compiled prompt so the reviewer +knows exactly what structure to produce. Remind it that: +- Output must be ONLY valid JSON, no markdown fences, no preamble +- Every finding needs file, line, severity, category, title, description +- The `suggested_fix` field should have concrete code when possible +- Scores are integers 0-10 +- Summary is 2-3 sentences, human-readable +- Confidence reflects certainty about the overall verdict + +## Output Format + +Your output must be ONLY the compiled system prompt text. No markdown fences +around it. No explanation. No preamble like "Here is the compiled prompt:". +Just the raw prompt text that will be fed directly to the reviewer model. + +The compiled prompt should be self-contained — it must not reference any +external files, URLs, or tools. Everything the reviewer needs must be +inline in the prompt. 
\ No newline at end of file diff --git a/.github/gstack-review/review-schema.json b/.github/gstack-review/review-schema.json new file mode 100644 index 00000000..56e3b503 --- /dev/null +++ b/.github/gstack-review/review-schema.json @@ -0,0 +1,141 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "title": "GStack PR Review Result", + "description": "Structured output from the AI code review step. This is the contract between Step 3 (Claude review) and Step 4 (action routing).", + "type": "object", + "required": ["verdict", "confidence", "scores", "overall_score", "findings", "summary", "review_metadata"], + "properties": { + "verdict": { + "type": "string", + "enum": ["approve", "request_changes", "comment_only"], + "description": "The review decision. 'approve' means the PR is ready to merge. 'request_changes' means issues must be addressed. 'comment_only' means feedback is provided but merge is not blocked." + }, + "confidence": { + "type": "number", + "minimum": 0, + "maximum": 1, + "description": "Confidence in the verdict. Below 0.7 triggers human-reviewer escalation." + }, + "scores": { + "type": "object", + "required": ["design", "security", "performance", "test_coverage", "completeness"], + "properties": { + "design": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Architecture fit, abstraction quality, readability" + }, + "security": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "OWASP alignment, input validation, auth, secrets" + }, + "performance": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Query efficiency, resource management, scalability" + }, + "test_coverage": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "New code paths tested, edge cases, regression tests" + }, + "completeness": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Does the diff match the PR description? Missing pieces?" 
+ } + }, + "additionalProperties": false + }, + "overall_score": { + "type": "number", + "minimum": 0, + "maximum": 10, + "description": "Weighted average. Security and completeness weigh more for high-risk PRs." + }, + "findings": { + "type": "array", + "items": { + "type": "object", + "required": ["severity", "category", "file", "line", "title", "description"], + "properties": { + "severity": { + "type": "string", + "enum": ["critical", "major", "minor", "nit"] + }, + "category": { + "type": "string", + "enum": ["design", "security", "performance", "test_coverage", "completeness", "correctness", "reliability"] + }, + "file": { + "type": "string", + "description": "Relative file path from repo root" + }, + "line": { + "type": "integer", + "minimum": 1, + "description": "Line number in the file (use the new file line numbers from the diff)" + }, + "title": { + "type": "string", + "maxLength": 120, + "description": "One-line summary of the finding" + }, + "description": { + "type": "string", + "maxLength": 1000, + "description": "Detailed explanation of the issue and its impact" + }, + "suggested_fix": { + "type": "string", + "description": "Concrete code suggestion or fix approach. Optional but strongly encouraged." 
+ } + }, + "additionalProperties": false + } + }, + "summary": { + "type": "string", + "maxLength": 500, + "description": "2-3 sentence human-readable summary of the review outcome" + }, + "review_metadata": { + "type": "object", + "required": ["pr_type", "review_depth", "files_reviewed", "model_used", "prompt_version"], + "properties": { + "pr_type": { + "type": "string", + "description": "PR type from triage step" + }, + "review_depth": { + "type": "string", + "enum": ["quick", "standard", "deep", "adversarial"], + "description": "Review depth from triage step" + }, + "files_reviewed": { + "type": "integer", + "description": "Number of files included in the review" + }, + "model_used": { + "type": "string", + "description": "Claude model used for the review" + }, + "prompt_version": { + "type": "string", + "description": "Version of the prompt template used" + }, + "triage_model": { + "type": "string", + "description": "Model used for the triage classification step" + }, + "triage_source": { + "type": "string", + "enum": ["model", "heuristic", "heuristic_fallback"], + "description": "Whether triage used the HF model or fell back to heuristics" + }, + "duration_seconds": { + "type": "number", + "description": "Total review duration in seconds" + } + }, + "additionalProperties": false + } + }, + "additionalProperties": false +} \ No newline at end of file diff --git a/.github/gstack-review/route-action.sh b/.github/gstack-review/route-action.sh new file mode 100644 index 00000000..5161c543 --- /dev/null +++ b/.github/gstack-review/route-action.sh @@ -0,0 +1,267 @@ +#!/usr/bin/env bash +set -euo pipefail + +# --------------------------------------------------------------------------- +# Step 4 — Action Routing +# +# Reads the structured review JSON from Step 3 and takes action: +# - Approve / Request Changes / Comment +# - Label the PR +# - Optionally merge (for auto-mergeable low-risk PRs) +# - Post inline comments for findings +# 
--------------------------------------------------------------------------- + +REVIEW_JSON="${1:?Usage: route-action.sh }" +TRIAGE_JSON="${2:?Usage: route-action.sh }" + +REPO="${GITHUB_REPOSITORY:?GITHUB_REPOSITORY not set}" +PR_NUMBER="${PR_NUMBER:?PR_NUMBER not set}" +AUTO_MERGE_ENABLED="${AUTO_MERGE_ENABLED:-false}" + +# Parse review result +verdict=$(jq -r '.verdict' "$REVIEW_JSON") +confidence=$(jq -r '.confidence' "$REVIEW_JSON") +overall_score=$(jq -r '.overall_score' "$REVIEW_JSON") +summary=$(jq -r '.summary' "$REVIEW_JSON") +critical_count=$(jq '[.findings[] | select(.severity == "critical")] | length' "$REVIEW_JSON") +major_count=$(jq '[.findings[] | select(.severity == "major")] | length' "$REVIEW_JSON") +minor_count=$(jq '[.findings[] | select(.severity == "minor")] | length' "$REVIEW_JSON") +nit_count=$(jq '[.findings[] | select(.severity == "nit")] | length' "$REVIEW_JSON") +has_security_critical=$(jq '[.findings[] | select(.severity == "critical" and .category == "security")] | length' "$REVIEW_JSON") + +# Parse triage +pr_type=$(jq -r '.pr_type' "$TRIAGE_JSON") +auto_mergeable=$(jq -r '.auto_mergeable' "$TRIAGE_JSON") +risk_level=$(jq -r '.risk_level' "$TRIAGE_JSON") +review_depth=$(jq -r '.suggested_review_depth' "$TRIAGE_JSON") + +# Score table for the comment +design_score=$(jq -r '.scores.design' "$REVIEW_JSON") +security_score=$(jq -r '.scores.security' "$REVIEW_JSON") +performance_score=$(jq -r '.scores.performance' "$REVIEW_JSON") +test_score=$(jq -r '.scores.test_coverage' "$REVIEW_JSON") +completeness_score=$(jq -r '.scores.completeness' "$REVIEW_JSON") +review_model=$(jq -r '.review_metadata.model_used' "$REVIEW_JSON") +triage_source=$(jq -r '.review_metadata.triage_source // "unknown"' "$REVIEW_JSON") + +# --------------------------------------------------------------------------- +# Build the review comment body +# --------------------------------------------------------------------------- +build_comment() { + cat < +Review metadata + 
+- Model: ${review_model} +- Triage: ${triage_source} +- Prompt version: $(jq -r '.review_metadata.prompt_version' "$REVIEW_JSON") + + + +--- +*Automated review by [gstack-pr-pipeline](https://github.com/garrytan/gstack) • Scores are AI-generated and should be verified by a human reviewer* +EOF +} + +# --------------------------------------------------------------------------- +# Post inline review comments for findings +# --------------------------------------------------------------------------- +post_inline_comments() { + local event="COMMENT" + local comments_json="[]" + + # Build review comments array for findings that have file and line + comments_json=$(jq -c '[ + .findings[] + | select(.file != "" and .line > 0) + | { + path: .file, + line: .line, + body: ("**[\(.severity | ascii_upcase)]** \(.title)\n\n\(.description)" + + (if .suggested_fix then "\n\n💡 **Suggested fix:**\n```\n\(.suggested_fix)\n```" else "" end)) + } + ]' "$REVIEW_JSON") + + local num_comments + num_comments=$(echo "$comments_json" | jq 'length') + + if [ "$num_comments" -eq 0 ]; then + echo "::notice::No inline comments to post" + return + fi + + echo "::notice::Posting ${num_comments} inline review comments" + + # Determine review event type based on verdict + case "$verdict" in + approve) event="APPROVE" ;; + request_changes) event="REQUEST_CHANGES" ;; + *) event="COMMENT" ;; + esac + + # Post as a pull request review with inline comments + local review_body + review_body=$(build_comment) + + local payload + payload=$(jq -n \ + --arg body "$review_body" \ + --arg event "$event" \ + --argjson comments "$comments_json" \ + --arg commit "$(gh pr view "$PR_NUMBER" --repo "$REPO" --json headRefOid -q '.headRefOid')" \ + '{ + body: $body, + event: $event, + commit_id: $commit, + comments: $comments + }') + + gh api \ + --method POST \ + "/repos/${REPO}/pulls/${PR_NUMBER}/reviews" \ + --input - <<< "$payload" +} + +# --------------------------------------------------------------------------- 
+# Label management +# --------------------------------------------------------------------------- +add_label() { + local label="$1" + gh pr edit "$PR_NUMBER" --repo "$REPO" --add-label "$label" 2>/dev/null || \ + echo "::warning::Could not add label '${label}' — it may not exist. Create it in repo settings." +} + +remove_label() { + local label="$1" + gh pr edit "$PR_NUMBER" --repo "$REPO" --remove-label "$label" 2>/dev/null || true +} + +ensure_labels_exist() { + local labels=("ai-approved" "ai-review-passed" "needs-work" "needs-human-review" "security-review-needed" "auto-merge-candidate") + for label in "${labels[@]}"; do + gh label create "$label" --repo "$REPO" --force --description "Auto-managed by gstack-pr-pipeline" 2>/dev/null || true + done +} + +# --------------------------------------------------------------------------- +# Decision logic +# --------------------------------------------------------------------------- +echo "::group::Review Decision" +echo "Verdict: ${verdict}" +echo "Confidence: ${confidence}" +echo "Overall Score: ${overall_score}" +echo "Critical: ${critical_count}, Major: ${major_count}" +echo "PR Type: ${pr_type}, Risk: ${risk_level}" +echo "Auto-mergeable (triage): ${auto_mergeable}" +echo "Auto-merge enabled (repo): ${AUTO_MERGE_ENABLED}" +echo "::endgroup::" + +# Ensure labels exist +ensure_labels_exist + +# Clean up any stale labels from previous runs +remove_label "ai-approved" +remove_label "ai-review-passed" +remove_label "needs-work" +remove_label "needs-human-review" +remove_label "security-review-needed" +remove_label "auto-merge-candidate" + +# --- Route 1: Security escalation (always, regardless of verdict) --- +if [ "$has_security_critical" -gt 0 ]; then + echo "::warning::Critical security finding detected — escalating" + add_label "security-review-needed" + # Don't auto-merge, even if everything else looks fine + auto_mergeable="false" +fi + +# --- Route 2: Low confidence — escalate to human --- 
+confidence_threshold=$(echo "$confidence" | awk '{print ($1 < 0.7) ? "low" : "ok"}') +if [ "$confidence_threshold" = "low" ]; then + echo "::notice::Low confidence (${confidence}) — posting comment only, requesting human review" + post_inline_comments + add_label "needs-human-review" + echo "action=comment_only" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 3: Approve + auto-merge (highest confidence, lowest risk) --- +if [ "$verdict" = "approve" ] && \ + [ "$critical_count" -eq 0 ] && \ + [ "$major_count" -eq 0 ] && \ + [ "$auto_mergeable" = "true" ] && \ + [ "$(echo "$overall_score >= 9" | bc -l)" -eq 1 ]; then + echo "::notice::Auto-merge eligible: score ${overall_score}, no critical/major findings, triage approved" + post_inline_comments + add_label "ai-approved" + + # Only actually merge if the repo-level toggle is enabled + if [ "$AUTO_MERGE_ENABLED" = "true" ]; then + add_label "auto-merge-candidate" + # Enable auto-merge (squash) — GitHub will merge once all other checks pass + gh pr merge "$PR_NUMBER" --repo "$REPO" --squash --auto 2>/dev/null && \ + echo "::notice::Auto-merge enabled for PR #${PR_NUMBER}" || \ + echo "::warning::Could not enable auto-merge — check branch protection settings" + echo "action=auto_merge" >> "$GITHUB_OUTPUT" + else + echo "::notice::Auto-merge is disabled (AUTO_MERGE_ENABLED=$AUTO_MERGE_ENABLED). PR approved but merge is manual." 
+ add_label "ai-review-passed" + echo "action=approve" >> "$GITHUB_OUTPUT" + fi + exit 0 +fi + +# --- Route 4: Approve (good score, no blockers, but not auto-merge eligible) --- +if [ "$verdict" = "approve" ] && \ + [ "$critical_count" -eq 0 ] && \ + [ "$(echo "$overall_score >= 7" | bc -l)" -eq 1 ]; then + echo "::notice::Approved: score ${overall_score}, no critical findings" + post_inline_comments + add_label "ai-review-passed" + echo "action=approve" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 5: Comment only (moderate issues, non-blocking) --- +if [ "$verdict" = "comment_only" ] || \ + ([ "$critical_count" -eq 0 ] && [ "$major_count" -le 2 ]); then + echo "::notice::Comment-only review: flagging ${major_count} major, ${minor_count} minor findings" + post_inline_comments + add_label "needs-human-review" + echo "action=comment_only" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 6: Request changes (default for anything with critical/major findings) --- +echo "::notice::Requesting changes: ${critical_count} critical, ${major_count} major findings" +post_inline_comments +add_label "needs-work" +echo "action=request_changes" >> "$GITHUB_OUTPUT" \ No newline at end of file diff --git a/.github/gstack-review/triage.py b/.github/gstack-review/triage.py new file mode 100644 index 00000000..397b9b25 --- /dev/null +++ b/.github/gstack-review/triage.py @@ -0,0 +1,498 @@ +#!/usr/bin/env python3 +""" +Step 1 — PR Triage via HuggingFace Inference API (Qwen2.5-3B-Instruct) + +Classifies a PR by type, risk, and review depth needed. +Inputs: PR metadata, diff, review comments, conversation, linked issues. +Output: triage JSON to stdout. 
+""" + +import json +import os +import sys +import urllib.request +import urllib.error + +# --------------------------------------------------------------------------- +# Config +# --------------------------------------------------------------------------- +HF_MODEL = os.getenv("HF_TRIAGE_MODEL", "Qwen/Qwen2.5-3B-Instruct") +HF_TOKEN = os.getenv("HF_TOKEN", "") +GH_TOKEN = os.getenv("GITHUB_TOKEN", "") +REPO = os.getenv("GITHUB_REPOSITORY", "") # owner/repo +PR_NUMBER = os.getenv("PR_NUMBER", "") +MAX_DIFF_CHARS = 12_000 # keep diff under token budget for a 3B model +MAX_COMMENT_CHARS = 4_000 +MAX_ISSUE_CHARS = 2_000 + +# --------------------------------------------------------------------------- +# GitHub API helpers +# --------------------------------------------------------------------------- +def gh_api(path: str) -> dict | list | str: + """GET from GitHub REST API v3.""" + url = f"https://api.github.com{path}" + req = urllib.request.Request(url, headers={ + "Authorization": f"Bearer {GH_TOKEN}", + "Accept": "application/vnd.github.v3+json", + "X-GitHub-Api-Version": "2022-11-28", + }) + try: + with urllib.request.urlopen(req, timeout=30) as resp: + return json.loads(resp.read().decode()) + except urllib.error.HTTPError as e: + print(f"::warning::GitHub API error for {path}: {e.code}", file=sys.stderr) + return {} if "pulls" in path else [] + + +def gh_api_raw(path: str) -> str: + """GET raw diff from GitHub API.""" + url = f"https://api.github.com{path}" + req = urllib.request.Request(url, headers={ + "Authorization": f"Bearer {GH_TOKEN}", + "Accept": "application/vnd.github.v3.diff", + "X-GitHub-Api-Version": "2022-11-28", + }) + try: + with urllib.request.urlopen(req, timeout=60) as resp: + return resp.read().decode(errors="replace") + except urllib.error.HTTPError as e: + print(f"::warning::GitHub API diff error: {e.code}", file=sys.stderr) + return "" + + +# --------------------------------------------------------------------------- +# Gather PR context +# 
--------------------------------------------------------------------------- +def gather_context() -> dict: + """Collect everything the triage model needs.""" + pr = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}") + if not pr: + sys.exit("ERROR: could not fetch PR metadata") + + # Basic metadata + ctx = { + "title": pr.get("title", ""), + "body": (pr.get("body") or "")[:2000], + "author": pr.get("user", {}).get("login", "unknown"), + "base_branch": pr.get("base", {}).get("ref", "main"), + "head_branch": pr.get("head", {}).get("ref", ""), + "labels": [l["name"] for l in pr.get("labels", [])], + "draft": pr.get("draft", False), + "additions": pr.get("additions", 0), + "deletions": pr.get("deletions", 0), + "changed_files_count": pr.get("changed_files", 0), + } + + # Changed files list + files = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}/files") + ctx["changed_files"] = [ + {"name": f["filename"], "status": f["status"], "additions": f["additions"], "deletions": f["deletions"]} + for f in (files if isinstance(files, list) else []) + ][:50] # cap at 50 files + + # Diff (truncated) + diff = gh_api_raw(f"/repos/{REPO}/pulls/{PR_NUMBER}") + ctx["diff_truncated"] = diff[:MAX_DIFF_CHARS] + ctx["diff_total_chars"] = len(diff) + + # Review comments (inline review threads) + review_comments = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}/comments") + if isinstance(review_comments, list): + ctx["review_comments"] = [ + {"user": c.get("user", {}).get("login", ""), "body": (c.get("body") or "")[:500], "path": c.get("path", "")} + for c in review_comments + ][:20] + else: + ctx["review_comments"] = [] + + # Issue/PR conversation comments + issue_comments = gh_api(f"/repos/{REPO}/issues/{PR_NUMBER}/comments") + if isinstance(issue_comments, list): + ctx["conversation"] = [ + {"user": c.get("user", {}).get("login", ""), "body": (c.get("body") or "")[:500]} + for c in issue_comments + ][:20] + else: + ctx["conversation"] = [] + + # Linked issues (parse from PR body — GitHub doesn't have a direct 
API for this) + ctx["linked_issues"] = extract_linked_issues(ctx["body"]) + + # Fetch linked issue details + linked_issue_details = [] + for issue_num in ctx["linked_issues"][:5]: # cap at 5 + issue = gh_api(f"/repos/{REPO}/issues/{issue_num}") + if isinstance(issue, dict) and issue.get("title"): + linked_issue_details.append({ + "number": issue_num, + "title": issue.get("title", ""), + "body": (issue.get("body") or "")[:500], + "labels": [l["name"] for l in issue.get("labels", [])], + }) + ctx["linked_issue_details"] = linked_issue_details + + return ctx + + +def extract_linked_issues(body: str) -> list[int]: + """Extract issue numbers from common linking patterns in PR body.""" + import re + patterns = [ + r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", + r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+https?://github\.com/[^/]+/[^/]+/issues/(\d+)", + r"#(\d+)", # generic issue references + ] + issues = [] + for pattern in patterns: + for match in re.finditer(pattern, body, re.IGNORECASE): + num = int(match.group(1)) + if num not in issues and num != int(PR_NUMBER): + issues.append(num) + return issues[:10] + + +# --------------------------------------------------------------------------- +# HuggingFace Inference API call +# --------------------------------------------------------------------------- +def call_hf_model(prompt: str) -> str: + """Call HuggingFace Inference API with the triage prompt.""" + url = f"https://router.huggingface.co/novita/v3/openai/chat/completions" + + payload = json.dumps({ + "model": HF_MODEL, + "messages": [ + {"role": "system", "content": SYSTEM_PROMPT}, + {"role": "user", "content": prompt}, + ], + "max_tokens": 1024, + "temperature": 0.1, # near-deterministic for classification + }).encode() + + headers = { + "Content-Type": "application/json", + } + if HF_TOKEN: + headers["Authorization"] = f"Bearer {HF_TOKEN}" + + req = urllib.request.Request(url, data=payload, headers=headers, method="POST") + + try: + with 
urllib.request.urlopen(req, timeout=60) as resp: + result = json.loads(resp.read().decode()) + return result["choices"][0]["message"]["content"] + except urllib.error.HTTPError as e: + body = e.read().decode(errors="replace") + print(f"::error::HuggingFace API error {e.code}: {body}", file=sys.stderr) + sys.exit(1) + except Exception as e: + print(f"::error::HuggingFace API call failed: {e}", file=sys.stderr) + sys.exit(1) + + +# --------------------------------------------------------------------------- +# Prompts +# --------------------------------------------------------------------------- +SYSTEM_PROMPT = """You are a PR triage classifier. You analyze pull request metadata, diffs, +review comments, conversations, and linked issues to produce a structured classification. + +You MUST respond with ONLY valid JSON — no markdown, no explanation, no preamble. + +JSON schema: +{ + "pr_type": "feature" | "bugfix" | "refactor" | "dependency" | "docs" | "config" | "test" | "hotfix", + "size": "trivial" | "small" | "medium" | "large" | "massive", + "risk_level": "low" | "medium" | "high" | "critical", + "risk_areas": ["security", "database", "api_contract", "auth", "payments", "data_loss", "performance", "breaking_change"], + "review_context": "fresh" | "re_review" | "follow_up" | "draft", + "conversation_summary": "One sentence summarizing review conversation so far, or empty string if none", + "needs_architecture_review": true | false, + "needs_security_review": true | false, + "auto_mergeable": true | false, + "suggested_review_depth": "quick" | "standard" | "deep" | "adversarial", + "key_files": ["list of most important changed files to focus review on"], + "reasoning": "Brief explanation of classification decisions" +} + +Classification rules: +- trivial: <=10 lines, docs/config only +- small: <=50 lines changed +- medium: 51-300 lines +- large: 301-1000 lines +- massive: >1000 lines +- auto_mergeable: ONLY if docs/deps/config, no logic changes, trivial size, low 
risk, no outstanding review comments requesting changes +- needs_architecture_review: true if PR adds new modules, changes data models, modifies API contracts, or restructures code +- needs_security_review: true if PR touches auth, crypto, user input handling, SQL/DB queries, secrets, or payment logic +- risk_level: critical if touching auth/payments/data-loss-paths, high if API changes or DB migrations, medium for feature code, low for docs/config/tests +- suggested_review_depth: quick for trivial/low-risk, standard for most, deep for large or high-risk, adversarial for critical risk +- If review comments show unresolved concerns, set review_context to "re_review" and summarize what was requested +- key_files: pick the 3-5 most important files from the diff that deserve the closest review attention""" + + +def build_user_prompt(ctx: dict) -> str: + """Build the user prompt from collected PR context.""" + sections = [] + + # PR metadata + sections.append(f"""## PR Metadata +- Title: {ctx['title']} +- Author: {ctx['author']} +- Base: {ctx['base_branch']} ← Head: {ctx['head_branch']} +- Labels: {', '.join(ctx['labels']) or 'none'} +- Draft: {ctx['draft']} +- Stats: +{ctx['additions']} -{ctx['deletions']} across {ctx['changed_files_count']} files""") + + # PR description + if ctx['body']: + sections.append(f"## PR Description\n{ctx['body'][:1500]}") + + # Changed files + if ctx['changed_files']: + file_list = "\n".join( + f" - {f['name']} ({f['status']}, +{f['additions']}/-{f['deletions']})" + for f in ctx['changed_files'][:30] + ) + sections.append(f"## Changed Files\n{file_list}") + + # Diff excerpt + if ctx['diff_truncated']: + sections.append(f"## Diff (first {MAX_DIFF_CHARS} chars of {ctx['diff_total_chars']} total)\n```diff\n{ctx['diff_truncated']}\n```") + + # Review comments + if ctx['review_comments']: + comments = "\n".join( + f" - @{c['user']} on `{c['path']}`: {c['body'][:300]}" + for c in ctx['review_comments'] + ) + sections.append(f"## Inline Review 
Comments\n{comments}") + + # Conversation + if ctx['conversation']: + convo = "\n".join( + f" - @{c['user']}: {c['body'][:300]}" + for c in ctx['conversation'] + ) + sections.append(f"## PR Conversation\n{convo}") + + # Linked issues + if ctx['linked_issue_details']: + issues = "\n".join( + f" - #{i['number']}: {i['title']} (labels: {', '.join(i['labels']) or 'none'})\n {i['body'][:300]}" + for i in ctx['linked_issue_details'] + ) + sections.append(f"## Linked Issues\n{issues}") + + sections.append("\nClassify this PR. Respond with ONLY the JSON object.") + return "\n\n".join(sections) + + +# --------------------------------------------------------------------------- +# Fallback classifier (if HF API fails or token not set) +# --------------------------------------------------------------------------- +def heuristic_fallback(ctx: dict) -> dict: + """Rule-based fallback triage when HF model is unavailable.""" + total_changes = ctx["additions"] + ctx["deletions"] + + # Size classification + if total_changes <= 10: + size = "trivial" + elif total_changes <= 50: + size = "small" + elif total_changes <= 300: + size = "medium" + elif total_changes <= 1000: + size = "large" + else: + size = "massive" + + # File-based heuristics + file_names = [f["name"].lower() for f in ctx.get("changed_files", [])] + all_files_str = " ".join(file_names) + + is_docs_only = all( + f.endswith((".md", ".txt", ".rst", ".adoc", ".mdx")) + for f in file_names + ) if file_names else False + + is_deps_only = all( + any(dep in f for dep in ["package.json", "requirements", "gemfile", "cargo.toml", "go.sum", "go.mod", "pom.xml", "build.gradle", ".lock", "yarn.lock", "bun.lock"]) + for f in file_names + ) if file_names else False + + is_config_only = all( + any(cfg in f for cfg in [".yml", ".yaml", ".toml", ".ini", ".env", ".config", "dockerfile", ".dockerignore", ".gitignore"]) + for f in file_names + ) if file_names else False + + is_test_only = all( + any(t in f for t in ["test", "spec", 
"__tests__", "_test."]) + for f in file_names + ) if file_names else False + + # PR type + title_lower = (ctx.get("title") or "").lower() + if is_docs_only: + pr_type = "docs" + elif is_deps_only: + pr_type = "dependency" + elif is_config_only: + pr_type = "config" + elif is_test_only: + pr_type = "test" + elif any(w in title_lower for w in ["fix", "bug", "patch", "hotfix"]): + pr_type = "hotfix" if "hotfix" in title_lower else "bugfix" + elif any(w in title_lower for w in ["refactor", "cleanup", "rename"]): + pr_type = "refactor" + else: + pr_type = "feature" + + # Risk areas + risk_areas = [] + security_keywords = ["auth", "jwt", "token", "password", "secret", "crypt", "oauth", "session", "cookie", "cors", "csrf"] + db_keywords = ["migration", "schema", "model", "query", "sql", "database", "prisma", "typeorm", "sequelize", "knex"] + api_keywords = ["route", "endpoint", "controller", "handler", "api", "graphql", "grpc"] + payment_keywords = ["payment", "stripe", "billing", "invoice", "subscription", "charge"] + + if any(k in all_files_str for k in security_keywords): + risk_areas.append("security") + if any(k in all_files_str for k in db_keywords): + risk_areas.append("database") + if any(k in all_files_str for k in api_keywords): + risk_areas.append("api_contract") + if any(k in all_files_str for k in payment_keywords): + risk_areas.append("payments") + + # Risk level + if "payments" in risk_areas or "security" in risk_areas: + risk_level = "critical" if size in ("large", "massive") else "high" + elif "database" in risk_areas or "api_contract" in risk_areas: + risk_level = "high" + elif pr_type in ("docs", "config", "test", "dependency"): + risk_level = "low" + elif size in ("large", "massive"): + risk_level = "high" + else: + risk_level = "medium" + + # Review context + has_review_comments = bool(ctx.get("review_comments")) + has_change_requests = any( + any(w in (c.get("body") or "").lower() for w in ["please", "should", "fix", "change", "update", "wrong", 
"incorrect"]) + for c in ctx.get("review_comments", []) + ) + if ctx.get("draft"): + review_context = "draft" + elif has_change_requests: + review_context = "re_review" + elif has_review_comments: + review_context = "follow_up" + else: + review_context = "fresh" + + # Auto-merge eligibility + auto_mergeable = ( + pr_type in ("docs", "dependency", "config") + and size in ("trivial", "small") + and risk_level == "low" + and not has_change_requests + and not ctx.get("draft") + ) + + # Review depth + if risk_level == "critical": + depth = "adversarial" + elif risk_level == "high" or size in ("large", "massive"): + depth = "deep" + elif risk_level == "low" and size in ("trivial", "small"): + depth = "quick" + else: + depth = "standard" + + # Key files (largest changes first) + key_files = sorted( + ctx.get("changed_files", []), + key=lambda f: f["additions"] + f["deletions"], + reverse=True, + )[:5] + + # Conversation summary + convo_summary = "" + if has_change_requests: + last_review = ctx.get("review_comments", [])[-1] if ctx.get("review_comments") else {} + convo_summary = f"@{last_review.get('user', 'reviewer')} requested changes on {last_review.get('path', 'unknown file')}" + + return { + "pr_type": pr_type, + "size": size, + "risk_level": risk_level, + "risk_areas": risk_areas, + "review_context": review_context, + "conversation_summary": convo_summary, + "needs_architecture_review": pr_type == "feature" and size in ("large", "massive"), + "needs_security_review": bool(set(risk_areas) & {"security", "payments", "auth"}), + "auto_mergeable": auto_mergeable, + "suggested_review_depth": depth, + "key_files": [f["name"] for f in key_files], + "reasoning": f"Heuristic fallback: {pr_type} PR, {size} size, {risk_level} risk. 
Touched: {', '.join(risk_areas) or 'no high-risk areas'}.", + } + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- +def main(): + if not REPO or not PR_NUMBER: + sys.exit("ERROR: GITHUB_REPOSITORY and PR_NUMBER must be set") + + print(f"::group::Gathering PR context for {REPO}#{PR_NUMBER}", file=sys.stderr) + ctx = gather_context() + print(f"::endgroup::", file=sys.stderr) + + # Try HuggingFace model first, fall back to heuristics + if HF_TOKEN: + print(f"::group::Calling {HF_MODEL} for triage", file=sys.stderr) + prompt = build_user_prompt(ctx) + raw_response = call_hf_model(prompt) + print(f"::endgroup::", file=sys.stderr) + + # Parse JSON from model response + try: + # Strip markdown fences if present + cleaned = raw_response.strip() + if cleaned.startswith("```"): + cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned + if cleaned.endswith("```"): + cleaned = cleaned[:-3] + cleaned = cleaned.strip() + triage = json.loads(cleaned) + triage["_source"] = "model" + triage["_model"] = HF_MODEL + except json.JSONDecodeError: + print(f"::warning::Model returned invalid JSON, falling back to heuristics. 
Raw: {raw_response[:500]}", file=sys.stderr) + triage = heuristic_fallback(ctx) + triage["_source"] = "heuristic_fallback" + else: + print("::notice::No HF_TOKEN set, using heuristic triage", file=sys.stderr) + triage = heuristic_fallback(ctx) + triage["_source"] = "heuristic" + + # Inject PR metadata for downstream steps + triage["_pr"] = { + "number": int(PR_NUMBER), + "title": ctx["title"], + "author": ctx["author"], + "base_branch": ctx["base_branch"], + "head_branch": ctx["head_branch"], + "additions": ctx["additions"], + "deletions": ctx["deletions"], + "changed_files_count": ctx["changed_files_count"], + "has_linked_issues": bool(ctx.get("linked_issue_details")), + "has_review_comments": bool(ctx.get("review_comments")), + "has_conversation": bool(ctx.get("conversation")), + } + + # Output + print(json.dumps(triage, indent=2)) + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/.github/workflows/gstack-pr-review.yml b/.github/workflows/gstack-pr-review.yml new file mode 100644 index 00000000..76969c92 --- /dev/null +++ b/.github/workflows/gstack-pr-review.yml @@ -0,0 +1,319 @@ +# ============================================================================= +# gstack PR Review Pipeline +# +# A multi-step AI-powered PR review workflow inspired by gstack's review skills. +# +# Step 1: Triage — HuggingFace Qwen2.5-3B classifies the PR +# Step 2: Compile — Claude Sonnet compiles a tailored review prompt +# Step 3: Review — Claude reviews the PR and produces structured JSON +# Step 4: Route — Deterministic script approves/rejects/comments/merges +# +# Required secrets: +# ANTHROPIC_API_KEY — For Claude Code Action (Steps 2 & 3) +# HF_TOKEN — For HuggingFace Inference API (Step 1) +# +# Optional (GitHub App for PR approval, recommended): +# APP_ID — repository variable, not a secret (read via vars.APP_ID) +# APP_PRIVATE_KEY — secret holding the App's private key +# +# Repository variables (Settings → Variables → Actions): +# AUTO_MERGE_ENABLED — Set to "true" to allow auto-merge.
Default: "false" + # + # Required labels (auto-created by the workflow): + # ai-approved, ai-review-passed, needs-work, needs-human-review, + # security-review-needed, auto-merge-candidate + # ============================================================================= + + name: "gstack PR Review" + + on: + pull_request: + types: [opened, synchronize, ready_for_review] + # Allow re-triggering via @gstack-review comment + issue_comment: + types: [created] + + # Cancel in-progress runs for the same PR when new commits are pushed + concurrency: + group: gstack-review-${{ github.event.pull_request.number || github.event.issue.number }} + cancel-in-progress: true + + permissions: + contents: read + pull-requests: write + issues: write + + jobs: + # =========================================================================== + # Gate: Should we run? + # =========================================================================== + should-review: + runs-on: ubuntu-latest + outputs: + run: ${{ steps.check.outputs.run }} + pr_number: ${{ steps.check.outputs.pr_number }} + steps: + - id: check + env: + # Comment body is untrusted input; pass it via env rather than + # interpolating it into the script (script-injection hardening) + COMMENT_BODY: ${{ github.event.comment.body }} + run: | + # PR events: always run (unless draft) + if [ "${{ github.event_name }}" = "pull_request" ]; then + if [ "${{ github.event.pull_request.draft }}" = "true" ]; then + echo "run=false" >> "$GITHUB_OUTPUT" + echo "::notice::Skipping draft PR" + else + echo "run=true" >> "$GITHUB_OUTPUT" + echo "pr_number=${{ github.event.pull_request.number }}" >> "$GITHUB_OUTPUT" + fi + fi + + # Comment events: only on @gstack-review trigger + if [ "${{ github.event_name }}" = "issue_comment" ]; then + if echo "$COMMENT_BODY" | grep -qi "@gstack-review"; then + # Verify the comment is on a PR, not a plain issue + if [ -n "${{ github.event.issue.pull_request }}" ]; then + echo "run=true" >> "$GITHUB_OUTPUT" + echo "pr_number=${{ github.event.issue.number }}" >> "$GITHUB_OUTPUT" + else + echo "run=false" >> "$GITHUB_OUTPUT" + fi + else + echo "run=false" >>
"$GITHUB_OUTPUT" + fi + fi + + # =========================================================================== + # Step 1: Triage — Classify the PR using Qwen2.5-3B + # =========================================================================== + triage: + needs: should-review + if: needs.should-review.outputs.run == 'true' + runs-on: ubuntu-latest + outputs: + triage_json: ${{ steps.run-triage.outputs.triage }} + pr_type: ${{ steps.parse.outputs.pr_type }} + review_depth: ${{ steps.parse.outputs.review_depth }} + auto_mergeable: ${{ steps.parse.outputs.auto_mergeable }} + risk_level: ${{ steps.parse.outputs.risk_level }} + steps: + - uses: actions/checkout@v4 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: "3.12" + + - name: Run triage classifier + id: run-triage + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + HF_TOKEN: ${{ secrets.HF_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + HF_TRIAGE_MODEL: "Qwen/Qwen2.5-3B-Instruct" + run: | + TRIAGE_OUTPUT=$(python .github/gstack-review/triage.py) + echo "$TRIAGE_OUTPUT" > /tmp/triage.json + echo "triage<<TRIAGE_EOF" >> "$GITHUB_OUTPUT" + echo "$TRIAGE_OUTPUT" >> "$GITHUB_OUTPUT" + echo "TRIAGE_EOF" >> "$GITHUB_OUTPUT" + + # Log for debugging + echo "::group::Triage Result" + echo "$TRIAGE_OUTPUT" | jq .
+ echo "::endgroup::" + + - name: Parse triage outputs + id: parse + run: | + # Read fields from the JSON file written by the previous step; echoing + # the raw expression through the shell breaks on embedded quotes + echo "pr_type=$(jq -r '.pr_type' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "review_depth=$(jq -r '.suggested_review_depth' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "auto_mergeable=$(jq -r '.auto_mergeable' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "risk_level=$(jq -r '.risk_level' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + + - name: Upload triage artifact + uses: actions/upload-artifact@v4 + with: + name: triage-result + path: /tmp/triage.json + retention-days: 30 + + # =========================================================================== + # Step 2: Prompt Compilation — Claude Sonnet tailors the review prompt + # =========================================================================== + compile-prompt: + needs: [should-review, triage] + runs-on: ubuntu-latest + outputs: + compiled_prompt: ${{ steps.compile.outputs.prompt }} + steps: + # Checkout main branch — always use the canonical skill files, + # never the PR branch (which might have modified them) + - uses: actions/checkout@v4 + with: + ref: main + + - name: Verify skill files exist + run: | + if [ ! -f "review/SKILL.md" ]; then + echo "::error::review/SKILL.md not found on main branch" + exit 1 + fi + if [ !
-f "plan-eng-review/SKILL.md" ]; then + echo "::error::plan-eng-review/SKILL.md not found on main branch" + exit 1 + fi + echo "::notice::Skill files found — review/SKILL.md ($(wc -l < review/SKILL.md) lines), plan-eng-review/SKILL.md ($(wc -l < plan-eng-review/SKILL.md) lines)" + + - name: Download triage artifact + uses: actions/download-artifact@v4 + with: + name: triage-result + path: /tmp/ + + - name: Compile review prompt via Claude + id: compile + uses: anthropics/claude-code-action@v1 + with: + anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} + prompt: | + Read your instructions from: .github/gstack-review/compile-instructions.md + + Then read these input files: + + ## Input 1: Review Skill + Read the file at: review/SKILL.md + + ## Input 2: Engineering Review Skill + Read the file at: plan-eng-review/SKILL.md + + ## Input 3: Triage Classification + ```json + ${{ needs.triage.outputs.triage_json }} + ``` + + ## Input 4: Output Schema + Read the file at: .github/gstack-review/review-schema.json + + Now follow the compile-instructions.md to produce the compiled prompt. 
+ model: claude-sonnet-4-20250514 + claude_args: "--output-format=text" + + # =========================================================================== + # Step 3: Deep Review — Claude reviews the actual PR diff + # =========================================================================== + review: + needs: [should-review, triage, compile-prompt] + runs-on: ubuntu-latest + outputs: + review_json: ${{ steps.review.outputs.result }} + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 # full history for diff + + - name: Get PR diff + id: diff + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + run: | + # Fetch the diff + gh pr diff "$PR_NUMBER" > /tmp/pr-diff.txt + + # Get diff stats + DIFF_LINES=$(wc -l < /tmp/pr-diff.txt) + echo "diff_lines=${DIFF_LINES}" >> "$GITHUB_OUTPUT" + + # For very large diffs, truncate to keep within token budget + if [ "$DIFF_LINES" -gt 3000 ]; then + echo "::warning::Large diff (${DIFF_LINES} lines) — truncating to 3000 lines" + head -n 3000 /tmp/pr-diff.txt > /tmp/pr-diff-truncated.txt + echo "[... diff truncated from ${DIFF_LINES} lines to 3000 ...]" >> /tmp/pr-diff-truncated.txt + mv /tmp/pr-diff-truncated.txt /tmp/pr-diff.txt + fi + + - name: Run Claude review + id: review + uses: anthropics/claude-code-action@v1 + with: + anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} + prompt: | + You are performing an automated code review. Follow the system prompt below EXACTLY. + + ## System Instructions + ${{ needs.compile-prompt.outputs.compiled_prompt }} + + ## PR Information + - PR #${{ needs.should-review.outputs.pr_number }} + - Type: ${{ needs.triage.outputs.pr_type }} + - Risk: ${{ needs.triage.outputs.risk_level }} + - Review Depth: ${{ needs.triage.outputs.review_depth }} + + ## Task + 1. Read the PR diff by running: cat /tmp/pr-diff.txt + 2. Look at the actual source files for any findings that need more context + 3. 
Produce your review as a SINGLE JSON object conforming to the schema + + IMPORTANT: Your entire response must be ONLY the JSON object. No markdown fences. + No preamble. No explanation after the JSON. Just the raw JSON. + + Set review_metadata.prompt_version to "v1.0" + Set review_metadata.pr_type to "${{ needs.triage.outputs.pr_type }}" + Set review_metadata.review_depth to "${{ needs.triage.outputs.review_depth }}" + Set review_metadata.triage_source to the _source field from the triage: ${{ needs.triage.outputs.triage_json }} + model: claude-sonnet-4-20250514 + + - name: Save review result + env: + # Pass the model output via env; quoting it inline in the script + # would break as soon as the JSON contains a single quote + RESPONSE: ${{ steps.review.outputs.result }} + run: | + # Extract JSON from Claude's response (strip any accidental markdown) + echo "$RESPONSE" | sed 's/^```json//; s/^```//; s/```$//' | jq . > /tmp/review.json 2>/dev/null || { + echo "::error::Failed to parse review output as JSON" + echo "$RESPONSE" > /tmp/review-raw.txt + exit 1 + } + + echo "::group::Review Result" + jq . /tmp/review.json + echo "::endgroup::" + + - name: Upload review artifact + uses: actions/upload-artifact@v4 + with: + name: review-result + path: /tmp/review.json + retention-days: 90 + + # =========================================================================== + # Step 4: Route Action — Deterministic approve/reject/comment/merge + # =========================================================================== + route: + needs: [should-review, triage, review] + runs-on: ubuntu-latest + # Use a GitHub App token if available (needed for merge approval to count) + # Falls back to GITHUB_TOKEN for comment-only mode + steps: + - uses: actions/checkout@v4 + + - name: Download artifacts + uses: actions/download-artifact@v4 + with: + path: /tmp/artifacts/ + + - name: Generate GitHub App token + id: app-token + if: ${{ vars.APP_ID != '' }} + uses: actions/create-github-app-token@v1 + with: + app-id: ${{ vars.APP_ID }} + private-key: ${{ secrets.APP_PRIVATE_KEY }} + + - name: Route review action + env: +
GITHUB_TOKEN: ${{ steps.app-token.outputs.token || secrets.GITHUB_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + AUTO_MERGE_ENABLED: ${{ vars.AUTO_MERGE_ENABLED || 'false' }} + run: | + chmod +x .github/gstack-review/route-action.sh + .github/gstack-review/route-action.sh \ + /tmp/artifacts/review-result/review.json \ + /tmp/artifacts/triage-result/triage.json \ No newline at end of file diff --git a/.github/workflows/skill-docs.yml b/.github/workflows/skill-docs.yml index ebb6c808..6bfeebd3 100644 --- a/.github/workflows/skill-docs.yml +++ b/.github/workflows/skill-docs.yml @@ -9,7 +9,7 @@ jobs: - run: bun install - name: Check Claude host freshness run: bun run gen:skill-docs - - run: git diff --exit-code || (echo "Generated SKILL.md files are stale. Run: bun run gen:skill-docs" && exit 1) + - run: git diff --exit-code || (echo "Generated SKILL.md files are stale. Run 'bun run gen:skill-docs'" && exit 1) - name: Check Codex host freshness run: bun run gen:skill-docs --host codex - - run: git diff --exit-code -- .agents/ || (echo "Generated Codex SKILL.md files are stale. Run: bun run gen:skill-docs --host codex" && exit 1) + - run: git diff --exit-code -- .agents/ || (echo "Generated Codex SKILL.md files are stale. Run 'bun run gen:skill-docs --host codex'" && exit 1)