diff --git a/.github/gstack-review/compile-instructions.md b/.github/gstack-review/compile-instructions.md new file mode 100644 index 00000000..903a0959 --- /dev/null +++ b/.github/gstack-review/compile-instructions.md @@ -0,0 +1,109 @@ +# Prompt Compilation Instructions + +You are a **prompt compiler**. Your job is to read two gstack skill files and a +triage classification, then produce a single self-contained system prompt for a +headless CI code reviewer. + +## Inputs You Will Receive + +1. **`review/SKILL.md`** — gstack's interactive staff engineer review skill. + Contains the review philosophy, checklists, finding classifications, Greptile + integration, Codex integration, telemetry hooks, and interactive conversation + patterns. + +2. **`plan-eng-review/SKILL.md`** — gstack's engineering review skill. + Contains architecture heuristics, data flow analysis patterns, test review + methodology, failure mode thinking, and engineering principles. + +3. **Triage JSON** — The classification output from Step 1, containing: + `pr_type`, `risk_level`, `risk_areas`, `review_context`, `suggested_review_depth`, + `conversation_summary`, `needs_architecture_review`, `needs_security_review`, + `key_files`, and PR metadata. + +4. **Review output schema** — The JSON schema that the final review must conform to. + +## What to Extract from the Skill Files + +### From `review/SKILL.md`, extract and adapt: +- The **reviewer persona** and mindset (paranoid staff engineer, structural audit) +- The **review checklist categories** (what to look for in each dimension) +- The **finding severity classification** rules (critical, major, minor, nit) +- The **auto-fix vs flag** decision criteria (adapt to: flag everything, fix nothing — this is CI) +- Any **security-specific checks** mentioned (OWASP patterns, auth, injection, etc.) +- The **completeness audit** patterns (forgotten enum handlers, missing consumers, etc.) 
+ +### From `plan-eng-review/SKILL.md`, extract and adapt: +- The **architecture heuristics** (boring by default, two-week smell test, etc.) +- The **data flow tracing** methodology +- The **state machine / state transition** analysis approach +- The **failure mode thinking** (what happens when dependencies are down) +- The **test review criteria** (systems over heroes, coverage philosophy) +- The **engineering principles** (error budgets, glue work awareness, etc.) + +### Ignore / strip out from both files: +- All `bash` preamble blocks (session management, telemetry, update checks) +- All `AskUserQuestion` / interactive conversation patterns +- All Greptile integration logic +- All Codex / OpenAI integration logic +- All `gstack-config` / `gstack-review-log` commands +- All proactive skill suggestion logic +- All references to `~/.gstack/` directories +- All `STOP` / `WAIT` / conversation flow control +- All telemetry event logging +- Browser / screenshot / QA related sections +- Version check / upgrade logic + +## How to Compile the Prompt + +### 1. Set the persona +Based on the triage `suggested_review_depth`: +- **`quick`**: Concise reviewer. Focus on correctness and obvious bugs only. + Skip deep architecture analysis. Use principles from `review/SKILL.md` only. +- **`standard`**: Full 5-dimension review. Use both skill files. +- **`deep`**: Thorough review with edge case analysis. Emphasize failure modes + and data flow tracing from `plan-eng-review/SKILL.md`. +- **`adversarial`**: Everything above plus attacker mindset. Add explicit + instructions to think like a malicious user, a chaos engineer, and a + tired on-call engineer at 3 AM. + +### 2. 
Emphasize relevant dimensions +Use the triage `risk_areas` to weight the review: +- If `security` is in risk_areas → expand the security checklist, add OWASP specifics +- If `database` → emphasize migration safety, query performance, data integrity +- If `api_contract` → focus on breaking changes, versioning, consumer impact +- If `performance` → add N+1 detection, pagination checks, resource leak patterns +- If `breaking_change` → require rollback analysis + +### 3. Handle re-review context +If `review_context` is `re_review` or `follow_up`: +- Include the `conversation_summary` from triage +- Instruct the reviewer to specifically check whether prior feedback was addressed +- Weight completeness dimension higher + +### 4. Scope the file focus +Use the triage `key_files` list to instruct the reviewer which files deserve +the closest attention, while still reviewing the full diff. + +### 5. Include architecture review conditionally +Only include the `plan-eng-review` architecture analysis section if +`needs_architecture_review` is `true` in the triage. + +### 6. Embed the output schema +Include the COMPLETE JSON schema in the compiled prompt so the reviewer +knows exactly what structure to produce. Remind it that: +- Output must be ONLY valid JSON, no markdown fences, no preamble +- Every finding needs file, line, severity, category, title, description +- The `suggested_fix` field should have concrete code when possible +- Scores are integers 0-10 +- Summary is 2-3 sentences, human-readable +- Confidence reflects certainty about the overall verdict + +## Output Format + +Your output must be ONLY the compiled system prompt text. No markdown fences +around it. No explanation. No preamble like "Here is the compiled prompt:". +Just the raw prompt text that will be fed directly to the reviewer model. + +The compiled prompt should be self-contained — it must not reference any +external files, URLs, or tools. Everything the reviewer needs must be +inline in the prompt. 
\ No newline at end of file diff --git a/.github/gstack-review/review-schema.json b/.github/gstack-review/review-schema.json new file mode 100644 index 00000000..56e3b503 --- /dev/null +++ b/.github/gstack-review/review-schema.json @@ -0,0 +1,141 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "title": "GStack PR Review Result", + "description": "Structured output from the AI code review step. This is the contract between Step 3 (Claude review) and Step 4 (action routing).", + "type": "object", + "required": ["verdict", "confidence", "scores", "overall_score", "findings", "summary", "review_metadata"], + "properties": { + "verdict": { + "type": "string", + "enum": ["approve", "request_changes", "comment_only"], + "description": "The review decision. 'approve' means the PR is ready to merge. 'request_changes' means issues must be addressed. 'comment_only' means feedback is provided but merge is not blocked." + }, + "confidence": { + "type": "number", + "minimum": 0, + "maximum": 1, + "description": "Confidence in the verdict. Below 0.7 triggers human-reviewer escalation." + }, + "scores": { + "type": "object", + "required": ["design", "security", "performance", "test_coverage", "completeness"], + "properties": { + "design": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Architecture fit, abstraction quality, readability" + }, + "security": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "OWASP alignment, input validation, auth, secrets" + }, + "performance": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Query efficiency, resource management, scalability" + }, + "test_coverage": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "New code paths tested, edge cases, regression tests" + }, + "completeness": { + "type": "integer", "minimum": 0, "maximum": 10, + "description": "Does the diff match the PR description? Missing pieces?" 
+ } + }, + "additionalProperties": false + }, + "overall_score": { + "type": "number", + "minimum": 0, + "maximum": 10, + "description": "Weighted average. Security and completeness weigh more for high-risk PRs." + }, + "findings": { + "type": "array", + "items": { + "type": "object", + "required": ["severity", "category", "file", "line", "title", "description"], + "properties": { + "severity": { + "type": "string", + "enum": ["critical", "major", "minor", "nit"] + }, + "category": { + "type": "string", + "enum": ["design", "security", "performance", "test_coverage", "completeness", "correctness", "reliability"] + }, + "file": { + "type": "string", + "description": "Relative file path from repo root" + }, + "line": { + "type": "integer", + "minimum": 1, + "description": "Line number in the file (use the new file line numbers from the diff)" + }, + "title": { + "type": "string", + "maxLength": 120, + "description": "One-line summary of the finding" + }, + "description": { + "type": "string", + "maxLength": 1000, + "description": "Detailed explanation of the issue and its impact" + }, + "suggested_fix": { + "type": "string", + "description": "Concrete code suggestion or fix approach. Optional but strongly encouraged." 
+ } + }, + "additionalProperties": false + } + }, + "summary": { + "type": "string", + "maxLength": 500, + "description": "2-3 sentence human-readable summary of the review outcome" + }, + "review_metadata": { + "type": "object", + "required": ["pr_type", "review_depth", "files_reviewed", "model_used", "prompt_version"], + "properties": { + "pr_type": { + "type": "string", + "description": "PR type from triage step" + }, + "review_depth": { + "type": "string", + "enum": ["quick", "standard", "deep", "adversarial"], + "description": "Review depth from triage step" + }, + "files_reviewed": { + "type": "integer", + "description": "Number of files included in the review" + }, + "model_used": { + "type": "string", + "description": "Claude model used for the review" + }, + "prompt_version": { + "type": "string", + "description": "Version of the prompt template used" + }, + "triage_model": { + "type": "string", + "description": "Model used for the triage classification step" + }, + "triage_source": { + "type": "string", + "enum": ["model", "heuristic", "heuristic_fallback"], + "description": "Whether triage used the HF model or fell back to heuristics" + }, + "duration_seconds": { + "type": "number", + "description": "Total review duration in seconds" + } + }, + "additionalProperties": false + } + }, + "additionalProperties": false +} \ No newline at end of file diff --git a/.github/gstack-review/route-action.sh b/.github/gstack-review/route-action.sh new file mode 100644 index 00000000..5161c543 --- /dev/null +++ b/.github/gstack-review/route-action.sh @@ -0,0 +1,267 @@ +#!/usr/bin/env bash +set -euo pipefail + +# --------------------------------------------------------------------------- +# Step 4 — Action Routing +# +# Reads the structured review JSON from Step 3 and takes action: +# - Approve / Request Changes / Comment +# - Label the PR +# - Optionally merge (for auto-mergeable low-risk PRs) +# - Post inline comments for findings +# 
--------------------------------------------------------------------------- + +REVIEW_JSON="${1:?Usage: route-action.sh }" +TRIAGE_JSON="${2:?Usage: route-action.sh }" + +REPO="${GITHUB_REPOSITORY:?GITHUB_REPOSITORY not set}" +PR_NUMBER="${PR_NUMBER:?PR_NUMBER not set}" +AUTO_MERGE_ENABLED="${AUTO_MERGE_ENABLED:-false}" + +# Parse review result +verdict=$(jq -r '.verdict' "$REVIEW_JSON") +confidence=$(jq -r '.confidence' "$REVIEW_JSON") +overall_score=$(jq -r '.overall_score' "$REVIEW_JSON") +summary=$(jq -r '.summary' "$REVIEW_JSON") +critical_count=$(jq '[.findings[] | select(.severity == "critical")] | length' "$REVIEW_JSON") +major_count=$(jq '[.findings[] | select(.severity == "major")] | length' "$REVIEW_JSON") +minor_count=$(jq '[.findings[] | select(.severity == "minor")] | length' "$REVIEW_JSON") +nit_count=$(jq '[.findings[] | select(.severity == "nit")] | length' "$REVIEW_JSON") +has_security_critical=$(jq '[.findings[] | select(.severity == "critical" and .category == "security")] | length' "$REVIEW_JSON") + +# Parse triage +pr_type=$(jq -r '.pr_type' "$TRIAGE_JSON") +auto_mergeable=$(jq -r '.auto_mergeable' "$TRIAGE_JSON") +risk_level=$(jq -r '.risk_level' "$TRIAGE_JSON") +review_depth=$(jq -r '.suggested_review_depth' "$TRIAGE_JSON") + +# Score table for the comment +design_score=$(jq -r '.scores.design' "$REVIEW_JSON") +security_score=$(jq -r '.scores.security' "$REVIEW_JSON") +performance_score=$(jq -r '.scores.performance' "$REVIEW_JSON") +test_score=$(jq -r '.scores.test_coverage' "$REVIEW_JSON") +completeness_score=$(jq -r '.scores.completeness' "$REVIEW_JSON") +review_model=$(jq -r '.review_metadata.model_used' "$REVIEW_JSON") +triage_source=$(jq -r '.review_metadata.triage_source // "unknown"' "$REVIEW_JSON") + +# --------------------------------------------------------------------------- +# Build the review comment body +# --------------------------------------------------------------------------- +build_comment() { + cat < +Review metadata + 
+- Model: ${review_model} +- Triage: ${triage_source} +- Prompt version: $(jq -r '.review_metadata.prompt_version' "$REVIEW_JSON") + + + +--- +*Automated review by [gstack-pr-pipeline](https://github.com/garrytan/gstack) • Scores are AI-generated and should be verified by a human reviewer* +EOF +} + +# --------------------------------------------------------------------------- +# Post inline review comments for findings +# --------------------------------------------------------------------------- +post_inline_comments() { + local event="COMMENT" + local comments_json="[]" + + # Build review comments array for findings that have file and line + comments_json=$(jq -c '[ + .findings[] + | select(.file != "" and .line > 0) + | { + path: .file, + line: .line, + body: ("**[\(.severity | ascii_upcase)]** \(.title)\n\n\(.description)" + + (if .suggested_fix then "\n\n💡 **Suggested fix:**\n```\n\(.suggested_fix)\n```" else "" end)) + } + ]' "$REVIEW_JSON") + + local num_comments + num_comments=$(echo "$comments_json" | jq 'length') + + if [ "$num_comments" -eq 0 ]; then + echo "::notice::No inline comments to post" + return + fi + + echo "::notice::Posting ${num_comments} inline review comments" + + # Determine review event type based on verdict + case "$verdict" in + approve) event="APPROVE" ;; + request_changes) event="REQUEST_CHANGES" ;; + *) event="COMMENT" ;; + esac + + # Post as a pull request review with inline comments + local review_body + review_body=$(build_comment) + + local payload + payload=$(jq -n \ + --arg body "$review_body" \ + --arg event "$event" \ + --argjson comments "$comments_json" \ + --arg commit "$(gh pr view "$PR_NUMBER" --repo "$REPO" --json headRefOid -q '.headRefOid')" \ + '{ + body: $body, + event: $event, + commit_id: $commit, + comments: $comments + }') + + gh api \ + --method POST \ + "/repos/${REPO}/pulls/${PR_NUMBER}/reviews" \ + --input - <<< "$payload" +} + +# --------------------------------------------------------------------------- 
+# Label management +# --------------------------------------------------------------------------- +add_label() { + local label="$1" + gh pr edit "$PR_NUMBER" --repo "$REPO" --add-label "$label" 2>/dev/null || \ + echo "::warning::Could not add label '${label}' — it may not exist. Create it in repo settings." +} + +remove_label() { + local label="$1" + gh pr edit "$PR_NUMBER" --repo "$REPO" --remove-label "$label" 2>/dev/null || true +} + +ensure_labels_exist() { + local labels=("ai-approved" "ai-review-passed" "needs-work" "needs-human-review" "security-review-needed" "auto-merge-candidate") + for label in "${labels[@]}"; do + gh label create "$label" --repo "$REPO" --force --description "Auto-managed by gstack-pr-pipeline" 2>/dev/null || true + done +} + +# --------------------------------------------------------------------------- +# Decision logic +# --------------------------------------------------------------------------- +echo "::group::Review Decision" +echo "Verdict: ${verdict}" +echo "Confidence: ${confidence}" +echo "Overall Score: ${overall_score}" +echo "Critical: ${critical_count}, Major: ${major_count}" +echo "PR Type: ${pr_type}, Risk: ${risk_level}" +echo "Auto-mergeable (triage): ${auto_mergeable}" +echo "Auto-merge enabled (repo): ${AUTO_MERGE_ENABLED}" +echo "::endgroup::" + +# Ensure labels exist +ensure_labels_exist + +# Clean up any stale labels from previous runs +remove_label "ai-approved" +remove_label "ai-review-passed" +remove_label "needs-work" +remove_label "needs-human-review" +remove_label "security-review-needed" +remove_label "auto-merge-candidate" + +# --- Route 1: Security escalation (always, regardless of verdict) --- +if [ "$has_security_critical" -gt 0 ]; then + echo "::warning::Critical security finding detected — escalating" + add_label "security-review-needed" + # Don't auto-merge, even if everything else looks fine + auto_mergeable="false" +fi + +# --- Route 2: Low confidence — escalate to human --- 
+confidence_threshold=$(echo "$confidence" | awk '{print ($1 < 0.7) ? "low" : "ok"}') +if [ "$confidence_threshold" = "low" ]; then + echo "::notice::Low confidence (${confidence}) — posting comment only, requesting human review" + post_inline_comments + add_label "needs-human-review" + echo "action=comment_only" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 3: Approve + auto-merge (highest confidence, lowest risk) --- +if [ "$verdict" = "approve" ] && \ + [ "$critical_count" -eq 0 ] && \ + [ "$major_count" -eq 0 ] && \ + [ "$auto_mergeable" = "true" ] && \ + [ "$(echo "$overall_score >= 9" | bc -l)" -eq 1 ]; then + echo "::notice::Auto-merge eligible: score ${overall_score}, no critical/major findings, triage approved" + post_inline_comments + add_label "ai-approved" + + # Only actually merge if the repo-level toggle is enabled + if [ "$AUTO_MERGE_ENABLED" = "true" ]; then + add_label "auto-merge-candidate" + # Enable auto-merge (squash) — GitHub will merge once all other checks pass + gh pr merge "$PR_NUMBER" --repo "$REPO" --squash --auto 2>/dev/null && \ + echo "::notice::Auto-merge enabled for PR #${PR_NUMBER}" || \ + echo "::warning::Could not enable auto-merge — check branch protection settings" + echo "action=auto_merge" >> "$GITHUB_OUTPUT" + else + echo "::notice::Auto-merge is disabled (AUTO_MERGE_ENABLED=$AUTO_MERGE_ENABLED). PR approved but merge is manual." 
+ add_label "ai-review-passed" + echo "action=approve" >> "$GITHUB_OUTPUT" + fi + exit 0 +fi + +# --- Route 4: Approve (good score, no blockers, but not auto-merge eligible) --- +if [ "$verdict" = "approve" ] && \ + [ "$critical_count" -eq 0 ] && \ + [ "$(echo "$overall_score >= 7" | bc -l)" -eq 1 ]; then + echo "::notice::Approved: score ${overall_score}, no critical findings" + post_inline_comments + add_label "ai-review-passed" + echo "action=approve" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 5: Comment only (moderate issues, non-blocking) --- +if [ "$verdict" = "comment_only" ] || \ + ([ "$critical_count" -eq 0 ] && [ "$major_count" -le 2 ]); then + echo "::notice::Comment-only review: flagging ${major_count} major, ${minor_count} minor findings" + post_inline_comments + add_label "needs-human-review" + echo "action=comment_only" >> "$GITHUB_OUTPUT" + exit 0 +fi + +# --- Route 6: Request changes (default for anything with critical/major findings) --- +echo "::notice::Requesting changes: ${critical_count} critical, ${major_count} major findings" +post_inline_comments +add_label "needs-work" +echo "action=request_changes" >> "$GITHUB_OUTPUT" \ No newline at end of file diff --git a/.github/gstack-review/triage.py b/.github/gstack-review/triage.py new file mode 100644 index 00000000..397b9b25 --- /dev/null +++ b/.github/gstack-review/triage.py @@ -0,0 +1,498 @@ +#!/usr/bin/env python3 +""" +Step 1 — PR Triage via HuggingFace Inference API (Qwen2.5-3B-Instruct) + +Classifies a PR by type, risk, and review depth needed. +Inputs: PR metadata, diff, review comments, conversation, linked issues. +Output: triage JSON to stdout. 
+""" + +import json +import os +import sys +import urllib.request +import urllib.error + +# --------------------------------------------------------------------------- +# Config +# --------------------------------------------------------------------------- +HF_MODEL = os.getenv("HF_TRIAGE_MODEL", "Qwen/Qwen2.5-3B-Instruct") +HF_TOKEN = os.getenv("HF_TOKEN", "") +GH_TOKEN = os.getenv("GITHUB_TOKEN", "") +REPO = os.getenv("GITHUB_REPOSITORY", "") # owner/repo +PR_NUMBER = os.getenv("PR_NUMBER", "") +MAX_DIFF_CHARS = 12_000 # keep diff under token budget for a 3B model +MAX_COMMENT_CHARS = 4_000 +MAX_ISSUE_CHARS = 2_000 + +# --------------------------------------------------------------------------- +# GitHub API helpers +# --------------------------------------------------------------------------- +def gh_api(path: str) -> dict | list | str: + """GET from GitHub REST API v3.""" + url = f"https://api.github.com{path}" + req = urllib.request.Request(url, headers={ + "Authorization": f"Bearer {GH_TOKEN}", + "Accept": "application/vnd.github.v3+json", + "X-GitHub-Api-Version": "2022-11-28", + }) + try: + with urllib.request.urlopen(req, timeout=30) as resp: + return json.loads(resp.read().decode()) + except urllib.error.HTTPError as e: + print(f"::warning::GitHub API error for {path}: {e.code}", file=sys.stderr) + return {} if "pulls" in path else [] + + +def gh_api_raw(path: str) -> str: + """GET raw diff from GitHub API.""" + url = f"https://api.github.com{path}" + req = urllib.request.Request(url, headers={ + "Authorization": f"Bearer {GH_TOKEN}", + "Accept": "application/vnd.github.v3.diff", + "X-GitHub-Api-Version": "2022-11-28", + }) + try: + with urllib.request.urlopen(req, timeout=60) as resp: + return resp.read().decode(errors="replace") + except urllib.error.HTTPError as e: + print(f"::warning::GitHub API diff error: {e.code}", file=sys.stderr) + return "" + + +# --------------------------------------------------------------------------- +# Gather PR context +# 
--------------------------------------------------------------------------- +def gather_context() -> dict: + """Collect everything the triage model needs.""" + pr = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}") + if not pr: + sys.exit("ERROR: could not fetch PR metadata") + + # Basic metadata + ctx = { + "title": pr.get("title", ""), + "body": (pr.get("body") or "")[:2000], + "author": pr.get("user", {}).get("login", "unknown"), + "base_branch": pr.get("base", {}).get("ref", "main"), + "head_branch": pr.get("head", {}).get("ref", ""), + "labels": [l["name"] for l in pr.get("labels", [])], + "draft": pr.get("draft", False), + "additions": pr.get("additions", 0), + "deletions": pr.get("deletions", 0), + "changed_files_count": pr.get("changed_files", 0), + } + + # Changed files list + files = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}/files") + ctx["changed_files"] = [ + {"name": f["filename"], "status": f["status"], "additions": f["additions"], "deletions": f["deletions"]} + for f in (files if isinstance(files, list) else []) + ][:50] # cap at 50 files + + # Diff (truncated) + diff = gh_api_raw(f"/repos/{REPO}/pulls/{PR_NUMBER}") + ctx["diff_truncated"] = diff[:MAX_DIFF_CHARS] + ctx["diff_total_chars"] = len(diff) + + # Review comments (inline review threads) + review_comments = gh_api(f"/repos/{REPO}/pulls/{PR_NUMBER}/comments") + if isinstance(review_comments, list): + ctx["review_comments"] = [ + {"user": c.get("user", {}).get("login", ""), "body": (c.get("body") or "")[:500], "path": c.get("path", "")} + for c in review_comments + ][:20] + else: + ctx["review_comments"] = [] + + # Issue/PR conversation comments + issue_comments = gh_api(f"/repos/{REPO}/issues/{PR_NUMBER}/comments") + if isinstance(issue_comments, list): + ctx["conversation"] = [ + {"user": c.get("user", {}).get("login", ""), "body": (c.get("body") or "")[:500]} + for c in issue_comments + ][:20] + else: + ctx["conversation"] = [] + + # Linked issues (parse from PR body — GitHub doesn't have a direct 
API for this) + ctx["linked_issues"] = extract_linked_issues(ctx["body"]) + + # Fetch linked issue details + linked_issue_details = [] + for issue_num in ctx["linked_issues"][:5]: # cap at 5 + issue = gh_api(f"/repos/{REPO}/issues/{issue_num}") + if isinstance(issue, dict) and issue.get("title"): + linked_issue_details.append({ + "number": issue_num, + "title": issue.get("title", ""), + "body": (issue.get("body") or "")[:500], + "labels": [l["name"] for l in issue.get("labels", [])], + }) + ctx["linked_issue_details"] = linked_issue_details + + return ctx + + +def extract_linked_issues(body: str) -> list[int]: + """Extract issue numbers from common linking patterns in PR body.""" + import re + patterns = [ + r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", + r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+https?://github\.com/[^/]+/[^/]+/issues/(\d+)", + r"#(\d+)", # generic issue references + ] + issues = [] + for pattern in patterns: + for match in re.finditer(pattern, body, re.IGNORECASE): + num = int(match.group(1)) + if num not in issues and num != int(PR_NUMBER): + issues.append(num) + return issues[:10] + + +# --------------------------------------------------------------------------- +# HuggingFace Inference API call +# --------------------------------------------------------------------------- +def call_hf_model(prompt: str) -> str: + """Call HuggingFace Inference API with the triage prompt.""" + url = f"https://router.huggingface.co/novita/v3/openai/chat/completions" + + payload = json.dumps({ + "model": HF_MODEL, + "messages": [ + {"role": "system", "content": SYSTEM_PROMPT}, + {"role": "user", "content": prompt}, + ], + "max_tokens": 1024, + "temperature": 0.1, # near-deterministic for classification + }).encode() + + headers = { + "Content-Type": "application/json", + } + if HF_TOKEN: + headers["Authorization"] = f"Bearer {HF_TOKEN}" + + req = urllib.request.Request(url, data=payload, headers=headers, method="POST") + + try: + with 
urllib.request.urlopen(req, timeout=60) as resp: + result = json.loads(resp.read().decode()) + return result["choices"][0]["message"]["content"] + except urllib.error.HTTPError as e: + body = e.read().decode(errors="replace") + print(f"::error::HuggingFace API error {e.code}: {body}", file=sys.stderr) + sys.exit(1) + except Exception as e: + print(f"::error::HuggingFace API call failed: {e}", file=sys.stderr) + sys.exit(1) + + +# --------------------------------------------------------------------------- +# Prompts +# --------------------------------------------------------------------------- +SYSTEM_PROMPT = """You are a PR triage classifier. You analyze pull request metadata, diffs, +review comments, conversations, and linked issues to produce a structured classification. + +You MUST respond with ONLY valid JSON — no markdown, no explanation, no preamble. + +JSON schema: +{ + "pr_type": "feature" | "bugfix" | "refactor" | "dependency" | "docs" | "config" | "test" | "hotfix", + "size": "trivial" | "small" | "medium" | "large" | "massive", + "risk_level": "low" | "medium" | "high" | "critical", + "risk_areas": ["security", "database", "api_contract", "auth", "payments", "data_loss", "performance", "breaking_change"], + "review_context": "fresh" | "re_review" | "follow_up" | "draft", + "conversation_summary": "One sentence summarizing review conversation so far, or empty string if none", + "needs_architecture_review": true | false, + "needs_security_review": true | false, + "auto_mergeable": true | false, + "suggested_review_depth": "quick" | "standard" | "deep" | "adversarial", + "key_files": ["list of most important changed files to focus review on"], + "reasoning": "Brief explanation of classification decisions" +} + +Classification rules: +- trivial: <=10 lines, docs/config only +- small: <=50 lines changed +- medium: 51-300 lines +- large: 301-1000 lines +- massive: >1000 lines +- auto_mergeable: ONLY if docs/deps/config, no logic changes, trivial size, low 
risk, no outstanding review comments requesting changes +- needs_architecture_review: true if PR adds new modules, changes data models, modifies API contracts, or restructures code +- needs_security_review: true if PR touches auth, crypto, user input handling, SQL/DB queries, secrets, or payment logic +- risk_level: critical if touching auth/payments/data-loss-paths, high if API changes or DB migrations, medium for feature code, low for docs/config/tests +- suggested_review_depth: quick for trivial/low-risk, standard for most, deep for large or high-risk, adversarial for critical risk +- If review comments show unresolved concerns, set review_context to "re_review" and summarize what was requested +- key_files: pick the 3-5 most important files from the diff that deserve the closest review attention""" + + +def build_user_prompt(ctx: dict) -> str: + """Build the user prompt from collected PR context.""" + sections = [] + + # PR metadata + sections.append(f"""## PR Metadata +- Title: {ctx['title']} +- Author: {ctx['author']} +- Base: {ctx['base_branch']} ← Head: {ctx['head_branch']} +- Labels: {', '.join(ctx['labels']) or 'none'} +- Draft: {ctx['draft']} +- Stats: +{ctx['additions']} -{ctx['deletions']} across {ctx['changed_files_count']} files""") + + # PR description + if ctx['body']: + sections.append(f"## PR Description\n{ctx['body'][:1500]}") + + # Changed files + if ctx['changed_files']: + file_list = "\n".join( + f" - {f['name']} ({f['status']}, +{f['additions']}/-{f['deletions']})" + for f in ctx['changed_files'][:30] + ) + sections.append(f"## Changed Files\n{file_list}") + + # Diff excerpt + if ctx['diff_truncated']: + sections.append(f"## Diff (first {MAX_DIFF_CHARS} chars of {ctx['diff_total_chars']} total)\n```diff\n{ctx['diff_truncated']}\n```") + + # Review comments + if ctx['review_comments']: + comments = "\n".join( + f" - @{c['user']} on `{c['path']}`: {c['body'][:300]}" + for c in ctx['review_comments'] + ) + sections.append(f"## Inline Review 
Comments\n{comments}") + + # Conversation + if ctx['conversation']: + convo = "\n".join( + f" - @{c['user']}: {c['body'][:300]}" + for c in ctx['conversation'] + ) + sections.append(f"## PR Conversation\n{convo}") + + # Linked issues + if ctx['linked_issue_details']: + issues = "\n".join( + f" - #{i['number']}: {i['title']} (labels: {', '.join(i['labels']) or 'none'})\n {i['body'][:300]}" + for i in ctx['linked_issue_details'] + ) + sections.append(f"## Linked Issues\n{issues}") + + sections.append("\nClassify this PR. Respond with ONLY the JSON object.") + return "\n\n".join(sections) + + +# --------------------------------------------------------------------------- +# Fallback classifier (if HF API fails or token not set) +# --------------------------------------------------------------------------- +def heuristic_fallback(ctx: dict) -> dict: + """Rule-based fallback triage when HF model is unavailable.""" + total_changes = ctx["additions"] + ctx["deletions"] + + # Size classification + if total_changes <= 10: + size = "trivial" + elif total_changes <= 50: + size = "small" + elif total_changes <= 300: + size = "medium" + elif total_changes <= 1000: + size = "large" + else: + size = "massive" + + # File-based heuristics + file_names = [f["name"].lower() for f in ctx.get("changed_files", [])] + all_files_str = " ".join(file_names) + + is_docs_only = all( + f.endswith((".md", ".txt", ".rst", ".adoc", ".mdx")) + for f in file_names + ) if file_names else False + + is_deps_only = all( + any(dep in f for dep in ["package.json", "requirements", "gemfile", "cargo.toml", "go.sum", "go.mod", "pom.xml", "build.gradle", ".lock", "yarn.lock", "bun.lock"]) + for f in file_names + ) if file_names else False + + is_config_only = all( + any(cfg in f for cfg in [".yml", ".yaml", ".toml", ".ini", ".env", ".config", "dockerfile", ".dockerignore", ".gitignore"]) + for f in file_names + ) if file_names else False + + is_test_only = all( + any(t in f for t in ["test", "spec", 
"__tests__", "_test."]) + for f in file_names + ) if file_names else False + + # PR type + title_lower = (ctx.get("title") or "").lower() + if is_docs_only: + pr_type = "docs" + elif is_deps_only: + pr_type = "dependency" + elif is_config_only: + pr_type = "config" + elif is_test_only: + pr_type = "test" + elif any(w in title_lower for w in ["fix", "bug", "patch", "hotfix"]): + pr_type = "hotfix" if "hotfix" in title_lower else "bugfix" + elif any(w in title_lower for w in ["refactor", "cleanup", "rename"]): + pr_type = "refactor" + else: + pr_type = "feature" + + # Risk areas + risk_areas = [] + security_keywords = ["auth", "jwt", "token", "password", "secret", "crypt", "oauth", "session", "cookie", "cors", "csrf"] + db_keywords = ["migration", "schema", "model", "query", "sql", "database", "prisma", "typeorm", "sequelize", "knex"] + api_keywords = ["route", "endpoint", "controller", "handler", "api", "graphql", "grpc"] + payment_keywords = ["payment", "stripe", "billing", "invoice", "subscription", "charge"] + + if any(k in all_files_str for k in security_keywords): + risk_areas.append("security") + if any(k in all_files_str for k in db_keywords): + risk_areas.append("database") + if any(k in all_files_str for k in api_keywords): + risk_areas.append("api_contract") + if any(k in all_files_str for k in payment_keywords): + risk_areas.append("payments") + + # Risk level + if "payments" in risk_areas or "security" in risk_areas: + risk_level = "critical" if size in ("large", "massive") else "high" + elif "database" in risk_areas or "api_contract" in risk_areas: + risk_level = "high" + elif pr_type in ("docs", "config", "test", "dependency"): + risk_level = "low" + elif size in ("large", "massive"): + risk_level = "high" + else: + risk_level = "medium" + + # Review context + has_review_comments = bool(ctx.get("review_comments")) + has_change_requests = any( + any(w in (c.get("body") or "").lower() for w in ["please", "should", "fix", "change", "update", "wrong", 
"incorrect"]) + for c in ctx.get("review_comments", []) + ) + if ctx.get("draft"): + review_context = "draft" + elif has_change_requests: + review_context = "re_review" + elif has_review_comments: + review_context = "follow_up" + else: + review_context = "fresh" + + # Auto-merge eligibility + auto_mergeable = ( + pr_type in ("docs", "dependency", "config") + and size in ("trivial", "small") + and risk_level == "low" + and not has_change_requests + and not ctx.get("draft") + ) + + # Review depth + if risk_level == "critical": + depth = "adversarial" + elif risk_level == "high" or size in ("large", "massive"): + depth = "deep" + elif risk_level == "low" and size in ("trivial", "small"): + depth = "quick" + else: + depth = "standard" + + # Key files (largest changes first) + key_files = sorted( + ctx.get("changed_files", []), + key=lambda f: f["additions"] + f["deletions"], + reverse=True, + )[:5] + + # Conversation summary + convo_summary = "" + if has_change_requests: + last_review = ctx.get("review_comments", [])[-1] if ctx.get("review_comments") else {} + convo_summary = f"@{last_review.get('user', 'reviewer')} requested changes on {last_review.get('path', 'unknown file')}" + + return { + "pr_type": pr_type, + "size": size, + "risk_level": risk_level, + "risk_areas": risk_areas, + "review_context": review_context, + "conversation_summary": convo_summary, + "needs_architecture_review": pr_type == "feature" and size in ("large", "massive"), + "needs_security_review": bool(set(risk_areas) & {"security", "payments", "auth"}), + "auto_mergeable": auto_mergeable, + "suggested_review_depth": depth, + "key_files": [f["name"] for f in key_files], + "reasoning": f"Heuristic fallback: {pr_type} PR, {size} size, {risk_level} risk. 
Touched: {', '.join(risk_areas) or 'no high-risk areas'}.", + } + + +# --------------------------------------------------------------------------- +# Main +# --------------------------------------------------------------------------- +def main(): + if not REPO or not PR_NUMBER: + sys.exit("ERROR: GITHUB_REPOSITORY and PR_NUMBER must be set") + + print(f"::group::Gathering PR context for {REPO}#{PR_NUMBER}", file=sys.stderr) + ctx = gather_context() + print(f"::endgroup::", file=sys.stderr) + + # Try HuggingFace model first, fall back to heuristics + if HF_TOKEN: + print(f"::group::Calling {HF_MODEL} for triage", file=sys.stderr) + prompt = build_user_prompt(ctx) + raw_response = call_hf_model(prompt) + print(f"::endgroup::", file=sys.stderr) + + # Parse JSON from model response + try: + # Strip markdown fences if present + cleaned = raw_response.strip() + if cleaned.startswith("```"): + cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else cleaned + if cleaned.endswith("```"): + cleaned = cleaned[:-3] + cleaned = cleaned.strip() + triage = json.loads(cleaned) + triage["_source"] = "model" + triage["_model"] = HF_MODEL + except json.JSONDecodeError: + print(f"::warning::Model returned invalid JSON, falling back to heuristics. 
Raw: {raw_response[:500]}", file=sys.stderr) + triage = heuristic_fallback(ctx) + triage["_source"] = "heuristic_fallback" + else: + print("::notice::No HF_TOKEN set, using heuristic triage", file=sys.stderr) + triage = heuristic_fallback(ctx) + triage["_source"] = "heuristic" + + # Inject PR metadata for downstream steps + triage["_pr"] = { + "number": int(PR_NUMBER), + "title": ctx["title"], + "author": ctx["author"], + "base_branch": ctx["base_branch"], + "head_branch": ctx["head_branch"], + "additions": ctx["additions"], + "deletions": ctx["deletions"], + "changed_files_count": ctx["changed_files_count"], + "has_linked_issues": bool(ctx.get("linked_issue_details")), + "has_review_comments": bool(ctx.get("review_comments")), + "has_conversation": bool(ctx.get("conversation")), + } + + # Output + print(json.dumps(triage, indent=2)) + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/.github/workflows/gstack-pr-review.yml b/.github/workflows/gstack-pr-review.yml new file mode 100644 index 00000000..76969c92 --- /dev/null +++ b/.github/workflows/gstack-pr-review.yml @@ -0,0 +1,319 @@ +# ============================================================================= +# gstack PR Review Pipeline +# +# A multi-step AI-powered PR review workflow inspired by gstack's review skills. +# +# Step 1: Triage — HuggingFace Qwen2.5-3B classifies the PR +# Step 2: Compile — Claude Sonnet compiles a tailored review prompt +# Step 3: Review — Claude reviews the PR and produces structured JSON +# Step 4: Route — Deterministic script approves/rejects/comments/merges +# +# Required secrets: +# ANTHROPIC_API_KEY — For Claude Code Action (Steps 2 & 3) +# HF_TOKEN — For HuggingFace Inference API (Step 1) +# +# Optional (GitHub App for PR approval, recommended): +# APP_ID — repository variable, not a secret (read via vars.APP_ID) +# APP_PRIVATE_KEY — secret holding the App's private key +# +# Repository variables (Settings → Variables → Actions): +# AUTO_MERGE_ENABLED — Set to "true" to allow auto-merge.
Default: "false" + # + # Required labels (auto-created by the workflow): + # ai-approved, ai-review-passed, needs-work, needs-human-review, + # security-review-needed, auto-merge-candidate + # ============================================================================= + + name: "gstack PR Review" + + on: + pull_request: + types: [opened, synchronize, ready_for_review] + # Allow re-triggering via @gstack-review comment + issue_comment: + types: [created] + + # Cancel in-progress runs for the same PR when new commits are pushed + concurrency: + group: gstack-review-${{ github.event.pull_request.number || github.event.issue.number }} + cancel-in-progress: true + + permissions: + contents: read + pull-requests: write + issues: write + + jobs: + # =========================================================================== + # Gate: Should we run? + # =========================================================================== + should-review: + runs-on: ubuntu-latest + outputs: + run: ${{ steps.check.outputs.run }} + pr_number: ${{ steps.check.outputs.pr_number }} + steps: + - id: check + env: + # Comment body is untrusted input; pass it via env rather than + # interpolating it into the script (script-injection hardening) + COMMENT_BODY: ${{ github.event.comment.body }} + run: | + # PR events: always run (unless draft) + if [ "${{ github.event_name }}" = "pull_request" ]; then + if [ "${{ github.event.pull_request.draft }}" = "true" ]; then + echo "run=false" >> "$GITHUB_OUTPUT" + echo "::notice::Skipping draft PR" + else + echo "run=true" >> "$GITHUB_OUTPUT" + echo "pr_number=${{ github.event.pull_request.number }}" >> "$GITHUB_OUTPUT" + fi + fi + + # Comment events: only on @gstack-review trigger + if [ "${{ github.event_name }}" = "issue_comment" ]; then + if echo "$COMMENT_BODY" | grep -qi "@gstack-review"; then + # Verify the comment is on a PR, not a plain issue + if [ -n "${{ github.event.issue.pull_request }}" ]; then + echo "run=true" >> "$GITHUB_OUTPUT" + echo "pr_number=${{ github.event.issue.number }}" >> "$GITHUB_OUTPUT" + else + echo "run=false" >> "$GITHUB_OUTPUT" + fi + else + echo "run=false" >>
"$GITHUB_OUTPUT" + fi + fi + + # =========================================================================== + # Step 1: Triage — Classify the PR using Qwen2.5-3B + # =========================================================================== + triage: + needs: should-review + if: needs.should-review.outputs.run == 'true' + runs-on: ubuntu-latest + outputs: + triage_json: ${{ steps.run-triage.outputs.triage }} + pr_type: ${{ steps.parse.outputs.pr_type }} + review_depth: ${{ steps.parse.outputs.review_depth }} + auto_mergeable: ${{ steps.parse.outputs.auto_mergeable }} + risk_level: ${{ steps.parse.outputs.risk_level }} + steps: + - uses: actions/checkout@v4 + + - name: Set up Python + uses: actions/setup-python@v5 + with: + python-version: "3.12" + + - name: Run triage classifier + id: run-triage + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + HF_TOKEN: ${{ secrets.HF_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + HF_TRIAGE_MODEL: "Qwen/Qwen2.5-3B-Instruct" + run: | + TRIAGE_OUTPUT=$(python .github/gstack-review/triage.py) + echo "$TRIAGE_OUTPUT" > /tmp/triage.json + echo "triage<<TRIAGE_EOF" >> "$GITHUB_OUTPUT" + echo "$TRIAGE_OUTPUT" >> "$GITHUB_OUTPUT" + echo "TRIAGE_EOF" >> "$GITHUB_OUTPUT" + + # Log for debugging + echo "::group::Triage Result" + echo "$TRIAGE_OUTPUT" | jq .
+ echo "::endgroup::" + + - name: Parse triage outputs + id: parse + run: | + # Read fields from the JSON file written by the previous step; echoing + # the raw expression through the shell breaks on embedded quotes + echo "pr_type=$(jq -r '.pr_type' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "review_depth=$(jq -r '.suggested_review_depth' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "auto_mergeable=$(jq -r '.auto_mergeable' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + echo "risk_level=$(jq -r '.risk_level' /tmp/triage.json)" >> "$GITHUB_OUTPUT" + + - name: Upload triage artifact + uses: actions/upload-artifact@v4 + with: + name: triage-result + path: /tmp/triage.json + retention-days: 30 + + # =========================================================================== + # Step 2: Prompt Compilation — Claude Sonnet tailors the review prompt + # =========================================================================== + compile-prompt: + needs: [should-review, triage] + runs-on: ubuntu-latest + outputs: + compiled_prompt: ${{ steps.compile.outputs.prompt }} + steps: + # Checkout main branch — always use the canonical skill files, + # never the PR branch (which might have modified them) + - uses: actions/checkout@v4 + with: + ref: main + + - name: Verify skill files exist + run: | + if [ ! -f "review/SKILL.md" ]; then + echo "::error::review/SKILL.md not found on main branch" + exit 1 + fi + if [ !
-f "plan-eng-review/SKILL.md" ]; then + echo "::error::plan-eng-review/SKILL.md not found on main branch" + exit 1 + fi + echo "::notice::Skill files found — review/SKILL.md ($(wc -l < review/SKILL.md) lines), plan-eng-review/SKILL.md ($(wc -l < plan-eng-review/SKILL.md) lines)" + + - name: Download triage artifact + uses: actions/download-artifact@v4 + with: + name: triage-result + path: /tmp/ + + - name: Compile review prompt via Claude + id: compile + uses: anthropics/claude-code-action@v1 + with: + anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} + prompt: | + Read your instructions from: .github/gstack-review/compile-instructions.md + + Then read these input files: + + ## Input 1: Review Skill + Read the file at: review/SKILL.md + + ## Input 2: Engineering Review Skill + Read the file at: plan-eng-review/SKILL.md + + ## Input 3: Triage Classification + ```json + ${{ needs.triage.outputs.triage_json }} + ``` + + ## Input 4: Output Schema + Read the file at: .github/gstack-review/review-schema.json + + Now follow the compile-instructions.md to produce the compiled prompt. 
+ model: claude-sonnet-4-20250514 + claude_args: "--output-format=text" + + # =========================================================================== + # Step 3: Deep Review — Claude reviews the actual PR diff + # =========================================================================== + review: + needs: [should-review, triage, compile-prompt] + runs-on: ubuntu-latest + outputs: + review_json: ${{ steps.review.outputs.result }} + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 # full history for diff + + - name: Get PR diff + id: diff + env: + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + run: | + # Fetch the diff + gh pr diff "$PR_NUMBER" > /tmp/pr-diff.txt + + # Get diff stats + DIFF_LINES=$(wc -l < /tmp/pr-diff.txt) + echo "diff_lines=${DIFF_LINES}" >> "$GITHUB_OUTPUT" + + # For very large diffs, truncate to keep within token budget + if [ "$DIFF_LINES" -gt 3000 ]; then + echo "::warning::Large diff (${DIFF_LINES} lines) — truncating to 3000 lines" + head -n 3000 /tmp/pr-diff.txt > /tmp/pr-diff-truncated.txt + echo "[... diff truncated from ${DIFF_LINES} lines to 3000 ...]" >> /tmp/pr-diff-truncated.txt + mv /tmp/pr-diff-truncated.txt /tmp/pr-diff.txt + fi + + - name: Run Claude review + id: review + uses: anthropics/claude-code-action@v1 + with: + anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }} + prompt: | + You are performing an automated code review. Follow the system prompt below EXACTLY. + + ## System Instructions + ${{ needs.compile-prompt.outputs.compiled_prompt }} + + ## PR Information + - PR #${{ needs.should-review.outputs.pr_number }} + - Type: ${{ needs.triage.outputs.pr_type }} + - Risk: ${{ needs.triage.outputs.risk_level }} + - Review Depth: ${{ needs.triage.outputs.review_depth }} + + ## Task + 1. Read the PR diff by running: cat /tmp/pr-diff.txt + 2. Look at the actual source files for any findings that need more context + 3. 
Produce your review as a SINGLE JSON object conforming to the schema + + IMPORTANT: Your entire response must be ONLY the JSON object. No markdown fences. + No preamble. No explanation after the JSON. Just the raw JSON. + + Set review_metadata.prompt_version to "v1.0" + Set review_metadata.pr_type to "${{ needs.triage.outputs.pr_type }}" + Set review_metadata.review_depth to "${{ needs.triage.outputs.review_depth }}" + Set review_metadata.triage_source to the _source field from the triage: ${{ needs.triage.outputs.triage_json }} + model: claude-sonnet-4-20250514 + + - name: Save review result + env: + # Pass the model output via env; quoting it inline in the script + # would break as soon as the JSON contains a single quote + RESPONSE: ${{ steps.review.outputs.result }} + run: | + # Extract JSON from Claude's response (strip any accidental markdown) + echo "$RESPONSE" | sed 's/^```json//; s/^```//; s/```$//' | jq . > /tmp/review.json 2>/dev/null || { + echo "::error::Failed to parse review output as JSON" + echo "$RESPONSE" > /tmp/review-raw.txt + exit 1 + } + + echo "::group::Review Result" + jq . /tmp/review.json + echo "::endgroup::" + + - name: Upload review artifact + uses: actions/upload-artifact@v4 + with: + name: review-result + path: /tmp/review.json + retention-days: 90 + + # =========================================================================== + # Step 4: Route Action — Deterministic approve/reject/comment/merge + # =========================================================================== + route: + needs: [should-review, triage, review] + runs-on: ubuntu-latest + # Use a GitHub App token if available (needed for merge approval to count) + # Falls back to GITHUB_TOKEN for comment-only mode + steps: + - uses: actions/checkout@v4 + + - name: Download artifacts + uses: actions/download-artifact@v4 + with: + path: /tmp/artifacts/ + + - name: Generate GitHub App token + id: app-token + if: ${{ vars.APP_ID != '' }} + uses: actions/create-github-app-token@v1 + with: + app-id: ${{ vars.APP_ID }} + private-key: ${{ secrets.APP_PRIVATE_KEY }} + + - name: Route review action + env: +
GITHUB_TOKEN: ${{ steps.app-token.outputs.token || secrets.GITHUB_TOKEN }} + PR_NUMBER: ${{ needs.should-review.outputs.pr_number }} + AUTO_MERGE_ENABLED: ${{ vars.AUTO_MERGE_ENABLED || 'false' }} + run: | + chmod +x .github/gstack-review/route-action.sh + .github/gstack-review/route-action.sh \ + /tmp/artifacts/review-result/review.json \ + /tmp/artifacts/triage-result/triage.json \ No newline at end of file diff --git a/.github/workflows/skill-docs.yml b/.github/workflows/skill-docs.yml index ebb6c808..6bfeebd3 100644 --- a/.github/workflows/skill-docs.yml +++ b/.github/workflows/skill-docs.yml @@ -9,7 +9,7 @@ jobs: - run: bun install - name: Check Claude host freshness run: bun run gen:skill-docs - - run: git diff --exit-code || (echo "Generated SKILL.md files are stale. Run: bun run gen:skill-docs" && exit 1) + - run: git diff --exit-code || (echo "Generated SKILL.md files are stale. Run 'bun run gen:skill-docs'" && exit 1) - name: Check Codex host freshness run: bun run gen:skill-docs --host codex - - run: git diff --exit-code -- .agents/ || (echo "Generated Codex SKILL.md files are stale. Run: bun run gen:skill-docs --host codex" && exit 1) + - run: git diff --exit-code -- .agents/ || (echo "Generated Codex SKILL.md files are stale. Run 'bun run gen:skill-docs --host codex'" && exit 1)