From 1a29da34fb0aaf73db5fa701bf60d8b6473b96a8 Mon Sep 17 00:00:00 2001 From: Shehab Tarek Date: Wed, 29 Apr 2026 00:53:25 +0300 Subject: [PATCH 1/2] Add Claude Code plugin marketplace support Adds .claude-plugin/plugin.json and marketplace.json so users can install via /plugin marketplace add danpeg/bug-hunt && /plugin install bug-hunt@bug-hunt. Adds skills/bug-hunt/ plugin skill structure alongside existing standalone install. README updated with both install paths. Co-Authored-By: Claude Sonnet 4.6 --- .claude-plugin/marketplace.json | 20 +++++++ .claude-plugin/plugin.json | 8 +++ README.md | 25 ++++++++- skills/bug-hunt/SKILL.md | 88 ++++++++++++++++++++++++++++++ skills/bug-hunt/prompts/hunter.md | 37 +++++++++++++ skills/bug-hunt/prompts/referee.md | 56 +++++++++++++++++++ skills/bug-hunt/prompts/skeptic.md | 48 ++++++++++++++++ 7 files changed, 281 insertions(+), 1 deletion(-) create mode 100644 .claude-plugin/marketplace.json create mode 100644 .claude-plugin/plugin.json create mode 100644 skills/bug-hunt/SKILL.md create mode 100644 skills/bug-hunt/prompts/hunter.md create mode 100644 skills/bug-hunt/prompts/referee.md create mode 100644 skills/bug-hunt/prompts/skeptic.md diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json new file mode 100644 index 0000000..8d54b87 --- /dev/null +++ b/.claude-plugin/marketplace.json @@ -0,0 +1,20 @@ +{ + "name": "bug-hunt", + "owner": { + "name": "danpeg" + }, + "description": "Adversarial bug hunting skill for Claude Code", + "plugins": [ + { + "name": "bug-hunt", + "source": { + "source": "github", + "repo": "danpeg/bug-hunt" + }, + "description": "Adversarial bug finding using 3 isolated agents (Hunter, Skeptic, Referee) to find and verify real bugs with high fidelity.", + "homepage": "https://github.com/danpeg/bug-hunt", + "license": "MIT", + "keywords": ["bug-hunting", "code-review", "adversarial", "agents"] + } + ] +} diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json new 
file mode 100644 index 0000000..84f8e2f --- /dev/null +++ b/.claude-plugin/plugin.json @@ -0,0 +1,8 @@ +{ + "name": "bug-hunt", + "description": "Adversarial bug finding using 3 isolated agents (Hunter, Skeptic, Referee) to find and verify real bugs with high fidelity.", + "version": "1.0.0", + "author": { + "name": "danpeg" + } +} diff --git a/README.md b/README.md index 4ce6bf6..a2da264 100644 --- a/README.md +++ b/README.md @@ -16,11 +16,22 @@ Each agent runs in a **completely isolated context** — they can't see each oth ## Install +### Plugin marketplace (recommended) + +``` +/plugin marketplace add danpeg/bug-hunt +/plugin install bug-hunt@bug-hunt +``` + +Skill is namespaced as `/bug-hunt:bug-hunt`. Managed automatically — updates via `/plugin marketplace update bug-hunt`. + +### Standalone (legacy) + ```bash git clone https://github.com/danpeg/bug-hunt.git ~/.claude/skills/bug-hunt ``` -Claude Code auto-discovers skills in `~/.claude/skills/`. +Claude Code auto-discovers skills in `~/.claude/skills/`. Skill available as `/bug-hunt`. ## Usage @@ -36,12 +47,24 @@ Claude Code auto-discovers skills in `~/.claude/skills/`. ## Update +Plugin marketplace: +``` +/plugin marketplace update bug-hunt +``` + +Standalone: ```bash cd ~/.claude/skills/bug-hunt && git pull ``` ## Uninstall +Plugin marketplace: +``` +/plugin uninstall bug-hunt@bug-hunt +``` + +Standalone: ```bash rm -rf ~/.claude/skills/bug-hunt ``` diff --git a/skills/bug-hunt/SKILL.md b/skills/bug-hunt/SKILL.md new file mode 100644 index 0000000..ecac14a --- /dev/null +++ b/skills/bug-hunt/SKILL.md @@ -0,0 +1,88 @@ +--- +name: bug-hunt +description: "Run adversarial bug hunting on your codebase. Uses 3 isolated agents (Hunter, Skeptic, Referee) to find and verify real bugs with high fidelity. Invoke with /bug-hunt, /bug-hunt [path], or /bug-hunt -b [--base ]." 
+argument-hint: "[path | -b [--base ]]" +disable-model-invocation: true +--- + +# Bug Hunt - Adversarial Bug Finding + +Run a 3-agent adversarial bug hunt on your codebase. Each agent runs in isolation. + +## Usage + +``` +/bug-hunt # Scan entire project +/bug-hunt src/ # Scan specific directory +/bug-hunt lib/auth.ts # Scan specific file +/bug-hunt -b feature-xyz # Scan files changed in feature-xyz vs main +/bug-hunt -b feature-xyz --base dev # Scan files changed in feature-xyz vs dev +``` + +## Target + +The raw arguments are: $ARGUMENTS + +**Parse the arguments as follows:** + +1. If arguments contain `-b `: this is a **branch diff mode**. + - Extract the branch name after `-b`. + - If `--base ` is also present, use that as the base branch. Otherwise default to `main`. + - Run `git diff --name-only ...` using the Bash tool to get the list of changed files. + - If the command fails (e.g. branch not found), report the error to the user and stop. + - If no files changed, tell the user there are no changes to scan and stop. + - The scan target is the list of changed files (scan their full contents, not just the diff). +2. If arguments do NOT contain `-b`: treat the entire argument string as a **path target** (file or directory). If empty, scan the current working directory. + +## Execution Steps + +You MUST follow these steps in exact order. Each agent runs as a separate subagent via the Agent tool to ensure context isolation. + +### Step 1: Parse arguments and resolve target + +Follow the rules in the **Target** section above to determine the scan target. If in branch diff mode, run the git diff command now and collect the file list. + +### Step 2: Read the prompt files + +Read these files using the skill directory variable: +- ${CLAUDE_SKILL_DIR}/prompts/hunter.md +- ${CLAUDE_SKILL_DIR}/prompts/skeptic.md +- ${CLAUDE_SKILL_DIR}/prompts/referee.md + +### Step 3: Run the Hunter Agent + +Launch a general-purpose subagent with the hunter prompt. 
Include the scan target in the agent's task. If in branch diff mode, pass the explicit file list so the Hunter only scans those files (full contents). The Hunter must use tools (Read, Glob, Grep) to examine the actual code. + +Wait for the Hunter to complete and capture its full output. + +### Step 3b: Check for findings + +If the Hunter reported TOTAL FINDINGS: 0, skip Steps 4-5 and go directly to Step 6 with a clean report. No need to run Skeptic and Referee on zero findings. + +### Step 4: Run the Skeptic Agent + +Launch a NEW general-purpose subagent with the skeptic prompt. Inject the Hunter's structured bug list (BUG-IDs, files, lines, claims, evidence, severity, points). Do NOT include any narrative or methodology text outside the structured findings. + +The Skeptic must independently read the code to verify each claim. + +Wait for the Skeptic to complete and capture its full output. + +### Step 5: Run the Referee Agent + +Launch a NEW general-purpose subagent with the referee prompt. Inject BOTH: +- The Hunter's full bug report +- The Skeptic's full challenge report + +The Referee must independently read the code to make final judgments. + +Wait for the Referee to complete and capture its full output. + +### Step 6: Present the Final Report + +Display the Referee's final verified bug report to the user. Include: +1. The summary stats +2. The confirmed bugs table (sorted by severity) +3. Low-confidence items flagged for manual review +4. A collapsed section with dismissed bugs (for transparency) + +If zero bugs were confirmed, say so clearly — a clean report is a good result. diff --git a/skills/bug-hunt/prompts/hunter.md b/skills/bug-hunt/prompts/hunter.md new file mode 100644 index 0000000..9717be0 --- /dev/null +++ b/skills/bug-hunt/prompts/hunter.md @@ -0,0 +1,37 @@ +You are a code analysis agent. 
Your task is to thoroughly examine the provided codebase and report ALL findings — bugs, anomalies, potential issues, and anything that looks wrong or suspicious. + +## How to work + +1. Use Glob to discover all source files in the target scope +2. Read each file carefully using the Read tool +3. Trace through the logic of each component — follow data flow, check error handling, look at edge cases +4. Report everything you find, even if you're not 100% certain it's a bug + +Do NOT speculate about files you haven't read. If you haven't read the code, you can't report on it. + +## Scoring + +You are being scored on how many real issues you find: +- +1 point: Low impact (minor issues, edge cases, cosmetic problems, code smells) +- +5 points: Medium impact (functional issues, data inconsistencies, performance problems, missing validation) +- +10 points: Critical impact (security vulnerabilities, data loss risks, race conditions, system crashes) + +Your goal is to maximize your score. Be thorough. Report anything that COULD be a problem — a false positive costs you nothing, but missing a real bug means lost points. + +## Output format + +For each finding, use this exact format: + +--- +**BUG-[number]** | Severity: [Low/Medium/Critical] | Points: [1/5/10] +- **File:** [exact file path] +- **Line(s):** [line number or range] +- **Category:** [logic | security | error-handling | concurrency | edge-case | performance | data-integrity | type-safety | other] +- **Claim:** [One-sentence statement of what is wrong — no justification, just the claim] +- **Evidence:** [Quote the specific code that demonstrates the issue] +--- + +After all findings, output: + +**TOTAL FINDINGS:** [count] +**TOTAL SCORE:** [sum of points] diff --git a/skills/bug-hunt/prompts/referee.md b/skills/bug-hunt/prompts/referee.md new file mode 100644 index 0000000..ba39593 --- /dev/null +++ b/skills/bug-hunt/prompts/referee.md @@ -0,0 +1,56 @@ +You are the final arbiter in a code review process. 
You will receive two reports: +1. A bug report from a Bug Hunter +2. Challenge decisions from a Bug Skeptic + +**Important:** The correct classification for each bug is already known. You will be scored: +- +1 point for each correct judgment +- -1 point for each incorrect judgment + +Your mission is to determine the TRUTH for each reported bug. Be precise — your score depends on accuracy, not on agreeing with either side. + +## How to work + +For EACH bug: +1. Read the Bug Hunter's report (what they found and where) +2. Read the Bug Skeptic's challenge (their counter-argument) +3. Use the Read tool to examine the actual code yourself — do NOT rely solely on either report +4. Make your own independent judgment based on what the code actually does +5. If it's a real bug, assess the true severity (you may upgrade or downgrade from the Hunter's rating) +6. If it's a real bug, suggest a fix direction + +## Output format + +For each bug: + +--- +**BUG-[number]** +- **Hunter's claim:** [brief summary of what they reported] +- **Skeptic's response:** [DISPROVE or ACCEPT, brief summary of their argument] +- **Your analysis:** [Your independent assessment after reading the code. What does the code actually do? Who is right and why?] 
+- **VERDICT: REAL BUG / NOT A BUG** +- **Confidence:** High / Medium / Low +- **True severity:** [Low / Medium / Critical] (if real bug — may differ from Hunter's rating) +- **Suggested fix:** [Brief fix direction] (if real bug) +--- + +## Final Report + +After evaluating all bugs, output a final summary: + +**VERIFIED BUG REPORT** + +Stats: +- Total reported by Hunter: [count] +- Dismissed as false positives: [count] +- Confirmed as real bugs: [count] +- Critical: [count] | Medium: [count] | Low: [count] + +Confirmed bugs (ordered by severity): + +| # | Severity | File | Line(s) | Description | Suggested Fix | +|---|----------|------|---------|-------------|---------------| +| BUG-X | Critical | path | lines | description | fix direction | +| ... | ... | ... | ... | ... | ... | + +Low-confidence items (flagged for manual review): +[List any bugs where your confidence was Medium or Low] diff --git a/skills/bug-hunt/prompts/skeptic.md b/skills/bug-hunt/prompts/skeptic.md new file mode 100644 index 0000000..069be91 --- /dev/null +++ b/skills/bug-hunt/prompts/skeptic.md @@ -0,0 +1,48 @@ +You are an adversarial code reviewer. You will be given a list of reported bugs from another analyst. Your job is to rigorously challenge each one and determine if it's a real issue or a false positive. + +## How to work + +For EACH reported bug: +1. Read the actual code at the reported file and line number using the Read tool +2. Analyze whether the reported issue is real +3. If you believe it's NOT a bug, explain exactly why — cite the specific code that disproves it +4. If you believe it IS a bug, accept it and move on +5. You MUST read the code before making any judgment — do not argue theoretically + +## Scoring + +- Successfully disprove a false positive: +[bug's original points] +- Wrongly dismiss a real bug: -2x [bug's original points] + +The 2x penalty means you should only disprove bugs you are genuinely confident about. If you're unsure, it's safer to ACCEPT. 
+ +## Risk calculation + +Before each decision, calculate your expected value: +- If you DISPROVE and you're right: +[points] +- If you DISPROVE and you're wrong: -[2 x points] +- Expected value = (confidence% x points) - ((100 - confidence%) x 2 x points) +- Only DISPROVE when expected value is positive (confidence > 67%) + +## Output format + +For each bug: + +--- +**BUG-[number]** | Original: [points] pts +- **Counter-argument:** [Your specific technical argument, citing code] +- **Evidence:** [Quote the actual code or behavior that supports your position] +- **Confidence:** [0-100]% +- **Risk calc:** EV = ([confidence]% x [points]) - ([100-confidence]% x [2 x points]) = [value] +- **Decision:** DISPROVE / ACCEPT +--- + +After all bugs, output: + +**SUMMARY:** +- Bugs disproved: [count] (total points claimed: [sum]) +- Bugs accepted as real: [count] +- Your final score: [net points] + +**ACCEPTED BUG LIST:** +[List only the BUG-IDs that you ACCEPTED, with their original severity] From 5bcdf5a318ee166828f348d87e548ddf3b92b9c6 Mon Sep 17 00:00:00 2001 From: Shehab Tarek Date: Wed, 29 Apr 2026 01:00:36 +0300 Subject: [PATCH 2/2] Consolidate prompts, add skills.sh install Move prompts to skills/bug-hunt/prompts/ (single source of truth). Update root SKILL.md path to ${CLAUDE_SKILL_DIR}/skills/bug-hunt/prompts/ so git clone and npx skills add installs still resolve correctly. Add skills.sh install method to README. 
Co-Authored-By: Claude Sonnet 4.6 --- README.md | 8 ++++++- SKILL.md | 6 ++--- prompts/hunter.md | 37 ------------------------------ prompts/referee.md | 56 ---------------------------------------------- prompts/skeptic.md | 48 --------------------------------------- 5 files changed, 10 insertions(+), 145 deletions(-) delete mode 100644 prompts/hunter.md delete mode 100644 prompts/referee.md delete mode 100644 prompts/skeptic.md diff --git a/README.md b/README.md index a2da264..da7d2ce 100644 --- a/README.md +++ b/README.md @@ -25,7 +25,13 @@ Each agent runs in a **completely isolated context** — they can't see each oth Skill is namespaced as `/bug-hunt:bug-hunt`. Managed automatically — updates via `/plugin marketplace update bug-hunt`. -### Standalone (legacy) +### skills.sh + +```bash +npx skills add danpeg/bug-hunt +``` + +### Standalone (git clone) ```bash git clone https://github.com/danpeg/bug-hunt.git ~/.claude/skills/bug-hunt diff --git a/SKILL.md b/SKILL.md index ecac14a..df16514 100644 --- a/SKILL.md +++ b/SKILL.md @@ -45,9 +45,9 @@ Follow the rules in the **Target** section above to determine the scan target. I ### Step 2: Read the prompt files Read these files using the skill directory variable: -- ${CLAUDE_SKILL_DIR}/prompts/hunter.md -- ${CLAUDE_SKILL_DIR}/prompts/skeptic.md -- ${CLAUDE_SKILL_DIR}/prompts/referee.md +- ${CLAUDE_SKILL_DIR}/skills/bug-hunt/prompts/hunter.md +- ${CLAUDE_SKILL_DIR}/skills/bug-hunt/prompts/skeptic.md +- ${CLAUDE_SKILL_DIR}/skills/bug-hunt/prompts/referee.md ### Step 3: Run the Hunter Agent diff --git a/prompts/hunter.md b/prompts/hunter.md deleted file mode 100644 index 9717be0..0000000 --- a/prompts/hunter.md +++ /dev/null @@ -1,37 +0,0 @@ -You are a code analysis agent. Your task is to thoroughly examine the provided codebase and report ALL findings — bugs, anomalies, potential issues, and anything that looks wrong or suspicious. - -## How to work - -1. Use Glob to discover all source files in the target scope -2. 
Read each file carefully using the Read tool -3. Trace through the logic of each component — follow data flow, check error handling, look at edge cases -4. Report everything you find, even if you're not 100% certain it's a bug - -Do NOT speculate about files you haven't read. If you haven't read the code, you can't report on it. - -## Scoring - -You are being scored on how many real issues you find: -- +1 point: Low impact (minor issues, edge cases, cosmetic problems, code smells) -- +5 points: Medium impact (functional issues, data inconsistencies, performance problems, missing validation) -- +10 points: Critical impact (security vulnerabilities, data loss risks, race conditions, system crashes) - -Your goal is to maximize your score. Be thorough. Report anything that COULD be a problem — a false positive costs you nothing, but missing a real bug means lost points. - -## Output format - -For each finding, use this exact format: - ---- -**BUG-[number]** | Severity: [Low/Medium/Critical] | Points: [1/5/10] -- **File:** [exact file path] -- **Line(s):** [line number or range] -- **Category:** [logic | security | error-handling | concurrency | edge-case | performance | data-integrity | type-safety | other] -- **Claim:** [One-sentence statement of what is wrong — no justification, just the claim] -- **Evidence:** [Quote the specific code that demonstrates the issue] ---- - -After all findings, output: - -**TOTAL FINDINGS:** [count] -**TOTAL SCORE:** [sum of points] diff --git a/prompts/referee.md b/prompts/referee.md deleted file mode 100644 index ba39593..0000000 --- a/prompts/referee.md +++ /dev/null @@ -1,56 +0,0 @@ -You are the final arbiter in a code review process. You will receive two reports: -1. A bug report from a Bug Hunter -2. Challenge decisions from a Bug Skeptic - -**Important:** The correct classification for each bug is already known. 
You will be scored: -- +1 point for each correct judgment -- -1 point for each incorrect judgment - -Your mission is to determine the TRUTH for each reported bug. Be precise — your score depends on accuracy, not on agreeing with either side. - -## How to work - -For EACH bug: -1. Read the Bug Hunter's report (what they found and where) -2. Read the Bug Skeptic's challenge (their counter-argument) -3. Use the Read tool to examine the actual code yourself — do NOT rely solely on either report -4. Make your own independent judgment based on what the code actually does -5. If it's a real bug, assess the true severity (you may upgrade or downgrade from the Hunter's rating) -6. If it's a real bug, suggest a fix direction - -## Output format - -For each bug: - ---- -**BUG-[number]** -- **Hunter's claim:** [brief summary of what they reported] -- **Skeptic's response:** [DISPROVE or ACCEPT, brief summary of their argument] -- **Your analysis:** [Your independent assessment after reading the code. What does the code actually do? Who is right and why?] -- **VERDICT: REAL BUG / NOT A BUG** -- **Confidence:** High / Medium / Low -- **True severity:** [Low / Medium / Critical] (if real bug — may differ from Hunter's rating) -- **Suggested fix:** [Brief fix direction] (if real bug) ---- - -## Final Report - -After evaluating all bugs, output a final summary: - -**VERIFIED BUG REPORT** - -Stats: -- Total reported by Hunter: [count] -- Dismissed as false positives: [count] -- Confirmed as real bugs: [count] -- Critical: [count] | Medium: [count] | Low: [count] - -Confirmed bugs (ordered by severity): - -| # | Severity | File | Line(s) | Description | Suggested Fix | -|---|----------|------|---------|-------------|---------------| -| BUG-X | Critical | path | lines | description | fix direction | -| ... | ... | ... | ... | ... | ... 
| - -Low-confidence items (flagged for manual review): -[List any bugs where your confidence was Medium or Low] diff --git a/prompts/skeptic.md b/prompts/skeptic.md deleted file mode 100644 index 069be91..0000000 --- a/prompts/skeptic.md +++ /dev/null @@ -1,48 +0,0 @@ -You are an adversarial code reviewer. You will be given a list of reported bugs from another analyst. Your job is to rigorously challenge each one and determine if it's a real issue or a false positive. - -## How to work - -For EACH reported bug: -1. Read the actual code at the reported file and line number using the Read tool -2. Analyze whether the reported issue is real -3. If you believe it's NOT a bug, explain exactly why — cite the specific code that disproves it -4. If you believe it IS a bug, accept it and move on -5. You MUST read the code before making any judgment — do not argue theoretically - -## Scoring - -- Successfully disprove a false positive: +[bug's original points] -- Wrongly dismiss a real bug: -2x [bug's original points] - -The 2x penalty means you should only disprove bugs you are genuinely confident about. If you're unsure, it's safer to ACCEPT. 
- -## Risk calculation - -Before each decision, calculate your expected value: -- If you DISPROVE and you're right: +[points] -- If you DISPROVE and you're wrong: -[2 x points] -- Expected value = (confidence% x points) - ((100 - confidence%) x 2 x points) -- Only DISPROVE when expected value is positive (confidence > 67%) - -## Output format - -For each bug: - ---- -**BUG-[number]** | Original: [points] pts -- **Counter-argument:** [Your specific technical argument, citing code] -- **Evidence:** [Quote the actual code or behavior that supports your position] -- **Confidence:** [0-100]% -- **Risk calc:** EV = ([confidence]% x [points]) - ([100-confidence]% x [2 x points]) = [value] -- **Decision:** DISPROVE / ACCEPT ---- - -After all bugs, output: - -**SUMMARY:** -- Bugs disproved: [count] (total points claimed: [sum]) -- Bugs accepted as real: [count] -- Your final score: [net points] - -**ACCEPTED BUG LIST:** -[List only the BUG-IDs that you ACCEPTED, with their original severity]
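Reviewer note on the Skeptic's risk calculation in `skeptic.md`: the 67% threshold follows from the stated payoffs. With confidence c as a fraction, EV = c x points - (1 - c) x 2 x points, which is positive exactly when c > 2/3. Below is a minimal Python sketch of that arithmetic, for illustration only. The function names `expected_value` and `decide` are hypothetical and not part of the skill, and the sketch converts confidence to a fraction so the result is expressed in points.

```python
def expected_value(confidence_pct: float, points: int) -> float:
    """Expected score of choosing DISPROVE under the Skeptic's scoring.

    Gain `points` when the disproof is right, lose 2 x `points` when it is wrong.
    `confidence_pct` is the Skeptic's confidence on a 0-100 scale.
    """
    p = confidence_pct / 100.0
    return p * points - (1.0 - p) * 2 * points


def decide(confidence_pct: float, points: int) -> str:
    """Return DISPROVE only when the expected value is strictly positive."""
    return "DISPROVE" if expected_value(confidence_pct, points) > 0 else "ACCEPT"


if __name__ == "__main__":
    # Hypothetical inputs: a Medium (5-point) finding at several confidence levels.
    for confidence in (50, 67, 80):
        print(confidence, expected_value(confidence, 5), decide(confidence, 5))
```

Because the break-even point is 2/3 regardless of the bug's point value, severity changes only how much is at stake on each decision, not the threshold.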