research-scout turns Claude Code into a private research analyst. Arm your own research materials (PDFs, YouTube transcripts, articles) to control the evidence base, then run a single-domain pipeline that scouts primary sources by modality, chases each one down, and adversarially verifies it before it survives. Point it at a domain (Finance, Healthcare, Energy) with optional focus keywords and you get a structured, primary-sourced findings summary entirely under your control — weak findings get killed, not padded.
-
Python 3.11+ and Node 18+
-
Poppler — required for the armed-uploads PDF workflow. Claude Code reads PDFs via
pdftoppm(part of poppler); without it, uploaded PDFs cannot be mined and the uploads scout reportsDEGRADED: PDF unreadable — poppler not installed. Markdown/text/URL uploads work without it.OS Install Windows winget install poppler(orchoco install poppler), then reopen the terminal sopdftoppmis onPATHmacOS brew install popplerLinux sudo apt install poppler-utilsVerify with
pdftoppm -v.
git clone https://github.com/shiviancodes/research-scout
cd research-scoutFirst time only:
.\setup.ps1 # Windows
./setup.sh # Mac / LinuxEvery time:
.\start.ps1 # Windows
./start.sh # Mac / LinuxOr with make:
make setup # first time only
make start # every time
make stop # when doneOpen the URL printed by the start script (default: http://localhost:5173)
Research runs are driven by the /research slash command in Claude Code — one domain per run.
- (Optional) Arm uploads — drop materials into
inputs/<domain>/and set that domain's flag totrueininputs/settings.json(or use the dashboard Sources panel). Armed material becomes an extra scout modality for the next run, and the flag resets once consumed. - Open Claude Code at the project root and run:
claude - Run the pipeline:
/research energy "grid storage, municipal billing"- First argument is the domain —
finance|healthcare|energy. - Second argument (optional) is focus keywords that steer scouts and triage.
- Append
dryfor a cheap 2-scout / 2-candidate end-to-end test.
- First argument is the domain —
- Watch the stages report — scouts (parallel, one per modality) → triage → deep-dive → red-team (kills weak candidates) → assemble → synthesis. The orchestrator prints candidate counts per modality, kills with reasons, and the survivor count.
- View results — open the dashboard at
http://localhost:5173and refresh the History tab, or readoutputs/<domain>/<YYYY-WNN>-findings.mddirectly. - (Optional) Edit agents — the Config page lets you adjust the stage agent definitions, edit Standards, and restore factory defaults.
flowchart TB
subgraph input["Input"]
User["You<br/>Arm uploads (inputs/settings.json)<br/>Run /research domain focus"]
end
subgraph frontend["Frontend"]
Dashboard["React Dashboard @ :5173<br/>Home | Config | History | Research"]
end
subgraph backend["Backend"]
API["FastAPI @ :8766<br/>/api/inputs | /api/agents | /api/run"]
end
subgraph processing["AI Processing — /research pipeline (single domain)"]
direction TB
Scouts["Phase A · Scouts<br/>parallel, one per modality<br/>regulatory · earnings · audit · tenders · armed-uploads"]
Triage["Phase B · Triage<br/>dedup → rank → top 10"]
DeepDive["Phase C · Deep-dive<br/>one subagent per candidate"]
RedTeam["Phase D · Red-team<br/>adversarial verify · KILL by default"]
Assemble["Phase E · Assemble<br/>findings + registry + run log"]
Synthesis["Synthesis<br/>aggregate summary"]
Scouts --> Triage --> DeepDive --> RedTeam --> Assemble --> Synthesis
end
subgraph storage["Storage"]
Files["inputs/ - armed uploads<br/>outputs/ - findings + summaries<br/>findings-registry.json - cross-run memory<br/>.claude/agents/ - stage definitions"]
end
subgraph output["Output"]
Findings["Findings<br/>Problem | Source | Why now | Tags"]
end
User -->|Arm uploads| Dashboard
User -->|Run /research| Scouts
Dashboard -->|API calls| API
API -->|Serves| Dashboard
Files -->|Exclusion list| Scouts
Scouts -->|Read armed sources + web| Files
Assemble -->|Write findings + registry| Files
Files -->|Serve| API
Assemble -->|Generate| Findings
Dashboard -->|View| Findings
prompts/STANDARDS.md defines what makes a finding credible:
- Non-negotiables: Real named problem, primary source, defensible why-now
- Disqualifiers: No primary source, pure trend chasing, already solved
- Positive signals: sa-angle, regulatory-shift, contrarian, data-moat (help readers prioritise)
Edit directly or use Config → Standards in the dashboard. Changes take effect on the next run.
One run researches one domain through six stages: parallel scouts (one per source modality) → triage (dedup + rank to a shortlist) → deep-dive (one subagent per candidate, chasing the primary source) → red-team (adversarial verification) → assemble → synthesis. Driven by the /research slash command.
The red-team stage independently tries to kill every candidate and defaults to KILL when uncertain. Thin weeks (1–2 survivors) are expected output, not failure — better empty than generic. Weak findings are dropped, never padded.
Every candidate that reaches the red-team — survived or killed — is recorded in outputs/findings-registry.json. The next run builds an exclusion list from it so scouts don't resurface the same problems week over week.
The optional second argument to /research steers scouts and triage toward a theme (e.g. "grid storage, municipal billing") without hard-filtering — a strong off-focus signal still beats a weak on-focus one.
Arm your own research materials (PDFs, YouTube transcripts, markdown) per domain via inputs/settings.json (or the dashboard Sources panel). When armed, your uploads become an extra scout modality for the next run, and the flag resets once consumed.
Edit agent definitions in-browser:
- Standards — view/edit the quality bar reference
- Scout, Deep-dive, Red-team, Synthesis — edit the stage agent definitions
- Save posts changes; Restore-to-Default pulls factory copies
Standardised four-field findings (Problem, Source, Why now, Tags) make output actionable and machine-readable. A PostToolUse hook (scripts/lint_findings.py) checks the format mechanically on every findings-file write.
The synthesis agent aggregates findings by tag. No scoring, no tiers — humans decide what to investigate.
.claude/agents/ Stage agent definitions (scout, deep-dive, red-team, synthesis)
.claude/commands/ /research pipeline orchestrator
.claude/settings.json PostToolUse lint hook on findings writes
backend/ FastAPI - serves outputs/ + agent/source endpoints
frontend/ React + Vite dashboard
scripts/ Pipeline helpers (lint_findings, update_registry, log_run,
check_links, validate_domain)
outputs/ Generated artefacts
├─ energy/ Per-domain weekly findings (+ .run/ scratch dirs)
├─ finance/
├─ healthcare/
├─ summary/ Aggregated summaries per run
├─ findings-registry.json Cross-run finding memory (survivors + kills)
└─ run-log.jsonl Per-run telemetry
prompts/
├─ STANDARDS.md Quality bar
└─ sources/ Per-domain source packs, by scout modality
defaults/agents/ Factory copies of stage agents (restore-to-default)
inputs/ Armed research materials per domain (+ settings.json arming)
├─ energy/
├─ finance/
└─ healthcare/
docs/ Reference materials
| Service | Default | Variable |
|---|---|---|
| Backend | 8766 | BACKEND_PORT |
| Frontend | 5173 | FRONTEND_PORT |
Both are set in .env at the project root (copy .env.example to get started). Vite runs with strictPort: true - if a port is taken it fails loudly rather than silently shifting.
MIT - see LICENSE.

