Give an agent a target molecule and a host. It comes back with a biosynthetic discovery campaign: route families, enzyme-mining lanes, dark-step hypotheses, stitched pathways, construct ideas, and a compute-ready plan.
BioProspector is a local-first, agent-operated skill for discovering biosynthetic pathways. You describe the target in plain language, and the agent expands the route space, mines enzyme and gene candidates, reasons through the steps nobody has solved yet, checks whether the route fits your host, and hands back review-ready options your team can act on.
It runs on the compute you already have. Everything works from a laptop, and a single search lane escalates to RunPod, HPC, a cloud VM, or AWS ElasticBLAST only when that lane earns it. It also drops into whatever agent harness you run: Claude Code, Codex, Symphony + Linear, or your own tracker. One campaign drives them all, so the harness stays a deployment choice.
Three public example campaigns ship ready to run: vanillin, nootkatone, and Huperzine A. All are target-swappable for your own molecule.
%%{init:{'theme':'base','flowchart':{'htmlLabels':false,'padding':16,'subGraphTitleMargin':{'top':10,'bottom':18}},'themeVariables':{'fontFamily':'Menlo, Consolas, monospace','lineColor':'#7a7a7a','clusterBkg':'#0c0c0c','clusterBorder':'#3a3a3a','titleColor':'#dcdcdc'}}}%%
flowchart LR
classDef io fill:#0c0c0c,stroke:#5a5a5a,color:#ededed,stroke-width:1.5px
classDef accent fill:#0c0c0c,stroke:#bdf0a0,color:#bdf0a0,stroke-width:1.5px
A("TARGET MOLECULE<br/>+ HOST"):::io
B("EXPAND THE<br/>ROUTE SPACE"):::io
C("MINE ENZYME +<br/>GENE CANDIDATES"):::io
D("RESOLVE DARK STEPS<br/>STITCH · HOST-FIT"):::io
E("CONSTRUCT<br/>HYPOTHESES"):::accent
F("COMPUTE-READY<br/>WORK GRAPH"):::accent
A --> B --> C --> D --> E
D --> F
A campaign does the hard parts:
- Expands the route space: natural, engineered, fed-substrate, analog, reverse-catabolism, dark-step, and de novo families, so the agent weighs real alternatives early.
- Reasons about the unknown: missing chemistry, unknown genes, and hidden multi-gene steps become explicit, testable hypotheses with the counterevidence attached.
- Mines every reaction step: shortlists the genes, summarizes the domains, keeps the source pointers, and records what was rejected so the next agent skips known dead ends.
- Returns several routes: the minimal-gene option, the strongest-evidence option, the best host-fit, and an ambitious one, each with its trade-offs.
- Keeps claims honest: planning, execution, evidence, and validation stay separate, so the agent never reports a pathway as built or validated before the evidence exists.
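The "returns several routes" behavior can be pictured as selecting one route per lens instead of a single overall winner. The sketch below is a minimal illustration under stated assumptions: the `Route` fields, the scores, and the reading of "ambitious" as "most unsolved steps" are all hypothetical and are not BioProspector's ledger schema or ranking logic.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    # Hypothetical fields for illustration only; not the real ledger schema.
    name: str
    gene_count: int        # heterologous genes the route would need
    evidence_score: float  # 0..1, how well supported the steps are
    host_fit: float        # 0..1, how well the route suits the host
    dark_steps: int        # steps with no known enzyme


def pick_options(routes: list[Route]) -> dict[str, Route]:
    """Return one route per lens rather than a single overall winner."""
    return {
        "minimal-gene": min(routes, key=lambda r: r.gene_count),
        "strongest-evidence": max(routes, key=lambda r: r.evidence_score),
        "best host-fit": max(routes, key=lambda r: r.host_fit),
        # Assumption: "ambitious" means the route with the most unsolved steps.
        "ambitious": max(routes, key=lambda r: r.dark_steps),
    }


# Illustrative values only; they describe no real campaign.
routes = [
    Route("fed-substrate", gene_count=2, evidence_score=0.8, host_fit=0.6, dark_steps=0),
    Route("natural", gene_count=5, evidence_score=0.9, host_fit=0.5, dark_steps=1),
    Route("engineered", gene_count=4, evidence_score=0.6, host_fit=0.9, dark_steps=1),
    Route("de novo", gene_count=7, evidence_score=0.3, host_fit=0.4, dark_steps=3),
]

options = pick_options(routes)
for label, route in options.items():
    print(f"{label}: {route.name}")
```

Keeping each lens separate means the trade-offs stay visible: no weighted score hides that the cheapest route and the best-supported route are different routes.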
%%{init:{'theme':'base','flowchart':{'htmlLabels':false,'padding':16,'subGraphTitleMargin':{'top':10,'bottom':18}},'themeVariables':{'fontFamily':'Menlo, Consolas, monospace','lineColor':'#7a7a7a','clusterBkg':'#0c0c0c','clusterBorder':'#3a3a3a','titleColor':'#dcdcdc'}}}%%
flowchart TD
classDef wide fill:#0c0c0c,stroke:#5a5a5a,color:#ededed,stroke-width:1.5px
classDef mid fill:#0c0c0c,stroke:#5a5a5a,color:#ededed,stroke-width:1.5px
classDef win fill:#0c0c0c,stroke:#bdf0a0,color:#bdf0a0,stroke-width:1.5px
T("TARGET MOLECULE + HOST"):::mid
subgraph EX["EXPLORE: keep the weird options alive"]
direction LR
R1("natural"):::wide
R2("engineered"):::wide
R3("fed-substrate"):::wide
R4("analog"):::wide
R5("reverse-catabolism"):::wide
R6("dark-step / de novo"):::wide
R1 ~~~ R2 ~~~ R3 ~~~ R4 ~~~ R5 ~~~ R6
end
M("MINE + RESOLVE + STITCH"):::mid
subgraph WIN["RETURN SEVERAL ROUTES"]
direction LR
P1("minimal-gene"):::win
P2("strongest-evidence"):::win
P3("best host-fit"):::win
P4("ambitious de novo"):::win
P1 ~~~ P2 ~~~ P3 ~~~ P4
end
T --> EX --> M --> WIN
%%{init:{'theme':'base','flowchart':{'htmlLabels':false,'padding':16,'subGraphTitleMargin':{'top':10,'bottom':18}},'themeVariables':{'fontFamily':'Menlo, Consolas, monospace','lineColor':'#7a7a7a','clusterBkg':'#0c0c0c','clusterBorder':'#3a3a3a','titleColor':'#dcdcdc'}}}%%
flowchart TD
classDef rung fill:#0c0c0c,stroke:#5a5a5a,color:#ededed,stroke-width:1.5px
classDef gate fill:#0c0c0c,stroke:#e0825c,color:#e0825c,stroke-width:1.5px
classDef claim fill:#0c0c0c,stroke:#bdf0a0,color:#bdf0a0,stroke-width:1.5px
L0("PLAN"):::rung
L1("TOOLS READY"):::rung
L2("INPUTS REAL"):::rung
L3("EXECUTION HAPPENED"):::rung
L4("EVIDENCE JOINED"):::rung
L5("AUDITED, EVIDENCE-BACKED CLAIMS"):::claim
G1{{"real execution proof"}}:::gate
G2{{"joins to the target + controls pass"}}:::gate
L0 --> L1 --> L2 --> G1 --> L3 --> G2 --> L4 --> L5
The public examples are planning-only. They stop before any validated claim, because no real search has run.
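One way to read the ladder above is as a strictly ordered set of rungs, where a later rung cannot be claimed while an earlier one is unmet. The sketch below takes the rung names and the two gates from the diagram; the boolean flags, the `highest_level` function, and the conditions for the ungated rungs are assumptions made for illustration, not the repository's gate implementation.

```python
from enum import IntEnum


class Level(IntEnum):
    PLAN = 0
    TOOLS_READY = 1
    INPUTS_REAL = 2
    EXECUTION_HAPPENED = 3
    EVIDENCE_JOINED = 4
    AUDITED_CLAIMS = 5


def highest_level(
    *,
    tools_ready: bool = False,
    inputs_real: bool = False,
    execution_proof: bool = False,          # gate: real execution proof
    joins_and_controls_pass: bool = False,  # gate: joins to the target + controls pass
    audited: bool = False,
) -> Level:
    """Climb rung by rung and stop at the first unmet condition."""
    checks = [
        (Level.TOOLS_READY, tools_ready),
        (Level.INPUTS_REAL, inputs_real),
        (Level.EXECUTION_HAPPENED, execution_proof),
        (Level.EVIDENCE_JOINED, joins_and_controls_pass),
        (Level.AUDITED_CLAIMS, audited),
    ]
    level = Level.PLAN
    for rung, passed in checks:
        if not passed:
            break  # later flags cannot lift a claim past an unmet rung
        level = rung
    return level


# A planning-only campaign stays on the first rung.
planning_only = highest_level()

# Without execution proof, later evidence flags do not raise the level.
no_proof = highest_level(
    tools_ready=True, inputs_real=True,
    joins_and_controls_pass=True, audited=True,
)

print(planning_only.name, no_proof.name)  # PLAN INPUTS_REAL
```

The early `break` is the point of the design: a missing gate caps everything above it, so a claim cannot skip ahead of its evidence.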
Start on a laptop. Escalate one lane to heavier compute only when the science calls for it, and only after you approve the budget and credentials outside this repo. Switch agent harnesses without rewriting the campaign.
%%{init:{'theme':'base','flowchart':{'htmlLabels':false,'padding':16,'subGraphTitleMargin':{'top':10,'bottom':18}},'themeVariables':{'fontFamily':'Menlo, Consolas, monospace','lineColor':'#7a7a7a','clusterBkg':'#0c0c0c','clusterBorder':'#3a3a3a','titleColor':'#dcdcdc'}}}%%
flowchart LR
classDef io fill:#0c0c0c,stroke:#5a5a5a,color:#ededed,stroke-width:1.5px
classDef hub fill:#0c0c0c,stroke:#bdf0a0,color:#bdf0a0,stroke-width:1.5px
subgraph H["ANY AGENT HARNESS"]
direction TB
H1("Claude Code"):::io
H2("Codex"):::io
H3("Symphony + Linear"):::io
H4("your tracker"):::io
end
C(("ONE CAMPAIGN<br/>CONTRACT")):::hub
subgraph P["COMPUTE YOU CHOOSE"]
direction TB
P1("laptop"):::io
P2("RunPod"):::io
P3("HPC / SSH"):::io
P4("cloud / neocloud VM"):::io
P5("AWS ElasticBLAST"):::io
end
H1 --> C
H2 --> C
H3 --> C
H4 --> C
C --> P1
C --> P2
C --> P3
C --> P4
C --> P5
The checkout stays small, forkable, and auditable. It carries the skill, prompts, schemas, validators, and the compact summaries and rankings a campaign produces. The heavy data (raw reads, database snapshots, model weights, full search outputs) stays in storage you own. The checkout holds pointers and checksums into it.
%%{init:{'theme':'base','flowchart':{'htmlLabels':false,'padding':16,'subGraphTitleMargin':{'top':10,'bottom':18}},'themeVariables':{'fontFamily':'Menlo, Consolas, monospace','lineColor':'#7a7a7a','clusterBkg':'#0c0c0c','clusterBorder':'#3a3a3a','titleColor':'#dcdcdc'}}}%%
flowchart LR
classDef repo fill:#0c0c0c,stroke:#bdf0a0,color:#bdf0a0,stroke-width:1.5px
classDef ext fill:#0c0c0c,stroke:#e0825c,color:#e0825c,stroke-width:1.5px
subgraph IN["IN THE CHECKOUT · small · forkable · auditable"]
direction TB
R1("skill + prompts"):::repo
R2("schemas + validators"):::repo
R3("summaries · rankings"):::repo
R4("pointers + checksums"):::repo
end
subgraph OUT["OPERATOR-OWNED · heavy · stays put"]
direction TB
E1("raw reads / FASTA"):::ext
E2("database snapshots · model weights"):::ext
E3("full search outputs · provider workdirs"):::ext
end
R4 -. "reference by path + checksum" .-> OUT
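The "pointers and checksums" idea can be sketched as a small record that stores a path and a SHA-256 digest, plus a check that the heavy file still matches. This is a minimal sketch, assuming a hypothetical `Pointer` record and helper names; the repository's actual pointer format is defined by its own schemas and may differ. The FASTA content is a toy placeholder.

```python
import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Pointer:
    # Hypothetical record shape; not the repository's schema.
    path: str
    sha256: str


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_pointer(path: Path) -> Pointer:
    """Record where the heavy file lives and what it hashed to."""
    return Pointer(path=str(path), sha256=sha256_of(path))


def verify(pointer: Pointer) -> bool:
    """True only if the file exists and still matches the recorded digest."""
    target = Path(pointer.path)
    return target.is_file() and sha256_of(target) == pointer.sha256


with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "candidates.fasta"
    fasta.write_text(">seq1\nMKT\n")   # toy placeholder sequence
    pointer = make_pointer(fasta)       # this small record is what the checkout keeps
    ok_before = verify(pointer)
    fasta.write_text(">seq1\nMKV\n")   # simulate the external file changing
    ok_after = verify(pointer)

print(ok_before, ok_after)  # True False
```

Only the `Pointer` record would need to live in the checkout; the file it describes stays in operator-owned storage, and a failed `verify` signals that the external data has drifted from what the campaign recorded.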
skills/bioprospector/   the skill: SKILL.md, CLIs, example campaigns, references
docs/                   user and agent documentation (start with QUICKSTART.md)
templates/              issue templates the agent draws from
demos/                  demo maps and sample outputs
schemas/                shared campaign + ledger contracts
src/                    installable bioprospector CLI
tests/                  validators and contract checks
You don't need to run any of this yourself; your agent does. The two commands below confirm that the skill installed cleanly and show what it produces:
python3 skills/bioprospector/scripts/bioprospector_doctor.py --include-runtime
make local-demo
In about five minutes you get a campaign on the Huperzine A example: the explored route space, mined candidates with source pointers, a ranked set of routes, a metadata-only gene-cluster plan, and a compact review package, with every claim labeled for how far the evidence goes.
New here? Start with docs/QUICKSTART.md and docs/WORKFLOWS.md. To run a campaign for your own molecule, see docs/FIRST_CAMPAIGN.md. Copy-paste agent prompts live in docs/AGENT_PLAYBOOK.md.
Once the skill is installed, describe the campaign and the agent runs it:
Use the bioprospector skill in this checkout. Run doctor, keep everything local,
and start a campaign for <target molecule> in <host>. Explore the route space,
draft construct-oriented work lanes, and return a short review package under .runtime/.

Use BioProspector to resolve the dark steps in the Huperzine A example: turn the
unknown chemistry into single-gene and multi-gene hypotheses with counterevidence,
then tell me the cheapest next experiment that would tell them apart.
BioProspector plans and reasons. It does not validate biology, and on its own it never claims that a route is produced, validated in host, assay-proven, or production-ready. Those claims need real execution, joined evidence, controls, and expert review. See NON_CLAIMS.md for the boundary and docs/no-false-success-gates.md for how the gates work.
docs/PUBLIC_LAUNCH_PAD.md is the full capability map and workflow reference. The canonical agent skill is skills/bioprospector/SKILL.md. The command surface is in docs/CLI_REFERENCE.md, and the repository boundary is detailed in docs/PRIVACY_SECURITY_MODEL.md.
Under the Hood: The Artifact Contract
A campaign is backed by a set of compact, versioned ledgers and review artifacts: route and reaction-step ledgers, candidate funnels and rankings, dark-step and unknown-gene hypotheses, evidence-event and proof rows, provider readiness bundles, claim records, and the review dossier that indexes them. This is the machinery behind the behaviors above; you rarely touch it directly. The full list and shared contract live in docs/capability-map.md and schemas/bioprospector-ledgers.json.
