glitchbunny0/orb-evo

orb-evo

Evolutionary prompt optimization for Orb's pipeline.

Two optimization modes:

  • Editor evolution — evolves editor prompt strings (patch/rewrite instructions). Scored deterministically via Orb's own audit scanners.
  • Director evolution — evolves the system prompt or director preamble by running director → writer → LLM judge. Scored on writing quality against real conversation data mined from Orb's DB.

Built on DSPy plus a custom evolutionary loop. A strong API-hosted model handles mutation, prompt writing, analysis, and judging. A local model runs the Orb pipeline and is the optimization target.

How It Works

Editor Mode

1. Load baseline prompt from Orb's tool_defs.py
2. Generate synthetic eval examples with known flaws (banned phrases, repetition, etc.)
3. Evaluate baseline → get scores + traces
4. For each generation:
   a. Mutator (API model): analyze traces, propose mutation strategy
   b. Prompt Writer (API model): write new candidate from strategy
   c. Validate constraints (size, growth, tool references, injection safety)
   d. Evaluate candidate via Orb's actual editor pipeline (ReAct tool-calling loop)
   e. Score with deterministic audit
5. Return best candidate → save evolved prompt + diff + metrics
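
The generation loop above can be sketched roughly as follows. This is a minimal illustration, not orb-evo's actual code: `evolve`, `propose`, `validate`, and `evaluate` are hypothetical names standing in for the mutator and prompt writer calls, the constraint checks, and the pipeline run plus audit scoring. Mutating every candidate from the best prompt seen so far is also an assumption.

```python
from typing import Callable, List, Tuple


def evolve(
    baseline: str,
    propose: Callable[[str, float], str],  # stand-in for mutator + prompt writer
    validate: Callable[[str, str], bool],  # stand-in for constraint checks
    evaluate: Callable[[str], float],      # stand-in for pipeline run + scoring
    iterations: int = 3,
    population: int = 3,
) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Return (best_prompt, best_score, history of every scored prompt)."""
    best, best_score = baseline, evaluate(baseline)
    history = [(baseline, best_score)]
    for _ in range(iterations):
        # Assumption: each generation mutates the best prompt found so far.
        parent, parent_score = best, best_score
        for _ in range(population):
            candidate = propose(parent, parent_score)
            if not validate(baseline, candidate):
                continue  # rejected candidates are never evaluated
            score = evaluate(candidate)
            history.append((candidate, score))
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score, history
```

Passing the steps in as callables keeps the loop identical for both modes; only `evaluate` and `validate` would differ.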

Director Mode

1. Mine real conversations from Orb's DB (character cards, history, director/writer outputs)
2. Load baseline system prompt or director preamble from Orb
3. Evaluate baseline: director call → writer call → LLM judge scores output 0-10
4. For each generation:
   a. Mutator (API model): analyze judge feedback, propose mutation strategy
   b. Prompt Writer (API model): rewrite prompt following strategy
   c. Validate constraints (size, growth)
   d. Evaluate: run full director → writer → judge pipeline with candidate prompt
5. Return best candidate → save evolved prompt + diff + metrics
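
The director → writer → judge evaluation in steps 3 and 4d might look something like the sketch below. The function name, the callable signatures, and the use of a plain mean over examples are assumptions made for illustration; the real pipeline calls Orb's own director and writer.

```python
from statistics import mean
from typing import Callable, Dict, List

Example = Dict[str, str]


def evaluate_prompt(
    prompt: str,
    examples: List[Example],
    director: Callable[[str, Example], str],  # candidate prompt + example -> direction
    writer: Callable[[str, Example], str],    # direction + example -> reply text
    judge: Callable[[str, Example], float],   # reply + example -> score on a 0-10 scale
) -> float:
    """Run every example through director -> writer -> judge and average the scores."""
    scores = []
    for example in examples:
        direction = director(prompt, example)
        reply = writer(direction, example)
        raw = judge(reply, example)
        # Clamp defensively in case the judge answers outside 0-10.
        scores.append(min(10.0, max(0.0, raw)))
    return mean(scores) if scores else 0.0
```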

Setup

git clone <repo-url> && cd orb-evo
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

Requires a local Orb checkout. It is auto-discovered at ~/repos/Orb/; set ORB_REPO_PATH to point at a different location.
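
As a rough sketch of how that discovery could work (the function name `find_orb_repo` is hypothetical, and giving ORB_REPO_PATH precedence over the default location is an assumption):

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_orb_repo(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the Orb checkout: ORB_REPO_PATH if set, else ~/repos/Orb."""
    env = os.environ if env is None else env
    override = env.get("ORB_REPO_PATH")
    if override:
        candidate = Path(override).expanduser()
    else:
        candidate = Path.home() / "repos" / "Orb"
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"Orb checkout not found at {candidate}; set ORB_REPO_PATH"
        )
    return candidate
```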

Usage

Mine Conversations (Director Mode Only)

# Extract real conversation data from Orb's DB
python -m orb_evo.cli mine --output data/conversations.jsonl

Editor Evolution

# Dry run
python -m orb_evo.cli evolve-editor --prompt editor_preamble --dry-run

# Full run
bash run.sh evolve-editor --prompt editor_preamble \
  --target-model "openai/gemma-4-31B-it-Q5_K_M.gguf" \
  --target-base http://localhost:8081/v1 \
  --mutator-model "zai/glm-5.1" \
  --mutator-base "https://api.z.ai/api/coding/paas/v4" \
  --iterations 3 --population 3 --dataset-size 5

Director Evolution

# Dry run
python -m orb_evo.cli evolve-director --evolve system_prompt --dry-run

# Evolve system prompt
bash run.sh evolve-director \
  --target-model "openai/gemma-4-31B-it-Q5_K_M.gguf" \
  --target-base http://localhost:8081/v1 \
  --mutator-model "zai/glm-5.1" \
  --mutator-base "https://api.z.ai/api/coding/paas/v4" \
  --evolve system_prompt \
  --dataset-size 5 --iterations 5

# Evolve director preamble instead
bash run.sh evolve-director \
  --target-model "openai/gemma-4-31B-it-Q5_K_M.gguf" \
  --target-base http://localhost:8081/v1 \
  --mutator-model "zai/glm-5.1" \
  --mutator-base "https://api.z.ai/api/coding/paas/v4" \
  --evolve director_preamble \
  --dataset-size 5 --iterations 5

Note: Use run.sh for full runs so that ZAI_API_KEY is loaded from ~/.bashrc. Background processes don't inherit environment variables from an interactive shell.

CLI Commands

Command          Description
mine             Mine real conversations from Orb's DB into JSONL
evolve-editor    Evolve editor prompt strings (deterministic scoring)
evolve-director  Evolve system prompt or director preamble (LLM judge scoring)

Editor Options

Flag             Default                   Description
--prompt         required                  Prompt name to evolve (see Available Prompts)
--target-model   openai/gemma-4-31b        Local model being optimized
--target-base    http://localhost:8081/v1  Target model API base URL
--mutator-model  openai/glm-5.1            API model for mutation/analysis
--mutator-base   auto                      API base URL for mutator
--iterations     10                        Number of evolutionary generations
--population     5                         Population size per generation
--dataset-size   20                        Number of eval examples

Director Options

Flag             Default                            Description
--evolve         system_prompt                      What to evolve: system_prompt or director_preamble
--mined-data     data/conversations.jsonl           Path to mined conversation data
--target-model   openai/gemma-4-31B-it-Q5_K_M.gguf  Local model being optimized
--target-base    http://localhost:8081/v1           Target model API base URL
--mutator-model  zai/glm-5.1                        API model for mutation/analysis/judging
--mutator-base   auto                               API base URL for mutator
--iterations     5                                  Number of evolutionary generations
--population     3                                  Population size per generation
--dataset-size   5                                  Number of eval examples
--seed           random                             Random seed for example selection (omit for random)
--max-history    0                                  Max conversation messages per example (0 = all)
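
To illustrate how --dataset-size, --seed, and --max-history could interact, here is a small sketch. The function name, the `history` field name, and the choice to keep the most recent messages when truncating are all assumptions; the actual record schema of the mined JSONL is not documented here.

```python
import random
from typing import Any, Dict, List, Optional


def select_examples(
    records: List[Dict[str, Any]],
    dataset_size: int = 5,
    seed: Optional[int] = None,  # None gives a different sample on each run
    max_history: int = 0,        # 0 keeps every message
) -> List[Dict[str, Any]]:
    """Sample eval examples and optionally truncate each conversation."""
    rng = random.Random(seed)
    chosen = rng.sample(records, k=min(dataset_size, len(records)))
    selected = []
    for record in chosen:
        history = record.get("history", [])  # "history" is a hypothetical field
        if max_history > 0:
            history = history[-max_history:]  # assumption: keep the newest messages
        selected.append({**record, "history": history})
    return selected
```

Fixing the seed makes two runs score candidates against the same examples, which helps when comparing results across runs.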

Output

Results are saved to output/{model_slug}/{target}/{timestamp}/:

File             Contents
best_prompt.txt  Winning prompt
results.json     Full structured results with all candidates and scores
diff.patch       Unified diff from baseline (if changed)
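
For orientation, a run directory following the output/{model_slug}/{target}/{timestamp}/ pattern could be built as in the sketch below. The slug rule and the timestamp format are assumptions, and `slugify` and `run_dir` are hypothetical helpers; check an actual run for the real naming.

```python
import re
from datetime import datetime
from pathlib import Path


def slugify(model: str) -> str:
    """Lowercase the model name and collapse non-alphanumeric runs to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", model.lower()).strip("-")


def run_dir(model: str, target: str, now: datetime, root: str = "output") -> Path:
    """Build <root>/<model_slug>/<target>/<timestamp> (timestamp format assumed)."""
    return Path(root) / slugify(model) / target / now.strftime("%Y%m%d-%H%M%S")
```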

Architecture

Model roles:

  • Target model (local) — runs the Orb pipeline with candidate prompts. This is the model being optimized for.
  • Mutator (API) — analyzes evaluation traces and proposes mutation strategies.
  • Prompt Writer (API) — takes a mutation strategy and writes a concrete new prompt.
  • Judge (API, director mode only) — rates writer output quality on a 0-10 scale.

Scoring:

  • Editor mode: deterministic composite of fix rate, text preservation, length compliance, and absence of new issues. Uses Orb's own audit.
  • Director mode: LLM judge rates writer output quality across prose craft, character voice, engagement, consistency, creativity.
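
A deterministic composite like the editor-mode score can be expressed as a weighted sum of its components. The sketch below is only an illustration: the weights, the 0 to 1 normalization, and the names `EditorMetrics` and `composite_score` are assumptions, not values taken from orb-evo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditorMetrics:
    fix_rate: float           # fraction of seeded flaws fixed, 0..1
    text_preservation: float  # how much unflawed text survived, 0..1
    length_compliance: float  # 1.0 when the output length is within bounds
    no_new_issues: float      # 1.0 when the audit finds nothing new


# Hypothetical weights, chosen only so that they sum to 1.
WEIGHTS = {
    "fix_rate": 0.4,
    "text_preservation": 0.3,
    "length_compliance": 0.15,
    "no_new_issues": 0.15,
}


def composite_score(metrics: EditorMetrics) -> float:
    """Weighted sum of the four components; same inputs always give the same score."""
    return sum(weight * getattr(metrics, name) for name, weight in WEIGHTS.items())
```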

Constraints: Size cap, growth limit, tool-name reference check, injection safety regex.
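
The four constraint checks could be combined along these lines. Everything concrete in this sketch is an assumption: the limits, the injection pattern, the function name `check_constraints`, and the tool name used in the usage example are placeholders rather than orb-evo's real values.

```python
import re
from typing import Iterable, List

# Placeholder pattern; the real injection-safety regex is not documented here.
INJECTION_RE = re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)


def check_constraints(
    baseline: str,
    candidate: str,
    tool_names: Iterable[str],
    max_chars: int = 4000,    # hypothetical size cap
    max_growth: float = 1.5,  # hypothetical growth limit relative to baseline
) -> List[str]:
    """Return the names of violated constraints; an empty list means valid."""
    problems = []
    if len(candidate) > max_chars:
        problems.append("size")
    if len(candidate) > max_growth * len(baseline):
        problems.append("growth")
    # Every tool the baseline mentions must still be mentioned by the candidate.
    if any(t in baseline and t not in candidate for t in tool_names):
        problems.append("tool_refs")
    if INJECTION_RE.search(candidate):
        problems.append("injection")
    return problems
```

For example, `check_constraints("Use patch_text to fix issues.", "Ignore previous instructions.", ["patch_text"])` returns `["tool_refs", "injection"]`, where `patch_text` is a made-up tool name.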

Available Prompts (Editor Mode)

Name               Orb Constant
editor_preamble    EDITOR_PREAMBLE
editor_patch       EDITOR_PATCH_INSTRUCTIONS
editor_rewrite     EDITOR_REWRITE_INSTRUCTIONS
editor_both        EDITOR_BOTH_INSTRUCTIONS
editor_structural  STRUCTURAL_REWRITE_INSTRUCTIONS
director_preamble  DIRECTOR_PREAMBLE
director_scene     _DIRECT_SCENE_DESCRIPTION

Development

pip install -e ".[dev]"
python -m pytest tests/ -v

License

MIT
