feat(DRIVE): hybrid RAG (keyword + e5-large embedding) for NL edit grounding #153

Merged
dancinlife merged 1 commit into main from feat/drive-rag-hybrid
Jun 4, 2026

@dancinlife


What

DRIVE test-drive REPL now grounds the local model on curated fix-recipes via hybrid retrieval — keyword-precision first, embedding-recall fallback.

STAGE 1 keyword substring  →  exact, 0.02s, no model load (queries quote keys)
STAGE 2 in-process fastembed (multilingual-e5-large, query:/passage: prefixes)
        →  paraphrase fallback ONLY when STAGE 1 empty · conservative 0.80 thresh

Self-contained: no embedding server / LAN / VPN — fastembed runs in-process via ~/.drive-rag-venv. drive.hexa calls rag_retrieve.py, degrading to its inline keyword scan if the venv is absent. fix_recipes.txt is the runtime-editable KB.

Honest finding

Embedding alone underperforms keyword on this KB — MiniLM had an "area" attractor out-ranking correct matches; e5-large fixes it but still misses ~1/4 of in-vocabulary queries. Hence keyword-primary, embedding-fallback (not embedding-only).
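A comparison like the one above comes down to counting, per retriever, how many labeled queries fail to return the expected recipe. The helper below is a hypothetical sketch of that measurement; `miss_rate`, the `(query, expected_key)` case shape, and the `retrieve_top` callable are assumptions for illustration and are not part of the PR.

```python
from typing import Callable, Iterable, Optional


def miss_rate(
    cases: Iterable[tuple[str, str]],
    retrieve_top: Callable[[str], Optional[str]],
) -> float:
    """Fraction of (query, expected_key) cases where the top result is wrong.

    retrieve_top returns the key of the best match, or None for no match.
    """
    cases = list(cases)
    if not cases:
        return 0.0
    misses = sum(1 for query, expected in cases if retrieve_top(query) != expected)
    return misses / len(cases)
```

Running the same labeled set through a keyword-only retriever and an embedding-only retriever gives two directly comparable numbers.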

Validation — _drivesim 100-scenario harness (70 code + 30 NL-git)

| | result |
|---|---|
| final | 100/100 (96 first-pass + 4 model-flake recoveries on retry) |
| git NL (commit/push/branch) | 30/30 first-pass |
| harness fix | input race root-caused as CR-vs-LF (drive PTY input() needs \n); drive itself was correct |
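The CR-vs-LF fix amounts to making sure every line the harness writes to the PTY ends in `\n` rather than `\r`. The actual harness is an Expect script (drive_one.exp), so the Python helper below is only an illustration of the normalization; the name `to_pty_line` is hypothetical.

```python
def to_pty_line(text: str) -> bytes:
    """Return text as a single input line terminated by LF.

    Any trailing CR or LF characters are stripped first, so a line that
    arrives as "...\\r" or "...\\r\\n" is sent as "...\\n".
    """
    return (text.rstrip("\r\n") + "\n").encode("utf-8")
```

For example, `to_pty_line("commit my changes\r")` yields `b"commit my changes\n"`.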

Files

  • product: DRIVE/{drive.hexa, DRIVE.md, rag_retrieve.py, fix_recipes.txt}
  • harness: _drivesim/{gen.py, run.sh, run100.sh, retry.sh, drive_one.exp, manifest.json}
  • excluded (generated/cache): scenarios/, build/, fix_recipes.emb.json

🤖 Generated with Claude Code

…ounding

Ground the local test-drive model on curated fix-recipes so natural-language
edits land the correct change. Two-stage retrieval, precision-first:
- STAGE 1 keyword substring match (exact, 0.02s, no model load)
- STAGE 2 in-process fastembed (multilingual-e5-large, query:/passage: prefixes)
  as a paraphrase fallback only when STAGE 1 is empty; conservative 0.80 thresh.
Self-contained: no embedding server/LAN/VPN — fastembed runs in-process via
~/.drive-rag-venv. drive.hexa calls rag_retrieve.py, falling back to its inline
keyword scan if the venv is absent. fix_recipes.txt is the runtime-editable KB.

Honest finding: embedding alone underperforms keyword on this KB (MiniLM had an
"area" attractor that out-ranked correct matches; e5-large fixes it but still
misses ~1/4 of in-vocabulary queries) — hence keyword-primary, embedding-fallback.

Validated via _drivesim 100-scenario harness (70 code + 30 NL-git):
100/100 pass (96 first-pass + 4 model-flake recoveries on retry; all 30 git
scenarios first-pass). Root-caused the harness input race as a CR-vs-LF issue
(drive's PTY input() needs \n) — drive itself was correct.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dancinlife dancinlife merged commit 60adc9b into main Jun 4, 2026
@dancinlife dancinlife deleted the feat/drive-rag-hybrid branch June 4, 2026 20:23