
docs: publish benchmarks — token reduction vs grep/Repomix #19

@bntvllnt

Description

Summary

Publish quantitative benchmarks comparing codebase-intelligence against grep-based exploration and Repomix-style full-repo packing. codebase-memory-mcp's "99.2% fewer tokens" claim is the most-cited number in this space; we need equivalent data of our own.

Motivation

  • codebase-memory-mcp benchmarked across 64 real-world repos: 3,400 tokens vs 412,000 via file-by-file exploration
  • Repomix claims 70% token reduction via tree-sitter compression
  • We have NO published benchmarks despite having richer metrics
  • Benchmarks are marketing: they drive adoption decisions

Proposed Benchmarks

1. Token Reduction

  • Methodology: 5 common agent tasks (understand module, find hotspots, impact of change, find dead code, explore architecture)
  • Compare: tokens consumed via our MCP tools vs grep/read-all-files vs Repomix
  • Repos: 3-5 open-source TS projects of varying sizes (small/medium/large)
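The core metric above can be sketched in a few lines. A minimal, hypothetical helper, assuming we measure reduction as the percentage of baseline tokens saved; the chars/4 heuristic is a rough stand-in and the real script should use the target model's tokenizer:

```typescript
// Rough token estimate (~4 chars/token for English-like text).
// Placeholder only; swap in the actual tokenizer for published numbers.
function approximateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Percentage of baseline tokens saved by our approach.
function percentReduction(baselineTokens: number, ourTokens: number): number {
  return ((baselineTokens - ourTokens) / baselineTokens) * 100;
}
```

As a sanity check, plugging in codebase-memory-mcp's published numbers (412,000 baseline vs 3,400) reproduces their headline figure: `percentReduction(412000, 3400)` ≈ 99.2%.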

2. Accuracy

  • Methodology: For impact analysis, compare predicted affected files vs actual affected files (from real PR diffs)
  • Compare: our blast radius vs manual identification
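Scoring the accuracy comparison reduces to precision/recall over file sets. A sketch of how the benchmark script could score one PR, with hypothetical names (`precisionRecall` is not an existing API):

```typescript
// Score a predicted blast radius against the files a real PR actually touched.
// precision: how much of our prediction was correct;
// recall: how much of the real impact we caught.
function precisionRecall(predicted: string[], actual: string[]) {
  const actualSet = new Set(actual);
  const hits = predicted.filter((f) => actualSet.has(f)).length;
  return {
    precision: predicted.length ? hits / predicted.length : 0,
    recall: actual.length ? hits / actual.length : 0,
  };
}
```

Reporting both numbers matters: a blast radius that predicts every file in the repo gets perfect recall but useless precision, so the table should show both (or F1).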

3. Speed

  • Methodology: Parse + analyze time for repos of different sizes
  • Compare: cold start vs cached (git HEAD match)
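A minimal timing harness for the cold vs cached comparison, assuming the analysis entry point is injectable (the `analyze` callback below is a placeholder; the repo has not named its real entry point in this issue):

```typescript
// Time a single parse+analyze run in milliseconds.
// Run once on a fresh checkout (cold), then again with the cache
// keyed on git HEAD already populated (cached), and report both.
function timeRunMs(analyze: () => void): number {
  const start = Date.now();
  analyze();
  return Date.now() - start;
}
```

Cold and cached runs should be reported per repo size bucket (small/medium/large), with a median over several runs rather than a single sample to smooth out filesystem noise.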

Deliverables

  • Benchmark script (reproducible)
  • Results table in README
  • Blog post / docs page with methodology
  • Badge in README: "X% fewer tokens than grep"

Priority

Immediate: we can't prove value without numbers.
