CodeGraph MCP: persistent learning for your coding agent, shared across a codebase and team, plus a token-saving superpower.


Codegraph MCP

A personal Rust MCP server that gives AI coding agents persistent code understanding and memory across sessions. Built to explore what happens when you give an LLM structured access to a codebase's symbol graph, session state, and accumulated project knowledge.

What It Does

Codegraph runs as an MCP server (stdio transport) and exposes 26 tools that an AI agent can call:

  • Code Graph — Parses source code with tree-sitter (Rust, TypeScript, JavaScript, Python, Go), extracts symbols and their relationships (calls, imports, inherits), and stores them as a directed graph. The agent can search symbols, traverse dependencies, and understand file structure without reading entire files.

  • Session Memory — Tracks the agent's current task, subtasks, decisions, and working context. Survives context window compaction so the agent can resume where it left off.

  • Learning System — Records patterns (things that worked), failures (gotchas to avoid), and solution lineage (attempt chains with outcomes). A reflection engine converts outcomes into reusable knowledge. A suggestion system combines all three to recommend approaches for new tasks.

  • Skill Distillation — Generates a SKILL.md from accumulated patterns, failures, and conventions — a machine-readable summary of project-specific knowledge.

  • Cross-Language Inference — Detects REST/GraphQL calls in frontend code and matches them to backend route definitions.

  • Bash Compression — Compresses verbose command output (git status, test results, directory listings) to reduce token usage.
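The dependency traversal behind the first bullet can be sketched in std-only Rust. The real store is built on petgraph; this simplified adjacency map, with a hypothetical `main` / `bootstrap` call chain, shows the idea:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first traversal over a symbol "calls" graph: returns every
/// symbol reachable from `start`, i.e. its transitive dependencies.
fn transitive_deps(graph: &HashMap<&str, Vec<&str>>, start: &str) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::from([start]);
    let mut out = Vec::new();
    while let Some(sym) = queue.pop_front() {
        for &dep in graph.get(sym).into_iter().flatten() {
            if seen.insert(dep) {
                out.push(dep.to_string());
                queue.push_back(dep);
            }
        }
    }
    out
}

fn main() {
    // Hypothetical call edges: main -> bootstrap -> {load_config, connect_db}
    let graph = HashMap::from([
        ("main", vec!["bootstrap"]),
        ("bootstrap", vec!["load_config", "connect_db"]),
    ]);
    println!("{:?}", transitive_deps(&graph, "main"));
}
```

Answering "what does `main` transitively call?" this way costs tokens proportional to the answer, not to the size of the files involved.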

Tech Stack

| Component | Technology |
|-----------|------------|
| Language | Rust (async, tokio) |
| Protocol | MCP over stdio (JSON-RPC 2.0) |
| Parsing | tree-sitter (5 language grammars) |
| Graph | petgraph (directed graph with BFS traversal) |
| Storage | libSQL / SQLite (two databases: code graph + learning) |
| Hashing | xxh3 (content-based change detection) |
| Config | TOML |
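Content-based change detection is what keeps incremental indexing cheap: a file is re-parsed only when its content hash differs from the stored one. A minimal sketch, substituting std's `DefaultHasher` for xxh3 so it runs without external crates:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash a file's contents. The real server uses xxh3; std's DefaultHasher
/// stands in here so the sketch needs no external crates.
fn content_hash(contents: &str) -> u64 {
    let mut h = DefaultHasher::new();
    contents.hash(&mut h);
    h.finish()
}

/// Incremental indexing check: a file needs re-parsing only when its
/// stored hash no longer matches its current contents.
fn needs_reindex(index: &HashMap<String, u64>, path: &str, contents: &str) -> bool {
    index.get(path) != Some(&content_hash(contents))
}

fn main() {
    let mut index = HashMap::new();
    println!("{}", needs_reindex(&index, "src/main.rs", "fn main() {}")); // unseen file
    index.insert("src/main.rs".to_string(), content_hash("fn main() {}"));
    println!("{}", needs_reindex(&index, "src/main.rs", "fn main() {}")); // unchanged file
}
```

Because only one hash is stored per file, re-running the indexer over an unchanged tree never touches the parser.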

Architecture

```
src/
├── main.rs              # Entry point
├── config.rs            # Project root detection, config.toml
├── mcp/                 # MCP protocol layer
│   ├── protocol.rs      # JSON-RPC 2.0 + MCP types
│   ├── transport.rs     # Stdio transport
│   ├── server.rs        # Request dispatch, lazy init
│   └── tools.rs         # Tool registry (26 tools)
├── store/               # Persistence
│   ├── db.rs            # SQLite CRUD
│   ├── graph.rs         # In-memory petgraph
│   └── migrations.rs    # Schema versioning
├── code/                # Code analysis
│   ├── parser.rs        # tree-sitter symbol extraction
│   ├── indexer.rs       # Incremental indexing + cross-file resolution
│   ├── languages.rs     # Language configs + grammars
│   └── cross_language.rs
├── session/             # Session state machine
│   └── state.rs         # Task, decisions, context tracking
├── learning/            # Learning system
│   ├── patterns.rs      # Pattern storage + scoped queries
│   ├── failures.rs      # Failure records + severity
│   ├── confidence.rs    # Time decay + drift detection
│   ├── lineage.rs       # Solution attempt tracking
│   ├── reflection.rs    # Outcome → pattern/failure conversion
│   ├── niches.rs        # Behavioral clustering
│   └── sync.rs          # JSON export
├── skill/               # Skill distillation
│   ├── distill.rs       # Pattern → SKILL.md generation
│   ├── conventions.rs   # Convention clustering
│   └── render.rs        # Markdown rendering
└── compress/            # Token-saving output compression
    ├── bash.rs          # Command dispatch
    ├── git.rs           # Git output compression
    ├── test_output.rs   # Test result compression
    └── analytics.rs     # Savings tracking
```
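The stdio transport frames each JSON-RPC 2.0 message as a single newline-terminated JSON object. A minimal, std-only sketch of that framing (assuming nothing about the real `transport.rs` beyond newline-delimited JSON):

```rust
use std::io::{BufRead, Write};

/// Write one JSON-RPC message: the serialized object followed by '\n'.
fn write_message(out: &mut impl Write, json: &str) -> std::io::Result<()> {
    out.write_all(json.as_bytes())?;
    out.write_all(b"\n")?;
    out.flush()
}

/// Read one message; `None` means the peer closed the pipe.
fn read_message(input: &mut impl BufRead) -> std::io::Result<Option<String>> {
    let mut line = String::new();
    if input.read_line(&mut line)? == 0 {
        return Ok(None); // EOF
    }
    Ok(Some(line.trim_end().to_string()))
}

fn main() -> std::io::Result<()> {
    // Round-trip an `initialize` request through an in-memory buffer.
    let mut buf = Vec::new();
    write_message(&mut buf, r#"{"jsonrpc":"2.0","id":1,"method":"initialize"}"#)?;
    let mut reader = std::io::Cursor::new(buf);
    println!("{:?}", read_message(&mut reader)?);
    Ok(())
}
```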

Setup

1. Build

```sh
git clone https://github.com/websines/codegraph-mcp.git
cd codegraph-mcp
cargo build --release
```

The binary will be at `target/release/codegraph`.

2. Add to your MCP client

The server communicates over stdio (newline-delimited JSON-RPC 2.0) — your MCP client must connect via stdio transport, not HTTP/SSE. Add it to whichever MCP client you use:

Claude Code (`~/.claude.json`):

```json
{
  "mcpServers": {
    "codegraph": {
      "command": "/absolute/path/to/codegraph-mcp/target/release/codegraph",
      "type": "stdio"
    }
  }
}
```

Cursor (`.cursor/mcp.json` in your project root):

```json
{
  "mcpServers": {
    "codegraph": {
      "command": "/absolute/path/to/codegraph-mcp/target/release/codegraph",
      "type": "stdio"
    }
  }
}
```

Replace `/absolute/path/to/codegraph-mcp` with wherever you cloned the repo.

3. First run

When you start a session in any git repo, Codegraph will:

  1. Auto-detect the project root (walks up looking for `.git/`)
  2. Create a `.codegraph/` directory with a default `config.toml`
  3. Wait for you to index — run `index_project(full: true)` to build the code graph

After the initial index, subsequent sessions only need `index_project()` (incremental — skips unchanged files), or nothing at all if you haven't changed code.
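The root detection in step 1 can be sketched as a walk up the directory tree. This is a simplified, std-only version; the actual `config.rs` logic may differ:

```rust
use std::path::{Path, PathBuf};

/// Walk upward from `start` until a directory containing `.git/` is found.
/// Returns `None` if the filesystem root is reached without a match.
fn find_project_root(start: &Path) -> Option<PathBuf> {
    let mut dir = start;
    loop {
        if dir.join(".git").is_dir() {
            return Some(dir.to_path_buf());
        }
        dir = dir.parent()?; // step up; `None` at the filesystem root
    }
}

fn main() {
    // Resolves to the enclosing repo root when run inside a git checkout.
    println!("{:?}", find_project_root(Path::new(".")));
}
```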

4. Configuration (optional)

Edit `.codegraph/config.toml` to customize:

```toml
[indexing]
exclude = ["node_modules", "target", ".git", "dist", "build", "__pycache__"]
max_file_size = 1048576  # 1 MiB

[learning]
decay_half_life = 90  # days

[cross_language]
enabled = true
```
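The `decay_half_life` setting suggests standard exponential half-life decay of pattern confidence. Assuming that formula (the actual `confidence.rs` implementation isn't shown here), a 0.8-confidence pattern falls to 0.4 after 90 days:

```rust
/// Exponential time decay: a pattern recorded `age_days` ago retains
/// confidence * 0.5^(age_days / half_life_days) of its original score.
fn decayed_confidence(confidence: f64, age_days: f64, half_life_days: f64) -> f64 {
    confidence * 0.5f64.powf(age_days / half_life_days)
}

fn main() {
    // With the default 90-day half-life: one half-life, then two.
    println!("{}", decayed_confidence(0.8, 90.0, 90.0));
    println!("{}", decayed_confidence(0.8, 180.0, 90.0));
}
```

At the default 90-day half-life, a year-old pattern keeps roughly 6% of its original confidence.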

Running tests

```sh
cargo test              # 87 tests
cargo bench             # criterion benchmarks
```

Benchmarks

Independently tested on a 111K-line Python codebase across 5 configurations: Vanilla (grep/read), Serena (LSP), Codegraph, Serena+Codegraph, and Codegraph with bad usage patterns. All numbers from actual measured runs.

Multi-Step Refactoring Task (The Real Test)

A realistic 10-step investigation task (read method, find callers, check tests, understand interfaces, produce refactoring plan). This is where per-query savings compound.

| Rank | Config | Tokens | vs Vanilla |
|------|--------|--------|------------|
| 1 | CG-only (compact) | 36,790 | -23% |
| 2 | S+CG | 45,424 | -5% |
| 3 | Vanilla | 47,693 | baseline |
| 4 | Serena-only | 76,228 | +60% |
| 5 | CG-only (bad usage) | 78,051 | +64% |

Same accuracy across all five configs. Pure cost difference. Compact mode is the default — the agent gets overviews first and requests source for specific symbols only when needed.

Per-Query Token Savings

| Query | Grep/Read | Codegraph | Savings |
|-------|-----------|-----------|---------|
| "Who uses Diagnosis?" | 42,478 tokens | 151 tokens | 99.6% (281x) |
| "What does bootstrap() call?" | 9,431 tokens | 366 tokens | 96% (26x) |
| "What's in diagnosis.py?" | 14,503 tokens | 1,382 tokens | 90% (10x) |
| "Resume after compaction" | ~20,000 tokens | 95 tokens | 99.5% (210x) |

Codegraph's cost scales with answer size, not file size — the gap widens on larger codebases.
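The bash-compression tools apply the same principle to command output instead of code. A toy sketch of collapsing `git status --porcelain` into a one-line summary (the status-code handling here is deliberately simplified; the real `compress/git.rs` is more thorough):

```rust
/// Collapse `git status --porcelain` output into a one-line summary.
/// Simplified status codes: "??" is untracked, any code containing 'A'
/// counts as added, everything else as modified.
fn compress_git_status(porcelain: &str) -> String {
    let (mut modified, mut added, mut untracked) = (0, 0, 0);
    for line in porcelain.lines() {
        match line.get(..2) {
            Some("??") => untracked += 1,
            Some(code) if code.contains('A') => added += 1,
            Some(_) => modified += 1,
            None => {} // line shorter than a status code; skip
        }
    }
    format!("{modified} modified, {added} added, {untracked} untracked")
}

fn main() {
    let raw = " M src/main.rs\n M src/lib.rs\nA  src/new.rs\n?? notes.txt";
    println!("{}", compress_git_status(raw));
}
```

A summary line like this costs a handful of tokens regardless of how many files the working tree touches.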

Learning System (10-Session Evolution Test)

10 sequential sessions with overlapping themes. Sessions 9-10 deliberately overlapped with earlier sessions to test knowledge compounding.

  • Approach generation: Working. `suggest_approach` returned increasingly specific strategies across sessions. By sessions 9-10, it explicitly synthesized learnings from earlier sessions ("Leverage Session 1 + 2 + 4. Search for redis.RedisError catches.").

  • Pattern retrieval: Working. `recall_patterns` surfaces relevant patterns scoped by file paths and tags, with confidence scoring and time decay.

  • Failure recall: Working. `recall_failures` always includes critical-severity failures and filters others by scope relevance.

See BENCHMARK.md for full methodology, all 5 configs, per-task breakdowns, and accuracy analysis.

What I Learned

This was a personal project to explore the design space of "AI agent memory." Some takeaways:

  • Compact mode is everything. Codegraph with `compact=true` (the default) is 23% cheaper and 32% faster than vanilla grep/read. With `include_source=true` on whole files, it's 64% worse. The value is in structured overviews, not dumping source through an extra protocol layer.
  • The code graph's value compounds over a session. On isolated single queries, fixed agent overhead buries the savings. On multi-step tasks with 18+ queries, per-query savings (26-281x) dominate.
  • Session memory is genuinely helpful for long-running tasks that hit context limits, though its value shrinks as context windows grow.
  • The learning system compounds. Approach generation noticeably improves over sessions as patterns accumulate. The "When X, do Y because Z" format produces actionable, reusable knowledge.
  • MCP as a protocol works well for this kind of tool. Stdio transport is simple, the tool/resource model is clean, and lazy initialization (waiting for the client's initialize handshake to resolve the project root) was the right call.

License

MIT
