feat: add get_relevant_context tool — token-budget-aware context selection #18

@bntvllnt

Summary

Add a get_relevant_context MCP tool that takes a task description and a token budget, then returns the highest-value subgraph of files to read within that budget. Agents waste 60-70% of their context on irrelevant files, and our graph already knows which files matter.

Motivation

  • vexp reports a 65-74% token reduction from using AST-level subgraphs instead of grep
  • codebase-memory-mcp reports 99.2% fewer tokens (3,400 vs 412,000) for structural queries
  • Context engineering = "finding the smallest possible set of high-signal tokens" (Martin Fowler)
  • Our graph already computes PageRank, so we can rank files by importance and serve the best subset that fits within a token budget

Proposed API

tool: get_relevant_context
input: {
  task: string,           // "refactor auth module" or "fix login bug"
  tokenBudget?: number,   // max tokens to return (default: 8000)
  scope?: string          // module or directory to focus on
}
output: {
  files: Array<{ path, relevanceScore, pageRank, summary }>,
  totalTokens: number,
  coverage: string        // "covers 85% of related dependency graph"
}
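
To make the proposed shape concrete, here is a minimal TypeScript sketch of the input and output types, plus a small helper that applies the default budget of 8000. The field types inside `files`, the `NormalizedInput` type, and the `normalizeInput` helper are assumptions added for illustration; the proposal itself only names the fields.

```typescript
// Hypothetical types mirroring the proposed API. Field types are assumed.
interface FileContext {
  path: string;
  relevanceScore: number; // assumed: task-relevance score from search
  pageRank: number;       // assumed: PageRank of the file in the graph
  summary: string;        // assumed: exports, imports, metrics as text
}

interface GetRelevantContextInput {
  task: string;         // e.g. "refactor auth module"
  tokenBudget?: number; // max tokens to return (default: 8000)
  scope?: string;       // module or directory to focus on
}

interface GetRelevantContextOutput {
  files: FileContext[];
  totalTokens: number;
  coverage: string;     // e.g. "covers 85% of related dependency graph"
}

// Hypothetical: the input after defaults have been applied.
interface NormalizedInput {
  task: string;
  tokenBudget: number;
  scope?: string;
}

const DEFAULT_TOKEN_BUDGET = 8000;

// Hypothetical helper: reject empty tasks and bad budgets, fill the default.
function normalizeInput(input: GetRelevantContextInput): NormalizedInput {
  const task = input.task.trim();
  if (task.length === 0) {
    throw new Error("task must be a non-empty string");
  }
  const tokenBudget = input.tokenBudget ?? DEFAULT_TOKEN_BUDGET;
  if (!Number.isFinite(tokenBudget) || tokenBudget <= 0) {
    throw new Error("tokenBudget must be a positive number");
  }
  const normalized: NormalizedInput = { task, tokenBudget };
  if (input.scope !== undefined) {
    normalized.scope = input.scope;
  }
  return normalized;
}
```

Validating the budget up front keeps the selection step free of edge cases such as a zero or negative budget.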

Approach

  1. Use BM25 search to find task-relevant files
  2. Expand via dependency graph (dependents + dependencies)
  3. Rank by PageRank + relevance score
  4. Truncate to fit token budget (highest-value files first)
  5. Return file summaries (exports, imports, metrics), not full contents
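
Steps 2 to 4 above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the `FileNode` shape, the one-hop expansion, the `neighborDecay` and `relevanceWeight` parameters, the weighted-sum scoring, and the way `coverage` is computed are all hypothetical. BM25 scores from step 1 are taken as an input, assumed to be normalized to [0, 1].

```typescript
// Hypothetical node shape; the real graph model may differ.
interface FileNode {
  pageRank: number;      // assumed normalized to [0, 1]
  summaryTokens: number; // assumed precomputed token count of the summary
  neighbors: string[];   // dependencies and dependents, combined
}

interface RankedFile {
  path: string;
  relevanceScore: number;
  pageRank: number;
  score: number;
  tokens: number;
}

function selectContext(
  graph: Record<string, FileNode>,
  seedScores: Record<string, number>, // step 1 output: path -> BM25 score
  tokenBudget = 8000,
  relevanceWeight = 0.7, // assumed weighting between relevance and PageRank
  neighborDecay = 0.5,   // assumed: neighbors inherit half the seed's score
): { files: RankedFile[]; totalTokens: number; coverage: string } {
  // Step 2: expand one hop; each neighbor keeps its best inherited score.
  const relevance: Record<string, number> = {};
  for (const path of Object.keys(seedScores)) {
    relevance[path] = seedScores[path];
  }
  for (const path of Object.keys(seedScores)) {
    const node = graph[path];
    if (!node) continue;
    for (const neighbor of node.neighbors) {
      const inherited = seedScores[path] * neighborDecay;
      if (inherited > (relevance[neighbor] ?? 0)) {
        relevance[neighbor] = inherited;
      }
    }
  }

  // Step 3: rank by a weighted sum of relevance and PageRank.
  const candidates: RankedFile[] = [];
  for (const path of Object.keys(relevance)) {
    const node = graph[path];
    if (!node) continue;
    const relevanceScore = relevance[path];
    candidates.push({
      path,
      relevanceScore,
      pageRank: node.pageRank,
      score: relevanceWeight * relevanceScore + (1 - relevanceWeight) * node.pageRank,
      tokens: node.summaryTokens,
    });
  }
  candidates.sort((a, b) => b.score - a.score);

  // Step 4: greedy fill, highest score first; skip files that do not fit.
  const files: RankedFile[] = [];
  let totalTokens = 0;
  for (const candidate of candidates) {
    if (totalTokens + candidate.tokens > tokenBudget) continue;
    files.push(candidate);
    totalTokens += candidate.tokens;
  }

  // Assumed coverage metric: share of candidate files that fit the budget.
  const pct = candidates.length === 0
    ? 0
    : Math.round((files.length / candidates.length) * 100);
  return { files, totalTokens, coverage: `covers ${pct}% of related dependency graph` };
}
```

The greedy loop skips a file that does not fit rather than stopping, so a smaller lower-ranked file can still use the remaining budget. Stopping at the first overflow would be a stricter reading of "truncate" and is an equally plausible choice.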

Priority

Immediate — Token efficiency is the #1 competitive differentiator for AI agent tooling.

Labels

enhancement: New feature or request
