feat(mcp): add query_graph tool with 8 intention-named relationship patterns #145

Merged
ArtemisMucaj merged 3 commits into main from claude/add-query-graph-tool-bopUa on Apr 13, 2026

@ArtemisMucaj (Owner) commented Apr 13, 2026

Adds a new MCP tool that exposes call-graph queries via a clean, vocabulary-driven
API. Instead of receiving all relationship kinds at once (as get_symbol_context
does), callers specify exactly one pattern and get deduplicated results for only
that relationship type.

Supported patterns:

  • callers_of / callees_of — direct call graph traversal
  • imports_of / importers_of — Import-kind edges only
  • inheritors_of / children_of — Inheritance + Implementation edges
  • tests_for — callers whose symbol or file path matches test/spec conventions
  • file_summary — all symbols referenced within a given file

Results are deduplicated by symbol name (first occurrence wins) and capped at
50 nodes by default (max 500). No new infrastructure: all patterns dispatch
directly to existing CallGraphUseCase methods with the appropriate
reference_kind filter.
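
The deduplication and capping rules described above can be sketched as follows. This is a minimal illustration, not the PR's code: `Node`, `DEFAULT_QUERY_LIMIT`, `effective_limit`, and `dedup_and_cap` are hypothetical names, and modelling the limit as an `Option<usize>` is an assumption. Only the values 50 and 500 and the first-occurrence-wins rule come from the description.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the tool's result node; only two fields are modelled.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    symbol: String,
    file_path: String,
}

const DEFAULT_QUERY_LIMIT: usize = 50; // default stated in the description
const MAX_QUERY_LIMIT: usize = 500; // hard cap stated in the description

/// Resolve the requested limit: fall back to the default, never exceed the cap.
fn effective_limit(requested: Option<usize>) -> usize {
    requested.unwrap_or(DEFAULT_QUERY_LIMIT).min(MAX_QUERY_LIMIT)
}

/// Keep the first node seen for each symbol name, then cap the result.
fn dedup_and_cap(nodes: Vec<Node>, limit: usize) -> Vec<Node> {
    let mut seen = HashSet::new();
    nodes
        .into_iter()
        // `insert` returns false for a symbol already seen, so later duplicates are dropped.
        .filter(|n| seen.insert(n.symbol.clone()))
        .take(limit)
        .collect()
}
```

Deduplicating before `take` means the cap counts unique symbols rather than raw rows.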

https://claude.ai/code/session_01GoKejqoqkgxiug48Hp5LE8

Summary by CodeRabbit

Release Notes

  • New Features
    • Added query_graph capability for querying and analyzing code relationships and patterns.
    • Filter queries by pattern, target, and optional repository identifier.
    • Server-side query limits ensure reliable performance.
    • Returns deduplicated results across multiple relationship types.

@coderabbitai (Contributor, Bot) commented Apr 13, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4618fa6b-e247-46a1-a1bc-ddd5e2ba6645

📥 Commits

Reviewing files that changed from the base of the PR and between 5588471 and f586761.

📒 Files selected for processing (1)
  • src/connector/adapter/mcp/server.rs
📝 Walkthrough

Walkthrough

A new MCP tool query_graph was added to the server, accepting query parameters (pattern, target, repository_id, limit) and returning deduplicated graph query results. The implementation includes server-side query limits, pattern-based dispatch to different graph query methods, and deduplication using a HashSet keyed by symbol name.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Query Graph MCP Tool: src/connector/adapter/mcp/server.rs | Added new query_graph tool with input struct QueryGraphInput containing pattern, target, optional repository_id, and limit fields. Introduced GraphQueryNode and GraphQueryResult output structs. Implemented pattern-based dispatch logic routing to specific CallGraphQuery methods (callers, callees, imports via reference_kind, inheritance/implementation via reference_kind, tests with heuristic filtering, file_summary via file lookup). Added HashSet-based deduplication logic keyed by symbol name. Included server-side MAX_QUERY_LIMIT enforcement and updated serde imports to include Serialize. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant MCPServer as MCP Server
    participant QueryDispatcher as Query Dispatcher
    participant GraphQuery as CallGraphQuery
    participant Deduplicator as Deduplication<br/>(HashSet)

    Client->>MCPServer: query_graph(pattern, target, limit)
    MCPServer->>MCPServer: Validate limit against MAX_QUERY_LIMIT
    MCPServer->>QueryDispatcher: Dispatch based on pattern<br/>(callers/callees/imports/inheritance/tests/file_summary)
    QueryDispatcher->>GraphQuery: Call appropriate query method<br/>(e.g., get_callers, get_callees)
    GraphQuery-->>QueryDispatcher: Return raw nodes
    QueryDispatcher->>Deduplicator: Insert nodes into HashSet<br/>keyed by symbol
    Deduplicator->>Deduplicator: Skip duplicate symbols
    Deduplicator-->>MCPServer: Deduplicated nodes
    MCPServer->>MCPServer: Build GraphQueryResult
    MCPServer-->>Client: Return GraphQueryResult<br/>(pattern, target, nodes, total)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • feat: expose search as mcp server #38: Modifies the same MCP server file to add new tool endpoints and public input/output types for the search_code functionality, establishing a pattern for expanding MCP server capabilities.


🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a new MCP tool called query_graph with 8 relationship patterns. It is concise, specific, and directly reflects the primary objective of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai (Contributor, Bot) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/connector/adapter/mcp/server.rs (3)

432-442: Clarify the fallback behavior when caller_symbol is None.

When caller_symbol() returns None, the code falls back to using caller_file_path() as the symbol (line 435). This means the symbol field in GraphQueryNode could contain a file path rather than a symbol name, which may be unexpected for consumers.

Consider either documenting this behavior in the struct's doc comments, or filtering out references without a caller symbol when use_caller is true.
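
The second option (filtering out references that have no caller symbol) might look roughly like the sketch below. `Reference` here is a simplified, hypothetical stand-in for the real reference type; only the accessor name `caller_symbol()` is taken from the comment above, and `caller_symbols` is an illustrative helper.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified reference record.
struct Reference {
    caller_symbol: Option<String>,
    /// Present on the record but deliberately not used as a fallback symbol.
    #[allow(dead_code)]
    caller_file_path: String,
}

impl Reference {
    fn caller_symbol(&self) -> Option<&str> {
        self.caller_symbol.as_deref()
    }
}

/// Collect unique caller symbols, dropping references without a caller
/// symbol instead of substituting the file path.
fn caller_symbols(refs: &[Reference]) -> Vec<String> {
    let mut seen = HashSet::new();
    refs.iter()
        .filter_map(|r| {
            // `?` returns None from the closure, so the reference is skipped.
            let symbol = r.caller_symbol()?;
            // Emit the symbol only the first time it is seen.
            seen.insert(symbol.to_string()).then(|| symbol.to_string())
        })
        .collect()
}
```

With this shape the `symbol` field of each result always holds a symbol name, at the cost of silently omitting references made from outside any named symbol.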

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/connector/adapter/mcp/server.rs` around lines 432 - 442, The code
currently falls back to caller_file_path() when caller_symbol() is None which
can put file paths into the GraphQueryNode.symbol field; change the filter in
the iterator (the closure using use_caller, caller_symbol(), caller_file_path(),
and seen.insert(...)) so that when use_caller is true you only accept entries
that have a Some(caller_symbol) (i.e., treat None as filtered out) instead of
falling back to caller_file_path(), and return None for those items so
GraphQueryNode always receives a real symbol; update any doc comment on
GraphQueryNode to note this stricter behavior if needed.

349-365: inheritors_of and children_of may fetch up to twice the requested limit.

Both q_inh and q_imp queries have the full limit applied (line 305), meaning the combined results could be up to 2 × limit rows before deduplication. While take(limit) at line 451 ensures the final output is capped, this could be inefficient for large limits.

Consider halving the per-query limit or implementing an OR filter for reference_kind at the repository level if this becomes a performance concern.
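
A rough sketch of the halving option, using plain `Vec<String>` results in place of the real query types. `per_query_limit` and `merge_capped` are hypothetical helper names; the arithmetic is the usual round-up division by two.

```rust
/// Split a requested limit across two sub-queries, rounding up so an odd
/// limit is not under-served. Equivalent to (limit + 1) / 2, but written so
/// that it cannot overflow for very large limits.
fn per_query_limit(limit: usize) -> usize {
    limit / 2 + limit % 2
}

/// Merge the inheritance and implementation results, then apply the final cap.
fn merge_capped(
    inheritance: Vec<String>,
    implementation: Vec<String>,
    limit: usize,
) -> Vec<String> {
    inheritance
        .into_iter()
        .chain(implementation)
        .take(limit)
        .collect()
}
```

One trade-off worth noting: if one edge kind has few matches and the other has many, halving can return fewer than `limit` rows even though more exist. An OR filter at the repository level would avoid that.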

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/connector/adapter/mcp/server.rs` around lines 349 - 365, The current
handling for the "inheritors_of" (and similarly "children_of") branch builds two
queries q_inh and q_imp from base_query.clone().with_reference_kind(...) each
using the full requested limit, which can yield up to 2×limit rows before
deduplication; adjust by halving the per-query limit (e.g., compute per_limit =
(limit + 1) / 2 and apply it to both q_inh and q_imp) or, better, modify the
repository/query layer to accept an OR filter on reference_kind so a single
use_case.find_callers call returns combined results with the requested limit;
update the code paths that create q_inh/q_imp and calls to use_case.find_callers
accordingly (symbols: inheritors_of, children_of, q_inh, q_imp,
base_query.with_reference_kind, use_case.find_callers, and the final take(limit)
logic).

104-122: Consider using an enum for pattern for better type safety and documentation.

Using a String requires runtime validation and produces less descriptive JSON schemas. An enum would catch invalid patterns at deserialization time and self-document the API.

♻️ Proposed refactor using an enum
#[derive(Debug, Deserialize, JsonSchema)]
#[serde(rename_all = "snake_case")]
pub enum QueryPattern {
    CallersOf,
    CalleesOf,
    ImportsOf,
    ImportersOf,
    InheritorsOf,
    ChildrenOf,
    TestsFor,
    FileSummary,
}

#[derive(Debug, Deserialize, JsonSchema)]
pub struct QueryGraphInput {
    /// Relationship pattern to query.
    pub pattern: QueryPattern,
    // ... rest unchanged
}
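
To illustrate what the enum buys, here is a std-only sketch that spells out by hand the string mapping that `#[serde(rename_all = "snake_case")]` would generate, plus an exhaustive `match` with no catch-all arm. The `FromStr` impl and the `describe` function are illustrative assumptions (the descriptions are copied from the PR summary); the actual code would rely on the serde derive instead.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QueryPattern {
    CallersOf,
    CalleesOf,
    ImportsOf,
    ImportersOf,
    InheritorsOf,
    ChildrenOf,
    TestsFor,
    FileSummary,
}

/// Hand-written equivalent of the snake_case names serde would accept.
impl FromStr for QueryPattern {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "callers_of" => Ok(Self::CallersOf),
            "callees_of" => Ok(Self::CalleesOf),
            "imports_of" => Ok(Self::ImportsOf),
            "importers_of" => Ok(Self::ImportersOf),
            "inheritors_of" => Ok(Self::InheritorsOf),
            "children_of" => Ok(Self::ChildrenOf),
            "tests_for" => Ok(Self::TestsFor),
            "file_summary" => Ok(Self::FileSummary),
            other => Err(format!("unknown pattern: {other}")),
        }
    }
}

/// Exhaustive match: adding a variant without handling it here is a compile error.
fn describe(pattern: QueryPattern) -> &'static str {
    match pattern {
        QueryPattern::CallersOf | QueryPattern::CalleesOf => "direct call graph traversal",
        QueryPattern::ImportsOf | QueryPattern::ImportersOf => "Import-kind edges only",
        QueryPattern::InheritorsOf | QueryPattern::ChildrenOf => {
            "Inheritance + Implementation edges"
        }
        QueryPattern::TestsFor => "callers matching test/spec conventions",
        QueryPattern::FileSummary => "all symbols referenced within a given file",
    }
}
```

Invalid input is rejected once, at the parsing boundary, so the dispatch code never needs an "unknown pattern" branch.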
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/connector/adapter/mcp/server.rs` around lines 104 - 122, The pattern
field on QueryGraphInput is a plain String causing runtime validation and weak
JSON schema; replace it with a typed enum (e.g., QueryPattern) by declaring a
#[derive(Debug, Deserialize, JsonSchema)] enum QueryPattern with variants
CallersOf, CalleesOf, ImportsOf, ImportersOf, InheritorsOf, ChildrenOf,
TestsFor, FileSummary and annotate the enum with #[serde(rename_all =
"snake_case")] so deserialization accepts the current snake_case names; then
change QueryGraphInput::pattern to type QueryPattern, update the doc comment to
reflect the enum, and fix any call sites/tests that construct QueryGraphInput or
expect a string to use the enum or its string form.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/connector/adapter/mcp/server.rs`:
- Around line 392-404: The current filter closure over refs (using
caller_symbol() and reference_file_path()) uses file.contains("test") which
produces false positives; update the logic in that closure to parse the path
from reference_file_path() (use std::path::Path) and check path components and
the file stem instead of a raw substring: inspect each Path component for an
exact folder named "test" or test-like filenames by looking at the file
name/stem (e.g., starts_with("test_"), ends_with("_test"), or equals("test")) to
avoid matching substrings like "contest"; keep the existing symbol checks
(caller_symbol() patterns) and return the filtered Vec as before.
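
A component-based check along the lines suggested above could be sketched as follows. `looks_like_test_path` is a hypothetical helper name, and the exact set of accepted directory names and stem patterns is an assumption drawn from the review comment and the follow-up commit message; the symbol-name checks are left out.

```rust
use std::path::{Component, Path};

/// Heuristic: does this path look like it belongs to a test or spec file?
fn looks_like_test_path(file: &str) -> bool {
    let path = Path::new(file);

    // Exact directory-name match, so "contest/" does not count as "test".
    let in_test_dir = path.parent().map_or(false, |dir| {
        dir.components().any(|c| match c {
            Component::Normal(name) => {
                matches!(name.to_str(), Some("test" | "tests" | "spec" | "specs"))
            }
            _ => false,
        })
    });

    // File stem check: "test", "test_*", or "*_test" (extension ignored).
    let stem_is_test = path
        .file_stem()
        .and_then(|s| s.to_str())
        .map_or(false, |stem| {
            stem == "test" || stem.starts_with("test_") || stem.ends_with("_test")
        });

    in_test_dir || stem_is_test
}
```

With this check, `src/contest.rs` and `src/inspect.rs` are no longer treated as test files, while `tests/integration.rs` and `src/parser_test.rs` still are.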

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99847f0a-8953-4828-a585-f8672fb451af

📥 Commits

Reviewing files that changed from the base of the PR and between 7715eab and 5588471.

📒 Files selected for processing (1)
  • src/connector/adapter/mcp/server.rs

Comment thread src/connector/adapter/mcp/server.rs
claude added 2 commits April 13, 2026 21:36
Make limit an Option<usize>: omit it to receive all results, or set it
to bound the response. Removes the MAX_QUERY_LIMIT=500 cap and the
default_query_limit=50 default — the caller decides what it needs.

https://claude.ai/code/session_01GoKejqoqkgxiug48Hp5LE8
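
As a small illustration of the optional limit described in the commit message above, assuming results are collected into a `Vec` before the cap is applied (`apply_limit` is a hypothetical helper, not a function from the PR):

```rust
/// Apply an optional cap: `None` returns everything, `Some(n)` keeps the first `n`.
fn apply_limit<T>(items: Vec<T>, limit: Option<usize>) -> Vec<T> {
    match limit {
        Some(n) => items.into_iter().take(n).collect(),
        None => items,
    }
}
```

A limit larger than the result set is harmless, since `take` simply stops when the input is exhausted.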
- Replace QueryGraphInput::pattern: String with a typed QueryPattern enum
  (derive Deserialize/Serialize/JsonSchema, serde rename_all = snake_case)
  so invalid patterns are rejected at deserialization instead of at runtime;
  remove the unknown catch-all arm from the match.

- Fix tests_for path heuristic: use std::path::Path to inspect individual
  components (exact folder names "test"/"tests"/"spec"/"specs") and the file
  stem instead of a raw substring match, avoiding false positives like
  "contest.rs" or "inspect.rs".

- Drop the caller_file_path() fallback in the dedup iterator: when use_caller
  is true, entries with no caller_symbol are now filtered out (return None via
  ?) so GraphQueryNode.symbol always holds a real symbol, never a file path.

- Halve the per-query limit for inheritors_of and children_of: each of the two
  sub-queries (inheritance + implementation) now receives (limit+1)/2 rows so
  the combined pre-dedup result stays within the requested bound.

https://claude.ai/code/session_01GoKejqoqkgxiug48Hp5LE8
@ArtemisMucaj ArtemisMucaj merged commit 3d634e7 into main Apr 13, 2026
2 checks passed