Skip to content

Egor/dev 1265 improve model introspection and help tool in mcp server#36

Merged
ZmeiGorynych merged 11 commits intomainfrom
egor/dev-1265-improve-model-introspection-and-help-tool-in-mcp-server
Apr 17, 2026
Merged

Egor/dev 1265 improve model introspection and help tool in mcp server#36
ZmeiGorynych merged 11 commits intomainfrom
egor/dev-1265-improve-model-introspection-and-help-tool-in-mcp-server

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented Apr 16, 2026

Summary by CodeRabbit

  • New Features

    • Interactive conceptual help (CLI and MCP) with discoverable topics; MCP adds models_summary(datasource_name) and enhanced inspect_model(...) with format options and sample-data previews; describe_datasource now lists tables by default.
  • Bug Fixes

    • Validation now rejects numeric-only aggregations on string-backed measures with clear errors.
  • Documentation

    • Added API gaps page, extensive conceptual help topics, updated MCP/CLI docs, examples, notebook, and README/workflows.
  • Tests

    • New unit and integration tests covering help, MCP discovery/inspection, sampling, and aggregation validation.

ZmeiGorynych and others added 3 commits April 16, 2026 17:31
Introduce a conceptual-help surface that complements the schema-focused
tool docstrings: the ten things an agent needs to know that aren't in a
parameter description — colon syntax, the self-join vs window-function
transform trade-off, cross-model measure sub-queries, the three meanings
of "last", auto-routing of filters to WHERE/HAVING/post-filter, etc.

Topics: queries, formulas, aggregations, transforms, time, filters,
joins, models, extending, workflow. Content lives as one .md per topic
in slayer/help/topics/ and is discovered dynamically at module load —
adding a new topic is just dropping a new NN_name.md file in, no Python
changes. The NN_ prefix controls teaching order.

MCP: new help(topic=None) tool. The topic list is embedded in the tool
description so agents see it in the tool manifest without calling.
The FastMCP instructions string now points at help() upfront.

CLI: new `slayer help [TOPIC]` subcommand mirroring the MCP behaviour;
argparse's built-in -h/--help is unchanged.

Tests statically validate every ```json snippet in every help body —
SlayerQuery.model_validate for structure, parse_formula for field
formulas, and the engine's extract_filter_transforms + parse_filter
pipeline for filter strings. This caught two real doc-drift issues
(colon syntax in order.column, missing transform names in filter
pipeline) that are fixed in the help content here.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 16, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a file-backed, topic-driven conceptual help system surfaced via CLI (slayer help) and MCP (help()); replaces MCP datasource_summary with models_summary, refactors inspect_model to emit rich markdown/JSON with sampling and profiling, enforces numeric-only aggregation checks, and adds tests, docs, and an MCP introspection notebook.

Changes

Cohort / File(s) Summary
Help infrastructure
slayer/help/__init__.py, slayer/help/topics/*
New package that discovers topics/*.md and exposes render_help(), available_topic_names(), TOPIC_SUMMARY_LINE, INTRO_KEY, plus ~11 topic files (00_intro..10_workflow).
CLI
slayer/cli.py
Added help subcommand and _run_help() wiring to render help topics (uses TOPIC_SUMMARY_LINE).
MCP server & tools
slayer/mcp/server.py
Added help(topic=None) tool; removed datasource_summary and list_tables; added models_summary(datasource_name, format); updated describe_datasource(name, list_tables=True, schema_name=""); rewrote inspect_model to support `format=markdown
Aggregation validation
slayer/core/enums.py, slayer/engine/enrichment.py
Added NUMERIC_ONLY_AGGREGATIONS and validation in _ensure_aggregated_measure() to reject numeric-only aggregations on string-backed measures.
Tests
tests/test_help.py, tests/test_mcp_server.py, tests/integration/test_mcp_inspect.py
New unit/integration tests for help discovery/rendering, models_summary/inspect_model markdown behavior, dimension profiling, sample-query logic, reachable-field traversal, and numeric-aggregation validation.
Documentation & examples
docs/*, README.md, .claude/skills/*, docs/examples/.../mcp_introspect_nb.ipynb
New docs/api_gaps.md; updated CLI/MCP docs to reflect help, models_summary, describe_datasource params, and inspect_model output; added MCP introspection notebook and multiple help topic docs.
Minor edits
docs/interfaces/cli.md, docs/interfaces/mcp.md, docs/reference/mcp.md, docs/concepts/models.md, docs/getting-started/mcp.md
Documentation alignment for renamed/updated MCP tools and help topics; small content/format adjustments.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant MCP_Server
  participant Storage
  participant Engine
  participant Database

  Client->>MCP_Server: call inspect_model(model_name, num_rows?, format?)
  MCP_Server->>Storage: load model metadata (model_name)
  MCP_Server->>Engine: build sample query args (measures, agg choices)
  Engine->>Database: execute sample & profile queries
  Database-->>Engine: rows / errors
  Engine-->>MCP_Server: sample rows, dim profiles, row_count
  MCP_Server->>MCP_Server: render markdown or JSON (tables, pruned columns)
  MCP_Server-->>Client: return rendered document or JSON
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • AivanF

Poem

🐰 I hopped through docs and topics bright,

CLI and MCP now guide the night.
Models summarized, samples in view,
Numeric checks keep counts true.
A rabbit's cheer — help blooms anew!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.88% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objective: improving model introspection and the help tool in the MCP server, which is reflected throughout the comprehensive changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/dev-1265-improve-model-introspection-and-help-tool-in-mcp-server

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (2)
slayer/cli.py (1)

249-249: Move new slayer.help imports to module top-level.

The new local imports work, but they violate the repo Python import placement rule.

Suggested refactor
 import argparse
 import os
 import sys
 
 from slayer.async_utils import run_sync
+from slayer.help import TOPIC_SUMMARY_LINE, render_help
@@
-    from slayer.help import TOPIC_SUMMARY_LINE
-
     help_parser = subparsers.add_parser(
@@
 def _run_help(args):
-    from slayer.help import render_help
-
     print(render_help(topic=args.topic))

As per coding guidelines "**/*.py: Place imports at the top of files".

Also applies to: 292-292

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/cli.py` at line 249, The new local imports from slayer.help (e.g.
TOPIC_SUMMARY_LINE) should be moved to the module top-level alongside the other
imports instead of being imported inside functions; locate the places where
TOPIC_SUMMARY_LINE (and any other slayer.help names added) are imported locally
and remove those in-function imports, add a single top-level "from slayer.help
import TOPIC_SUMMARY_LINE" with the other imports, and update any references to
use that top-level symbol.
slayer/help/__init__.py (1)

66-73: Guard against silent topic-key collisions during discovery.

If two files normalize to the same key, later entries overwrite earlier ones silently. Please fail fast on duplicate derived keys (and duplicate intro) to avoid hidden docs regressions.

Suggested hardening
         key = _strip_numeric_prefix(stem)
         body = entry.read_text(encoding="utf-8").rstrip() + "\n"
         if key == INTRO_KEY:
+            if intro_body:
+                raise ValueError("Duplicate intro topic detected in slayer/help/topics")
             intro_body = body
         else:
+            if key in bodies:
+                raise ValueError(f"Duplicate help topic key detected: '{key}'")
             bodies[key] = body
             order.append(key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/help/__init__.py` around lines 66 - 73, The discovery loop currently
lets later files overwrite earlier ones when _strip_numeric_prefix(stem) yields
duplicate keys (and also allows duplicate INTRO_KEY to be overwritten); modify
the logic in the block that computes key and body (the code using
_strip_numeric_prefix, INTRO_KEY, intro_body, bodies, order) to detect
duplicates and fail fast: if key == INTRO_KEY and intro_body is already set,
raise an error (e.g., ValueError) naming the duplicate entry; if key !=
INTRO_KEY and key already exists in bodies, raise an error indicating the
duplicate derived key and the conflicting filename(s) instead of overwriting;
ensure the raised error includes the stem or entry path so callers can locate
the offending files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/api_gaps.md`:
- Around line 17-25: The feature matrix in docs/api_gaps.md incorrectly flags
CLI support for several actions; update the table rows for "Create datasource",
"Delete datasource", "Test datasource connection", "Ingest: include specific
tables", and "Ingest: exclude specific tables" to mark CLI as supported (change
the CLI column from — to Y) and adjust any notes (e.g., keep "Test datasource
connection" note that it's included in `describe_datasource`, and preserve the
REST-only note for `exclude_tables` if still applicable) so the table accurately
reflects existing CLI commands/flags.

In `@slayer/help/topics/03_aggregations.md`:
- Around line 64-74: The documentation is inconsistent about the
custom-aggregation placeholder: the prose says `{expr}` but the example YAML
uses `{value}`; pick one and make both match. Update the narrative sentence that
currently mentions `{expr}` and/or the `formula:` line in the `aggregations`
snippet for `trimmed_mean` so they both use the same placeholder (e.g., change
`{value}` to `{expr}` throughout or vice versa) and ensure the `params:` section
still references the correct parameter names (`lo`, `hi`) accordingly.

In `@slayer/help/topics/05_time.md`:
- Around line 59-62: The documentation incorrectly states that
main_time_dimension is always highest priority; update the bullet list to
reflect actual behavior: when a query has a single entry in time_dimensions that
dimension is used automatically for time-based functions; when the query has 2+
time_dimensions, main_time_dimension is used to disambiguate (or the model's
default_time_dimension if that default is present among the query's
time_dimensions); if the query has no time_dimensions at all, fall back to the
model's default_time_dimension. Reference the symbols main_time_dimension,
time_dimensions, and default_time_dimension in the revised bullets.

In `@slayer/help/topics/10_workflow.md`:
- Around line 8-11: Add language identifiers (e.g., "text") to the fenced code
blocks that currently start with numbered lists such as the block containing "1.
datasource_summary() ... 2. inspect_model(...)" and the other blocks that start
with "1. create_datasource(...)" and "1. create_datasource(...,
auto_ingest=false) ..." (also update the similar blocks at the other occurrences
noted). Replace each opening triple backtick (```) with a language-tagged
backtick fence (e.g., ```text) so all fenced code blocks have a language
identifier and markdownlint MD040 is satisfied.
- Around line 30-35: The example for create_datasource uses the wrong parameter
name; change the call to pass driver="postgres" instead of type="postgres" so it
matches the MCP signature (update the create_datasource(...) example in this
file), and scan for other occurrences of type="..." in this document to replace
with driver="..." where the MCP create_datasource() or similar functions are
shown (ensure the parameter name matches the function signature).

In `@slayer/mcp/server.py`:
- Around line 574-589: The public function inspect_model currently returns
markdown and dropped the show_sql parameter, breaking existing JSON consumers;
restore backward compatibility by adding a format parameter (e.g., format: str =
"markdown" with allowed "json"|"markdown") and accept the deprecated show_sql
argument (e.g., show_sql: Optional[bool] = None) in inspect_model, implementing
logic that: if show_sql is provided, map it to format="json" for compatibility
(while logging/deprecating), otherwise honor format; keep the existing markdown
generation path and add a JSON serialization branch that produces the prior
machine-readable payload shape (reuse the same internal helpers that build model
metadata, dimensions, measures, joins, profiles, and sample rows) so callers
requesting "json" or passing show_sql continue to work.
- Around line 109-142: _build_sample_query_args currently picks any allowed
aggregation or falls back to "avg", which can pick aggregations that require
extra args or context; change it to only select zero-argument, context-free
aggs. Define a safe set of aggs (e.g.
{"avg","sum","min","max","count","count_distinct"}), and when processing each
measure use these rules: if m.allowed_aggregations is not None, pick the first
entry from m.allowed_aggregations that is in the safe set (skip the measure if
none match); if m.allowed_aggregations is None, consult dim_types: if
dim_types.get(m.name) is one of ("string","boolean","date","time") choose
"count_distinct", otherwise pick the first available numeric-safe agg from
["avg","sum","min","max"] (skip if none available); append only when an agg was
chosen to fields. This targets the logic inside _build_sample_query_args,
referencing model.measures, m.allowed_aggregations, dim_types and fields.

---

Nitpick comments:
In `@slayer/cli.py`:
- Line 249: The new local imports from slayer.help (e.g. TOPIC_SUMMARY_LINE)
should be moved to the module top-level alongside the other imports instead of
being imported inside functions; locate the places where TOPIC_SUMMARY_LINE (and
any other slayer.help names added) are imported locally and remove those
in-function imports, add a single top-level "from slayer.help import
TOPIC_SUMMARY_LINE" with the other imports, and update any references to use
that top-level symbol.

In `@slayer/help/__init__.py`:
- Around line 66-73: The discovery loop currently lets later files overwrite
earlier ones when _strip_numeric_prefix(stem) yields duplicate keys (and also
allows duplicate INTRO_KEY to be overwritten); modify the logic in the block
that computes key and body (the code using _strip_numeric_prefix, INTRO_KEY,
intro_body, bodies, order) to detect duplicates and fail fast: if key ==
INTRO_KEY and intro_body is already set, raise an error (e.g., ValueError)
naming the duplicate entry; if key != INTRO_KEY and key already exists in
bodies, raise an error indicating the duplicate derived key and the conflicting
filename(s) instead of overwriting; ensure the raised error includes the stem or
entry path so callers can locate the offending files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7a81759e-2561-4e55-8c8b-d239dbcd1027

📥 Commits

Reviewing files that changed from the base of the PR and between 984bf5c and 9179dc7.

📒 Files selected for processing (23)
  • docs/api_gaps.md
  • docs/interfaces/cli.md
  • docs/interfaces/mcp.md
  • docs/reference/mcp.md
  • slayer/cli.py
  • slayer/core/enums.py
  • slayer/engine/enrichment.py
  • slayer/help/__init__.py
  • slayer/help/topics/00_intro.md
  • slayer/help/topics/01_queries.md
  • slayer/help/topics/02_formulas.md
  • slayer/help/topics/03_aggregations.md
  • slayer/help/topics/04_transforms.md
  • slayer/help/topics/05_time.md
  • slayer/help/topics/06_filters.md
  • slayer/help/topics/07_joins.md
  • slayer/help/topics/08_models.md
  • slayer/help/topics/09_extending.md
  • slayer/help/topics/10_workflow.md
  • slayer/mcp/server.py
  • tests/integration/test_mcp_inspect.py
  • tests/test_help.py
  • tests/test_mcp_server.py

Comment thread docs/api_gaps.md Outdated
Comment thread slayer/help/topics/03_aggregations.md Outdated
Comment thread slayer/help/topics/05_time.md Outdated
Comment thread slayer/help/topics/10_workflow.md Outdated
Comment thread slayer/help/topics/10_workflow.md
Comment thread slayer/mcp/server.py
Comment thread slayer/mcp/server.py Outdated
describe_datasource absorbs list_tables — new `list_tables: bool = True`
and `schema_name: str = ""` args append the table listing to the existing
connection probe, so one call shows connection health + schemas + tables.
The standalone `list_tables` MCP tool is removed.

datasource_summary() is replaced by models_summary(datasource_name). The
old catalogue-everything JSON is gone; the new tool returns a brief
markdown overview of just the requested datasource's non-hidden models
(name, description, dim/measure name+description tables, list of joined
model names) — much lighter than inspect_model and a better picking tool.

inspect_model folds the per-dimension data profile into the Dimensions
table as a new `sampled` column (distinct values for string/boolean,
`min .. max` for number/date/time, `> 20 distinct` for overflow). The
standalone "Dimension profile (sampled)" section is removed.

_markdown_table auto-prunes columns whose cells are all empty/None and
collapses to a comma-separated backticked list when only one column
survives. On an auto-ingested order_items this shrinks the Measures table
from 6 columns to 3 and turns the models_summary dim/measure sections
into dense name lists rather than tables of em-dashes.

Docs + help topics + notebook intro/summary updated everywhere the old
tool names or profile section appeared; 08_mcp_introspect notebook
newly tracked with its current executed output reflecting all four
changes. 677 tests pass; ruff clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
slayer/mcp/server.py (1)

163-196: ⚠️ Potential issue | 🟡 Minor

Sample aggregation selection improved but still has edge cases.

The implementation now correctly handles string/boolean/date/time types by falling back to count_distinct. However, when allowed_aggregations is restricted and avg is not in the list, the code picks allowed[0] which could be a custom aggregation requiring parameters (like weighted_avg), causing the sample query to fail.

Consider filtering to only safe zero-arg aggregations:

Suggested improvement
+SAFE_SAMPLE_AGGS = {"avg", "sum", "count", "count_distinct", "min", "max"}
+
 def _build_sample_query_args(model: SlayerModel, num_rows: int) -> Dict[str, Any]:
     ...
     for m in model.measures:
         if m.hidden:
             continue
         allowed = m.allowed_aggregations
         if allowed is not None and "avg" not in allowed:
             if not allowed:
                 continue
-            agg = allowed[0]
+            safe = [a for a in allowed if a in SAFE_SAMPLE_AGGS]
+            if not safe:
+                continue
+            agg = safe[0]
         else:
             ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/mcp/server.py` around lines 163 - 196, The sample-aggregation logic in
_build_sample_query_args can pick a parameterized custom aggregation from
m.allowed_aggregations (e.g., "weighted_avg") which will break the generated
query; update the branch that handles a restricted allowed list to first filter
m.allowed_aggregations to a whitelist of safe zero-argument aggs (e.g.
{"count","count_distinct","sum","min","max","avg"}) and then pick the first
entry from that filtered list, and if none remain skip the measure; keep the
existing avg-vs-count_distinct type guard for the avg case in the other branch.
🧹 Nitpick comments (3)
tests/test_mcp_server.py (1)

315-318: Verify expected behavior for CRLF escaping.

The test expects \r\n to become " " (double space) because \r and \n are each replaced with a space independently. This appears intentional based on the _escape_md_cell implementation (line 109 in server.py: .replace("\r", " ").replace("\n", " ")), but the comment could clarify this produces double-space for CRLF sequences.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mcp_server.py` around lines 315 - 318, The test's expectation that
"\r\n" becomes two spaces should be clarified: update the _escape_md_cell
implementation's comment (near the .replace("\r", " ").replace("\n", " ") calls
inside function _escape_md_cell) or add an inline test comment explaining that
CRLF sequences produce a double space because CR and LF are replaced
independently; ensure the test case comment reflects this intended behavior so
reviewers understand the double-space is deliberate.
docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb (1)

51-63: Consider using a relative import pattern or package installation.

The sys.path.insert manipulation for importing setup_jaffle works but is fragile. If the notebook is moved or the directory structure changes, it will break silently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb` around lines 51 -
63, Replace the fragile sys.path.insert hack by importing setup_jaffle as a
proper package or using a relative-package import: ensure the module providing
ensure_jaffle_shop is installed or the notebook is run with the project package
on PYTHONPATH, then replace the sys.path manipulation and plain import of
setup_jaffle / ensure_jaffle_shop with a standard import (e.g., from
your_project.setup_jaffle import ensure_jaffle_shop) or install the package into
the environment so the call to ensure_jaffle_shop() and references to engine,
storage, models work reliably without modifying sys.path at runtime.
slayer/mcp/server.py (1)

300-387: Consider adding timeout or query limit for dimension profiling.

The _collect_dim_profile function runs one query per categorical dimension plus one batched query for numeric/temporal dimensions. On models with many dimensions or large tables, this could be slow.

The max_dims=10 cap helps, but consider adding a per-query timeout or row limit to prevent long-running profile queries from blocking inspect_model.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/mcp/server.py` around lines 300 - 387, The profiling can hang on slow
queries; update _collect_dim_profile to enforce per-query timeouts and stricter
limits: wrap calls to engine.execute in a bounded timeout (e.g. via
asyncio.wait_for) and make the numeric/temporal batched SlayerQuery include an
explicit LIMIT (or otherwise constrain the query) so it only returns the single
summary row; do this for both the categorical loop (where you already set
"limit": max_values+1) and the numeric/temporal batched query, and surface the
timeout as a configurable parameter (e.g. profile_timeout_secs) so callers of
_collect_dim_profile can control it while preserving the existing
exception-swallowing behavior around engine.execute.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@slayer/mcp/server.py`:
- Around line 163-196: The sample-aggregation logic in _build_sample_query_args
can pick a parameterized custom aggregation from m.allowed_aggregations (e.g.,
"weighted_avg") which will break the generated query; update the branch that
handles a restricted allowed list to first filter m.allowed_aggregations to a
whitelist of safe zero-argument aggs (e.g.
{"count","count_distinct","sum","min","max","avg"}) and then pick the first
entry from that filtered list, and if none remain skip the measure; keep the
existing avg-vs-count_distinct type guard for the avg case in the other branch.

---

Nitpick comments:
In `@docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb`:
- Around line 51-63: Replace the fragile sys.path.insert hack by importing
setup_jaffle as a proper package or using a relative-package import: ensure the
module providing ensure_jaffle_shop is installed or the notebook is run with the
project package on PYTHONPATH, then replace the sys.path manipulation and plain
import of setup_jaffle / ensure_jaffle_shop with a standard import (e.g., from
your_project.setup_jaffle import ensure_jaffle_shop) or install the package into
the environment so the call to ensure_jaffle_shop() and references to engine,
storage, models work reliably without modifying sys.path at runtime.

In `@slayer/mcp/server.py`:
- Around line 300-387: The profiling can hang on slow queries; update
_collect_dim_profile to enforce per-query timeouts and stricter limits: wrap
calls to engine.execute in a bounded timeout (e.g. via asyncio.wait_for) and
make the numeric/temporal batched SlayerQuery include an explicit LIMIT (or
otherwise constrain the query) so it only returns the single summary row; do
this for both the categorical loop (where you already set "limit": max_values+1)
and the numeric/temporal batched query, and surface the timeout as a
configurable parameter (e.g. profile_timeout_secs) so callers of
_collect_dim_profile can control it while preserving the existing
exception-swallowing behavior around engine.execute.

In `@tests/test_mcp_server.py`:
- Around line 315-318: The test's expectation that "\r\n" becomes two spaces
should be clarified: update the _escape_md_cell implementation's comment (near
the .replace("\r", " ").replace("\n", " ") calls inside function
_escape_md_cell) or add an inline test comment explaining that CRLF sequences
produce a double space because CR and LF are replaced independently; ensure the
test case comment reflects this intended behavior so reviewers understand the
double-space is deliberate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dc065658-a1d9-4f16-8940-9cc58daed376

📥 Commits

Reviewing files that changed from the base of the PR and between 9179dc7 and 69e714d.

📒 Files selected for processing (13)
  • .claude/skills/slayer-overview.md
  • README.md
  • docs/api_gaps.md
  • docs/concepts/models.md
  • docs/examples/08_mcp_introspect/mcp_introspect_nb.ipynb
  • docs/getting-started/mcp.md
  • docs/interfaces/mcp.md
  • docs/reference/mcp.md
  • slayer/help/topics/08_models.md
  • slayer/help/topics/10_workflow.md
  • slayer/mcp/server.py
  • tests/integration/test_mcp_inspect.py
  • tests/test_mcp_server.py
✅ Files skipped from review due to trivial changes (5)
  • docs/concepts/models.md
  • README.md
  • docs/api_gaps.md
  • slayer/help/topics/10_workflow.md
  • slayer/help/topics/08_models.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integration/test_mcp_inspect.py

📝 CodeRabbit Chat: Implement requested code changes
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
docs/interfaces/mcp.md (1)

133-156: Add language specifiers to fenced code blocks.

Same as in docs/reference/mcp.md, these workflow example blocks should have language specifiers (e.g., text or plaintext) for linting compliance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/interfaces/mcp.md` around lines 133 - 156, The fenced example blocks in
docs/interfaces/mcp.md lack language specifiers; update each triple-backtick
fence that wraps the workflows and query examples to include a plain-text
specifier (e.g., ```text or ```plaintext) so linting passes — specifically
locate the blocks showing create_datasource, models_summary, inspect_model,
describe_datasource, ingest_datasource_models, list_datasources and query and
add the language tag to the opening ``` for each block.
slayer/mcp/server.py (1)

424-425: Consider using collections.deque for BFS queue.

queue.pop(0) is O(n) for a list. While the max_depth=5 cap keeps n small, using deque.popleft() is the idiomatic O(1) choice for BFS.

🔧 Suggested fix
+from collections import deque
+
 async def _collect_reachable_fields(
     model: SlayerModel,
     storage: StorageBackend,
     *,
     max_depth: int = 5,
 ) -> Tuple[List[str], List[str]]:
     ...
     reachable_dims: set[str] = set()
     reachable_measures: set[str] = set()
     visited: set[str] = set()
-    queue: List[Tuple[str, str]] = []  # (full_path, target_model_name)
+    queue: deque[Tuple[str, str]] = deque()  # (full_path, target_model_name)
     ...
     for j in model.joins:
         path = _derive_path("", j)
         if path not in visited:
-            queue.append((path, j.target_model))
+            queue.append((path, j.target_model))  # append is same for deque

     while queue:
-        path, target_name = queue.pop(0)
+        path, target_name = queue.popleft()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@slayer/mcp/server.py` around lines 424 - 425, The BFS loop currently uses a
list and calls queue.pop(0) (in the while queue: / path, target_name =
queue.pop(0) block), which is O(n); switch to collections.deque to get O(1)
popleft: add "from collections import deque" at the top, initialize queue =
deque(...) wherever the list is created, and replace queue.pop(0) with
queue.popleft(). Ensure any code that appends uses queue.append(...) (or
queue.appendleft(...) if intended) and keep the rest of the BFS logic unchanged.
docs/reference/mcp.md (1)

115-138: Add language specifiers to fenced code blocks.

The code blocks in the workflow examples are missing language specifiers. While these are pseudo-code/prose examples, adding a specifier (e.g., text or plaintext) satisfies linting and improves rendering consistency.

🔧 Suggested fix
-```
+```text
 1. create_datasource(name="mydb", type="postgres", host="localhost", database="app", username="user", password="pass")
    # auto_ingest=true by default — models are generated automatically
 2. models_summary(datasource_name="mydb")      # see what was generated
 3. inspect_model(model_name="orders")          # see schema + sample data

Apply the same to the other unlabeled code blocks (lines 125, 133).
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/reference/mcp.md around lines 115 - 138, Add language specifiers (e.g.,
text or plaintext) to each fenced code block that contains the workflow examples
so linting renders them consistently; update the three unlabeled blocks showing
the sequences with functions like create_datasource, models_summary,
inspect_model, describe_datasource, ingest_datasource_models, list_datasources
and query by changing the opening triple-backtick lines to include a language
tag (for example ```text).


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @slayer/mcp/server.py:

  • Around line 143-152: The single-column rendering block (when len(kept) == 1)
    can produce invalid Markdown if a cell contains backticks; in the loop that
    builds text for each row (variables col, rows and helper _cell_is_present),
    escape backticks before wrapping the value in inline code by adding a
    replacement like text = text.replace("", "\\") (i.e. include backtick escaping
    in the existing replace chain) so values with characters are safe when returned as f"{text}`". Ensure this change is applied where rendered is built
    and return remains ", ".join(rendered).

Nitpick comments:
In @docs/interfaces/mcp.md:

  • Around line 133-156: The fenced example blocks in docs/interfaces/mcp.md lack
    language specifiers; update each triple-backtick fence that wraps the workflows
    and query examples to include a plain-text specifier (e.g., ```text or
create_datasource, models_summary, inspect_model, describe_datasource,
ingest_datasource_models, list_datasources and query and add the language tag to
the opening ``` for each block.

In `@docs/reference/mcp.md`:
- Around line 115-138: Add language specifiers (e.g., text or plaintext) to each
fenced code block that contains the workflow examples so linting renders them
consistently; update the three unlabeled blocks showing the sequences with
functions like create_datasource, models_summary, inspect_model,
describe_datasource, ingest_datasource_models, list_datasources and query by
changing the opening triple-backtick lines to include a language tag (for
example ```text).

In `@slayer/mcp/server.py`:
- Around line 424-425: The BFS loop currently uses a list and calls queue.pop(0)
(in the while queue: / path, target_name = queue.pop(0) block), which is O(n);
switch to collections.deque to get O(1) popleft: add "from collections import
deque" at the top, initialize queue = deque(...) wherever the list is created,
and replace queue.pop(0) with queue.popleft(). Ensure any code that appends uses
queue.append(...) (or queue.appendleft(...) if intended) and keep the rest of
the BFS logic unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 15d4da1e-2ce9-4c5b-a9e7-4bd491880a18

📥 Commits

Reviewing files that changed from the base of the PR and between 69e714d and 497f0c1.

📒 Files selected for processing (4)
  • docs/api_gaps.md
  • docs/interfaces/mcp.md
  • docs/reference/mcp.md
  • slayer/mcp/server.py
✅ Files skipped from review due to trivial changes (1)
  • docs/api_gaps.md

ZmeiGorynych and others added 2 commits April 17, 2026 10:50
- docs/api_gaps.md: correct 4 CLI capability flags (datasource
  create/delete/test and ingest --exclude all exist in the CLI)
- help/topics/03_aggregations.md: fix placeholder name {expr} → {value}
  to match BUILTIN_AGGREGATION_FORMULAS
- help/topics/05_time.md: rewrite time-dimension priority bullets to
  match actual behavior (single dim used directly; main_time_dimension
  only disambiguates when 2+ are present; 0 dims falls back to model
  default)
- help/topics/10_workflow.md: add ```text language tags to bare fenced
  blocks (markdownlint MD040)
- mcp/server.py: prefer safe zero-arg aggregations (avg/sum/min/max/
  count/count_distinct/median) in sample-data extraction before falling
  back to parametrized ones; escape backticks in single-column markdown
  table rendering to prevent broken formatting
- tests: 3 new cases covering the safe-agg preference, parametrized
  fallback, and backtick escaping

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-in-mcp-server' of https://github.com/MotleyAI/slayer into egor/dev-1265-improve-model-introspection-and-help-tool-in-mcp-server
@ZmeiGorynych ZmeiGorynych marked this pull request as ready for review April 17, 2026 09:00
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_mcp_server.py`:
- Around line 336-339: The test asserts a CRLF ("line1\r\nline2") collapses to
two spaces which locks in an implementation quirk; update the test for
test_escape_pipes_and_newlines to expect a single space for CRLF (i.e. the same
result as "\n") so it matches the documented behavior of _escape_md_cell and
won't mark a future fix as a regression; locate the assertions in
test_escape_pipes_and_newlines and change the expected value for
"line1\r\nline2" to "line1 line2".
- Around line 240-242: The tests assume each element of args["dimensions"] is a
dict but _build_sample_query_args returns a flat list[str]; update the
assertions in tests/test_mcp_server.py to compare the dimensions list directly
(e.g., assert args["dimensions"] == ["status", "region"]) and similarly change
the other assertion pair at the later lines (around 294-295) to match the actual
shape returned by _build_sample_query_args so the tests no longer index strings
as dicts.
- Around line 439-460: The test test_auto_ingested_multi_hop_path expects a
BFS-derived multi-hop path "orders.customers.name" but the current helper in
slayer/mcp/server.py queues root joins as their direct target and only runs
_derive_path when traversing a target model's own joins, so update the test to
match current behavior: call _collect_reachable_fields with the root as before
and assert for "orders.customer_id" and "customers.name" (not
"orders.customers.name"), or alternatively modify the helper to invoke
_derive_path for root joins that reference dotted join_pairs (ModelJoin with
"orders.customer_id") so multi-hop paths are produced; pick the former (test
change) by replacing the failing assertion with assert "customers.name" in dims
and keeping assert "orders.customer_id" in dims.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 62bfaeca-54b9-44d8-beba-fbb6072ddce8

📥 Commits

Reviewing files that changed from the base of the PR and between 497f0c1 and 9a1323f.

📒 Files selected for processing (6)
  • docs/api_gaps.md
  • slayer/help/topics/03_aggregations.md
  • slayer/help/topics/05_time.md
  • slayer/help/topics/10_workflow.md
  • slayer/mcp/server.py
  • tests/test_mcp_server.py
✅ Files skipped from review due to trivial changes (4)
  • docs/api_gaps.md
  • slayer/help/topics/03_aggregations.md
  • slayer/help/topics/05_time.md
  • slayer/help/topics/10_workflow.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • slayer/mcp/server.py

Comment thread tests/test_mcp_server.py
Comment thread tests/test_mcp_server.py Outdated
Comment thread tests/test_mcp_server.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_mcp_server.py (1)

49-127: Add direct coverage for the new JSON branches.

These tests exercise the markdown contract well, but neither models_summary(..., format="json") nor inspect_model(..., format="json") is asserted here. A small happy-path JSON test for each would lock down the new public surface and would have caught the current inspect_model sample-data omission immediately.

Also applies to: 134-205

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mcp_server.py` around lines 49 - 127, Add explicit JSON-format
tests for the new branches by adding two small happy-path tests (e.g.,
test_models_summary_json and test_inspect_model_json) that call
models_summary(...) and inspect_model(...) via _call with format="json" and
assert the returned payload is a parsed JSON object with the expected top-level
keys (e.g., datasource name, models list, model details), counts and
presence/absence of hidden models; ensure each test sets up the same fixtures
used elsewhere (use YAMLStorage.save_datasource and save_model/SlayerModel with
required fields) and for inspect_model include the minimal sample data fields
the implementation expects so the JSON response contains the expected fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@slayer/mcp/server.py`:
- Around line 911-1005: inspect_model(format="json") currently returns only
metadata and drops the sample data assembled earlier (sample_result, sample_sql,
sample_section). Modify the JSON return payload (the dict inside the json.dumps)
to include the sample information: add fields like "sample": {"columns":
sample_result.columns, "rows": sample_result.data} and, when show_sql is true
and sample_sql is set, include "sample_sql": sample_sql (or include it under
"sample.sql"). Use the already-populated variables sample_result and sample_sql
(and show_sql) so the JSON output mirrors the markdown path instead of
discarding the sample_section.
- Around line 147-156: The current inline-code rendering in the single-column
branch builds code spans unsafely (see the len(kept)==1 block and the other
usage around line 294); create a shared helper (e.g.,
render_markdown_code_span(value: str)) and use it from both places: the helper
should keep the existing sanitization (replace("|", "\\|"), normalize CR/LF,
strip) then compute the longest contiguous backtick run in the sanitized value,
build a fence of backticks one longer than that run (fence = "`" * (max_run +
1)), and return the value wrapped in that fence (fence + value + fence); replace
the current backtick-replacement and direct-wrapping code with calls to this
helper in both locations (the len(kept)==1 block and the other renderer at/near
line 294).

---

Nitpick comments:
In `@tests/test_mcp_server.py`:
- Around line 49-127: Add explicit JSON-format tests for the new branches by
adding two small happy-path tests (e.g., test_models_summary_json and
test_inspect_model_json) that call models_summary(...) and inspect_model(...)
via _call with format="json" and assert the returned payload is a parsed JSON
object with the expected top-level keys (e.g., datasource name, models list,
model details), counts and presence/absence of hidden models; ensure each test
sets up the same fixtures used elsewhere (use YAMLStorage.save_datasource and
save_model/SlayerModel with required fields) and for inspect_model include the
minimal sample data fields the implementation expects so the JSON response
contains the expected fields.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 423b610a-0893-4408-9d09-223cf5c2f852

📥 Commits

Reviewing files that changed from the base of the PR and between 9a1323f and 9bf97ea.

📒 Files selected for processing (2)
  • slayer/mcp/server.py
  • tests/test_mcp_server.py

Comment thread slayer/mcp/server.py
Comment thread slayer/mcp/server.py
…ev-1265-improve-model-introspection-and-help-tool-in-mcp-server

# Conflicts:
#	README.md
#	slayer/mcp/server.py
@ZmeiGorynych ZmeiGorynych merged commit 983e433 into main Apr 17, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant