[RFC] Expose ML Intern tools as a provider-neutral MCP server

## Context

ML Intern can already consume MCP servers, but it does not currently expose its own Hugging Face / ML tooling as an MCP server. I would like to use these tools from external coding agents such as Codex, Claude Code, and Cursor, while keeping the external agent as the model/session/auth runtime.

This is intentionally different from using Codex / ChatGPT OAuth tokens inside ML Intern. Direct OAuth/subscription support is being discussed in #59 and implemented experimentally in #253. This proposal avoids that boundary: ML Intern would provide tools; the host agent would provide the model, session, and auth runtime.

## Why this may fit the repo

The existing tool boundary is already close to an MCP-exportable surface:

- `ToolSpec` has a name, a description, JSON schema parameters, and an async handler.
- `ToolRouter` already converts MCP tools into OpenAI-style tool specs and can call both built-in and MCP-backed tools.
- `fastmcp` is already a project dependency.
- There has been related prior work in #92, #113, and #114, but those appear to be fork-specific, Claude Code-specific, or plugin-shaped rather than a small provider-neutral MCP surface.
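
To make the "close to MCP-exportable" claim concrete, here is a minimal sketch of the mapping. The `ToolSpec` dataclass below is a hypothetical stand-in that only mirrors the four fields listed above, not the repo's actual class, and the `hf_docs_fetch` description and schema are invented for illustration. The `name` / `description` / `inputSchema` keys are the tool descriptor fields MCP returns from `tools/list`.

```python
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical stand-in mirroring the four fields listed above."""

    name: str
    description: str
    parameters: dict[str, Any]  # JSON schema describing the arguments
    handler: Callable[..., Awaitable[Any]]  # async callable that runs the tool


def to_mcp_tool(spec: ToolSpec) -> dict[str, Any]:
    """Build the descriptor an MCP server returns from `tools/list`."""
    return {
        "name": spec.name,
        "description": spec.description,
        # The existing JSON schema is passed through, not regenerated.
        "inputSchema": spec.parameters,
    }


async def _fetch(url: str) -> str:
    # Placeholder handler for the example; performs no network access.
    return f"fetched {url}"


spec = ToolSpec(
    name="hf_docs_fetch",
    description="Fetch a Hugging Face docs page (illustrative text).",
    parameters={
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
    handler=_fetch,
)
descriptor = to_mcp_tool(spec)
```

If the real `ToolSpec` matches this shape, the export step is a field rename rather than a schema translation.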

## Proposal

Add a supported `ml-intern-mcp` command, or `ml-intern mcp-server`, that serves a curated subset of ML Intern tools over MCP stdio.

The first version could be deliberately small and read-only:

- `explore_hf_docs`
- `hf_docs_fetch`
- OpenAPI / HF API docs search if initialization succeeds
- `hf_papers`
- `hf_inspect_dataset`
- GitHub read/list/example-discovery tools

Explicitly out of scope for the first slice:

- Codex OAuth / ChatGPT token reuse
- Claude Code plugin packaging
- local filesystem tools such as `bash`, `read`, `write`, and `edit`
- mutating Hub tools, `hf_jobs`, and sandbox execution
- approval UX and trace-upload semantics
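
The in-scope / out-of-scope split could be enforced with a plain name allowlist. This is a sketch under assumptions: `select_exported` and `missing_from_registry` are hypothetical helpers, and the allowlist only contains the tool names this proposal spells out (the GitHub and OpenAPI tools are left out because their names are not given here).

```python
# Tool names copied from the first-slice list above.
READ_ONLY_ALLOWLIST = frozenset(
    {"explore_hf_docs", "hf_docs_fetch", "hf_papers", "hf_inspect_dataset"}
)


def select_exported(registered, allowlist=READ_ONLY_ALLOWLIST):
    """Return the registered tool names that are allowlisted, in registry order."""
    return [name for name in registered if name in allowlist]


def missing_from_registry(registered, allowlist=READ_ONLY_ALLOWLIST):
    """Return allowlisted names that are not registered, e.g. after a rename."""
    return sorted(allowlist - set(registered))


# Example registry mixing out-of-scope tools with in-scope ones.
registered = [
    "bash", "read", "write", "edit", "hf_jobs",
    "hf_papers", "explore_hf_docs", "hf_docs_fetch", "hf_inspect_dataset",
]
exported = select_exported(registered)
```

An allowlist fails closed: a newly added mutating tool stays unexported until someone adds it deliberately, which a denylist would not guarantee.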

## Implementation sketch

- Create a small stdio MCP server entry point, for example `agent/mcp_server.py`.
- Reuse existing `ToolSpec` definitions and handlers instead of duplicating tool schemas.
- Preserve `ToolSpec.parameters` as the MCP input schema directly. This seems important because schema generation from Python function signatures can lose richer JSON schema shapes.
- Add a conservative allowlist for exported tools.
- Add tests that initialize the MCP server, list tools, and call one or two deterministic read-only tools with mocked network calls.
- Document Codex setup, for example:

  ```bash
  codex mcp add ml-intern -- uv run ml-intern-mcp
  ```

- Also document generic stdio MCP config for other clients.
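
As a rough illustration of the list/call behavior the tests above would exercise, here is a standard-library-only sketch of the dispatch logic. It is not the MCP SDK or FastMCP API: a real server would delegate transport and JSON-RPC framing to one of those. The `TOOLS` registry, the `dispatch` function, the handler signature (a dict of arguments in, a string out), and the `hf_papers` schema are all assumptions made for the example. The `tools/list` and `tools/call` method names and the `content` / `isError` result shape come from the MCP specification.

```python
import asyncio
from typing import Any, Awaitable, Callable

Handler = Callable[[dict[str, Any]], Awaitable[str]]


async def fake_hf_papers(arguments: dict[str, Any]) -> str:
    # Stands in for a real handler whose network call has been mocked.
    return f"results for {arguments['query']}"


# Hypothetical registry: name -> (description, JSON schema, async handler).
TOOLS: dict[str, tuple[str, dict[str, Any], Handler]] = {
    "hf_papers": (
        "Search papers (illustrative description).",
        {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
        fake_hf_papers,
    ),
}


async def dispatch(method: str, params: dict[str, Any]) -> dict[str, Any]:
    """Return the result payload for the two MCP tool methods."""
    if method == "tools/list":
        return {
            "tools": [
                {"name": name, "description": desc, "inputSchema": schema}
                for name, (desc, schema, _handler) in TOOLS.items()
            ]
        }
    if method == "tools/call":
        entry = TOOLS.get(params["name"])
        if entry is None:
            # Unknown tools are a protocol-level error; a real server would
            # map this exception to a JSON-RPC error response.
            raise ValueError(f"unknown tool: {params['name']}")
        try:
            text = await entry[2](params.get("arguments", {}))
        except Exception as exc:
            # Failures inside a tool are reported in the result itself.
            return {"content": [{"type": "text", "text": str(exc)}], "isError": True}
        return {"content": [{"type": "text", "text": text}], "isError": False}
    raise ValueError(f"unsupported method: {method}")


listed = asyncio.run(dispatch("tools/list", {}))
called = asyncio.run(
    dispatch("tools/call", {"name": "hf_papers", "arguments": {"query": "lora"}})
)
failed = asyncio.run(dispatch("tools/call", {"name": "hf_papers", "arguments": {}}))
```

Because the handler is injected through the registry, a test can swap in a deterministic fake like `fake_hf_papers` without patching any network layer.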

## Questions

- Would maintainers want this as a small provider-neutral MCP server PR?
- Which tools should be in the first read-only allowlist?
- What is the preferred module and entry point location?
- Should the implementation use FastMCP or the lower-level MCP server API to preserve existing JSON schemas exactly?
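
To illustrate the schema-preservation concern behind the last question, the sketch below compares a hand-written JSON schema with one derived from a function signature. `naive_schema_from_signature` is a deliberately simplistic stand-in written for this example; it is not FastMCP's generator, and real generators are more capable. The `inspect_dataset` function and its parameters are also invented for illustration.

```python
import inspect
from typing import Any

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def naive_schema_from_signature(fn) -> dict[str, Any]:
    """Derive a flat JSON schema from parameter annotations only."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


async def inspect_dataset(dataset: str, split: str = "train", sample_rows: int = 5) -> str:
    # Illustrative handler; the real tool's parameters may differ.
    return f"{dataset}:{split}:{sample_rows}"


# The kind of richer schema a ToolSpec.parameters dict can carry.
HAND_WRITTEN = {
    "type": "object",
    "properties": {
        "dataset": {"type": "string", "description": "Hub dataset id"},
        "split": {"type": "string", "enum": ["train", "validation", "test"]},
        "sample_rows": {"type": "integer", "minimum": 1, "maximum": 20},
    },
    "required": ["dataset"],
}

derived = naive_schema_from_signature(inspect_dataset)

# Schema keywords present in the hand-written schema but absent after derivation.
lost = {
    name: sorted(set(HAND_WRITTEN["properties"][name]) - set(derived["properties"][name]))
    for name in HAND_WRITTEN["properties"]
}
```

Here `lost` records the descriptions, enums, and numeric bounds that the signature alone does not carry, which is why whichever API is chosen should accept the existing schema dict as-is.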

