An MCP server that fetches web pages and converts them to clean, readable Markdown.
This server takes a URL, fetches the page, and strips away everything you don't need (navigation, sidebars, banners, scripts), leaving just the main content as Markdown. That makes it a good fit for feeding pages into LLMs without the surrounding noise. It also recognizes GitHub, GitLab, Bitbucket, and Gist URLs and rewrites them to fetch the raw content directly.
By default it runs over stdio. Pass --http if you need a proper HTTP endpoint with auth, rate limiting, TLS, and session support.
- HTML to Markdown: Turns any public web page into clean, readable Markdown with metadata like `title`, `url`, `contentSize`, and `truncated`.
- Smart URL handling: Recognizes GitHub, GitLab, Bitbucket, and Gist page URLs and rewrites them to raw-content endpoints before fetching.
- Task mode: Big or slow pages can run as async MCP tasks with progress updates instead of blocking. In HTTP mode, tasks are bound to the authenticated caller rather than a single MCP session, so they can be resumed after reconnecting as the same authenticated subject. Polling task state exposes task summaries; numeric progress remains out-of-band via `notifications/progress`.
- Self-documenting: Includes an `internal://instructions` resource and a `get-help` prompt so clients know how to use it.
- HTTP mode: Optionally serves over Streamable HTTP with host/origin validation, bearer or OAuth auth, rate limiting, health checks, and TLS.
A browser-based client is available if you want to use the server without any MCP setup.
- Node.js >= 24
- Docker (optional) — only needed if you want to run the container image
Add this to your MCP client config:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

Install in VS Code
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

Or install via CLI:
code --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'

For more info, see VS Code MCP docs.
Install in VS Code Insiders
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

Or install via CLI:
code-insiders --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'

For more info, see VS Code Insiders MCP docs.
Install in Cursor
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Cursor MCP docs.
Install in Visual Studio
For solution-scoped setup, add this to .mcp.json at the solution root:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Visual Studio MCP docs.
Install in Goose
Add to ~/.config/goose/config.yaml on macOS/Linux or %APPDATA%\Block\goose\config\config.yaml on Windows:
extensions:
  fetch-url-mcp:
    name: fetch-url-mcp
    cmd: npx
    args: ['-y', '@j0hanz/fetch-url-mcp@latest']
    enabled: true
    type: stdio
    timeout: 300

For more info, see Goose extension docs.
Install in LM Studio
Add to ~/.lmstudio/mcp.json on macOS/Linux or %USERPROFILE%/.lmstudio/mcp.json on Windows:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see LM Studio MCP docs.
Install in Claude Desktop
Add to claude_desktop_config.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Claude Desktop MCP docs.
Install in Claude Code
Use the CLI:
claude mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest

For project-scoped config, Claude Code writes .mcp.json with:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
"env": {}
}
}
}

For more info, see Claude Code MCP docs.
Install in Windsurf
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Windsurf MCP docs.
Install in Amp
Add to ~/.config/amp/settings.json on macOS/Linux, %USERPROFILE%\.config\amp\settings.json on Windows, or .amp/settings.json for workspace-scoped config:
{
"amp.mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

Or install via CLI:
amp mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest

For more info, see Amp docs.
Install in Cline
Open the MCP Servers panel, choose Configure MCP Servers, and add this to cline_mcp_settings.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Cline MCP docs.
Install in Codex CLI
Use the CLI:
codex mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest

Or add this to ~/.codex/config.toml or project-scoped .codex/config.toml:
[mcp_servers.fetch-url-mcp]
command = "npx"
args = ["-y", "@j0hanz/fetch-url-mcp@latest"]

For more info, see Codex MCP docs.
Install in GitHub Copilot
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see GitHub Copilot MCP docs.
Install in Warp
Open Personal > MCP Servers in Warp, choose + Add, and either add a CLI server with:
- command: `npx`
- args: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
Or paste this JSON snippet when using Warp's multi-server import flow:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Warp MCP docs.
Install in Kiro
Use Kiro's MCP Servers panel or the Add to Kiro install flow. Kiro stores workspace-scoped MCP config in .kiro/settings/mcp.json and user-scoped config in ~/.kiro/settings/mcp.json.
For this server, use:
- command: `npx`
- args: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
For more info, see Kiro MCP docs.
Install in Gemini CLI
Add to ~/.gemini/settings.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Gemini CLI MCP docs.
Install in Zed
Add to ~/.config/zed/settings.json:
{
"context_servers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
"env": {}
}
}
}

For more info, see Zed MCP docs.
Install in Augment
Use the Augment Settings panel and either add the server manually or choose Import from JSON:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}

For more info, see Augment MCP docs.
Install in Roo Code
Use Roo Code's MCP Servers UI or marketplace flow.
For this server, use:
- command: `npx`
- args: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
For more info, see Roo Code docs.
Install in Kilo Code
Use Kilo Code's MCP Servers UI or marketplace flow.
For this server, use:
- command: `npx`
- args: `["-y", "@j0hanz/fetch-url-mcp@latest"]`
For more info, see Kilo Code docs.
- Documentation for LLMs: Grab a docs page, blog post, or reference article as Markdown and pass it straight into a context window.
- Repository content: Hand it a GitHub, GitLab, or Bitbucket URL and it resolves the raw content endpoint. Works with Gists too.
- Slow or large pages: Task mode lets big fetches run in the background and send monotonic progress updates back to the client. Meanwhile, `tasks/get` exposes the latest task summary fields such as `statusMessage`, `createdAt`, `lastUpdatedAt`, `ttl`, and `pollInterval`.
[MCP Client]
├─ stdio -> `src/index.ts` -> `startStdioServer()` -> `createMcpServer()`
└─ HTTP (`--http`) -> `src/index.ts` -> `startHttpServer()` -> HTTP dispatcher
├─ `GET /health`
├─ `GET /.well-known/oauth-protected-resource`
├─ `GET /.well-known/oauth-protected-resource/mcp`
└─ `POST|GET|DELETE /mcp`
`createMcpServer()`
├─ registers tool: `fetch-url`
├─ registers prompt: `get-help`
├─ registers resources:
│ - `internal://instructions`
├─ enables capabilities: logging, resources, prompts, tasks
└─ installs task handlers, log-level handling, and shutdown cleanup
`fetch-url` execution
├─ validate input with `fetchUrlInputSchema`
├─ normalize URL and block local/private targets unless allowed
├─ rewrite supported code-host URLs to raw endpoints when possible
├─ fetch content via the shared pipeline
├─ transform HTML into Markdown in the transform worker path
└─ validate `structuredContent` with `fetchUrlOutputSchema`
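
The code-host rewrite step in the pipeline above can be illustrated for the GitHub case. This is a minimal sketch, not the server's actual implementation: `toRawGitHubUrl` is a hypothetical helper, and it only covers the common `github.com/<owner>/<repo>/blob/<ref>/<path>` page shape, mapping it to the `raw.githubusercontent.com` host.

```typescript
// Hypothetical helper: rewrite a GitHub "blob" page URL to its raw-content URL.
// Returns the input unchanged when the URL does not match the expected shape.
function toRawGitHubUrl(input: string): string {
  const url = new URL(input);
  if (url.hostname !== "github.com") return input;

  // Expected path shape: /<owner>/<repo>/blob/<ref>/<path...>
  const [owner, repo, kind, ref, ...rest] = url.pathname.split("/").filter(Boolean);
  if (!owner || !repo || kind !== "blob" || !ref || rest.length === 0) return input;

  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${rest.join("/")}`;
}
```

Branch names that contain a slash (for example `feature/x`) are ambiguous in this path shape, so this sketch treats only the first segment after `blob` as the ref.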
[Client] -- initialize {protocolVersion, capabilities} --> [Server]
[Server] -- {protocolVersion, capabilities, serverInfo} --> [Client]
[Client] -- notifications/initialized --> [Server]
[Client] -- tools/call {name, arguments} --> [Server]
[Server] -- {content: [{type, text}], structuredContent?, isError?} --> [Client]
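
The handshake above maps onto JSON-RPC 2.0 messages. The following is a minimal sketch of the three client-side messages; the `protocolVersion` string, the `clientInfo` values, and the request ids are placeholder assumptions, and a real client would normally let an MCP SDK build these.

```typescript
type JsonRpcMessage = {
  jsonrpc: "2.0";
  id?: number;
  method: string;
  params?: Record<string, unknown>;
};

// Step 1: the client opens the session. Version and clientInfo are placeholders.
function initializeRequest(id: number): JsonRpcMessage {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-11-25",
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.0" },
    },
  };
}

// Step 3: a notification carries no id, so the server sends no response to it.
function initializedNotification(): JsonRpcMessage {
  return { jsonrpc: "2.0", method: "notifications/initialized" };
}

// Step 4: call the fetch-url tool with its single required argument.
function fetchUrlCall(id: number, url: string): JsonRpcMessage {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch-url", arguments: { url } },
  };
}
```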
Takes a URL and returns Markdown. Read-only, with no JavaScript execution. Supports running as a background MCP task for large or slow pages. In task mode, tasks/get and tasks/list expose task summaries including status, statusMessage, createdAt, lastUpdatedAt, ttl, and pollInterval; numeric progress remains out-of-band via notifications/progress.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `url` | `string` | yes | Target URL. Max 2048 chars. |
You get text content back by default. If output validation passes, the response also includes structuredContent with typed fields: url, resolvedUrl, finalUrl, title, metadata, markdown, fetchedAt, contentSize, and truncated. A true value for truncated means the content hit a server-side size limit.
To opt into progress updates, include _meta.progressToken in the tool call. The token may be a string or number, and the server may emit monotonic notifications/progress updates while the fetch runs.
To run the tool in task mode, include params.task = { ttl?: <ms> }. tasks/result waits until the task reaches a terminal status and then returns the stored output or a terminal error payload for cancelled or failed tasks. Task summaries and final results include _meta["io.modelcontextprotocol/related-task"] = { "taskId": "<server-task-id>" }.
{
"method": "tools/call",
"params": {
"name": "fetch-url",
"arguments": {
"url": "https://example.com/docs"
},
"task": {
"ttl": 300000,
"pollInterval": 1000
},
"_meta": {
"progressToken": 7
}
}
}

1. [Client] -- tools/call {name: "fetch-url", arguments} --> [Server]
2. [Server] -- dispatch("fetch-url") --> [src/tools/fetch-url.ts]
3. [Handler] -- validate(fetchUrlInputSchema) --> normalize / fetch / transform
4. [Handler] -- validate(fetchUrlOutputSchema) --> assemble content + structuredContent
5. [Server] -- result or tool error --> [Client]
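
When the call runs in task mode, a client typically polls `tasks/get` until the task is terminal and then calls `tasks/result`. Here is a minimal sketch of that decision. It assumes the terminal status names `completed`, `failed`, and `cancelled`, treats every other status as still running, and uses a hypothetical `TaskSummary` shape limited to fields named in this README; the 1000 ms fallback delay is an arbitrary choice.

```typescript
// Hypothetical subset of a task summary as returned by tasks/get.
interface TaskSummary {
  taskId: string;
  status: string;
  statusMessage?: string;
  pollInterval?: number; // milliseconds suggested by the server
}

type PollDecision =
  | { action: "fetch-result"; taskId: string }
  | { action: "wait"; delayMs: number };

// Assumed terminal statuses; anything else is treated as still running.
const TERMINAL_STATUSES = new Set(["completed", "failed", "cancelled"]);

function decideNextPoll(task: TaskSummary, fallbackDelayMs = 1000): PollDecision {
  if (TERMINAL_STATUSES.has(task.status)) {
    // tasks/result returns the stored output or a terminal error payload.
    return { action: "fetch-result", taskId: task.taskId };
  }
  return { action: "wait", delayMs: task.pollInterval ?? fallbackDelayMs };
}
```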
| Resource | URI | MIME Type | Description |
|---|---|---|---|
| `fetch-url-mcp-instructions` | `internal://instructions` | `text/markdown` | Guidance for using the Fetch URL MCP server. |
| Prompt | Arguments | Description |
|---|---|---|
| `get-help` | `topic?` | Return Fetch URL server instructions: workflows, task mode, and error handling. Optional values: `capabilities`, `workflows`, `constraints`, `errors`. |
| Capability | Status | Notes |
|---|---|---|
| completions | confirmed | Advertised in createServerCapabilities(). |
| logging | confirmed | Advertised in createServerCapabilities(). |
| resources | confirmed | Static instruction resource is registered during server startup. Subscription and list-changed support are not advertised. |
| prompts | confirmed | get-help is registered during server startup. |
| tasks | confirmed | Advertised in createServerCapabilities() and backed by registered task handlers plus optional tool task support. |
| progress notifications | confirmed | Opt-in via _meta.progressToken. Tool execution reports monotonic notifications/progress updates during fetch and transform stages. |
| task status updates | confirmed | notifications/tasks/status is emitted when task status changes and TASKS_STATUS_NOTIFICATIONS=true. |
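
"Monotonic" here means a reported progress value never goes backwards. One way to enforce that is sketched below; `createMonotonicReporter` is a hypothetical name, and the `emit` callback stands in for whatever actually sends `notifications/progress`.

```typescript
type ProgressParams = {
  progressToken: string | number;
  progress: number;
  total?: number;
};

// Wraps an emit callback so that only strictly increasing values are sent.
function createMonotonicReporter(
  progressToken: string | number,
  emit: (params: ProgressParams) => void,
): (progress: number, total?: number) => boolean {
  let last = Number.NEGATIVE_INFINITY;
  return (progress, total) => {
    // Drop repeats, regressions, and NaN instead of emitting them.
    if (!(progress > last)) return false;
    last = progress;
    emit(total === undefined ? { progressToken, progress } : { progressToken, progress, total });
    return true;
  };
}
```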
| Annotation | Value |
|---|---|
| `readOnlyHint` | `true` |
| `destructiveHint` | `false` |
| `idempotentHint` | `true` |
| `openWorldHint` | `true` |
The tool declares an outputSchema and includes structuredContent in the response when validation passes. Clients that support structured output get typed data directly; the rest use the text fallback.
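
A client can handle both cases with one small function. This is a sketch under stated assumptions: the `ToolResult` interface is a hypothetical, trimmed-down view of the tool response, and `readMarkdown` is not part of the server or any SDK.

```typescript
// Hypothetical, trimmed-down view of a fetch-url tool result.
interface ToolResult {
  content: Array<{ type: string; text?: string }>;
  structuredContent?: { markdown?: unknown; truncated?: unknown };
  isError?: boolean;
}

// Concatenate all text blocks of the result.
function textOf(result: ToolResult): string {
  return result.content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("\n");
}

// Prefer typed structuredContent; otherwise fall back to the text content.
function readMarkdown(result: ToolResult): { markdown: string; truncated: boolean | undefined } {
  if (result.isError) {
    throw new Error(textOf(result) || "fetch-url returned an error");
  }
  const structured = result.structuredContent;
  if (structured && typeof structured.markdown === "string") {
    return { markdown: structured.markdown, truncated: structured.truncated === true };
  }
  // The truncation flag is unknown when only the text fallback is available.
  return { markdown: textOf(result), truncated: undefined };
}
```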
All configuration is through environment variables. For basic stdio usage, nothing needs to be set.
| Variable | Default | Notes |
|---|---|---|
| `HOST` | `127.0.0.1` | Bind address. Non-loopback bindings also require `ALLOW_REMOTE=true`. |
| `PORT` | `3000` | Listening port for `--http`. |
| `ALLOW_REMOTE` | `false` | Must be enabled to bind to a non-loopback interface. |
| `ALLOWED_HOSTS` | empty | Additional allowed `Host` and `Origin` values. |
| `SERVER_MAX_CONNECTIONS` | `0` | Optional connection cap. |
| `SERVER_TRUST_PROXY` | `false` | Trust `X-Forwarded-For` / `Forwarded` for client IP resolution. |
| `SERVER_HEADERS_TIMEOUT_MS` | unset | Optional Node server tuning. |
| `SERVER_REQUEST_TIMEOUT_MS` | unset | Optional Node server tuning. |
| `SERVER_KEEP_ALIVE_TIMEOUT_MS` | unset | Optional keep-alive tuning. |
| `SERVER_KEEP_ALIVE_TIMEOUT_BUFFER_MS` | unset | Optional keep-alive tuning buffer. |
| `SERVER_MAX_HEADERS_COUNT` | unset | Optional header count limit. |
| `SERVER_BLOCK_PRIVATE_CONNECTIONS` | `false` | Enables inbound private-network protections when not trusting a proxy. |
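
The effect of `SERVER_TRUST_PROXY` on client IP resolution can be sketched as follows. This is an illustration rather than the server's code: `resolveClientIp` is a hypothetical helper, it reads only `X-Forwarded-For` (not `Forwarded`), and it assumes exactly one trusted proxy that appends the address it saw as the last entry.

```typescript
// Hypothetical helper: pick the address that rate limiting should key on.
function resolveClientIp(
  socketAddress: string,
  forwardedFor: string | undefined,
  trustProxy: boolean,
): string {
  // Without a trusted proxy, forwarded headers are client-controlled and ignored.
  if (!trustProxy || !forwardedFor) return socketAddress;

  const hops = forwardedFor
    .split(",")
    .map((hop) => hop.trim())
    .filter((hop) => hop.length > 0);

  // Assumption: the single trusted proxy appended the last entry.
  return hops[hops.length - 1] ?? socketAddress;
}
```

With trust disabled, every request arriving through a proxy resolves to the proxy's own socket address, so all clients share a single rate-limit bucket.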
| Variable | Default | Notes |
|---|---|---|
| `ACCESS_TOKENS` | unset | Comma- or space-separated static bearer tokens. |
| `API_KEY` | unset | Alternate static token source for header auth. |
| `OAUTH_ISSUER_URL` | unset | Enables OAuth mode when combined with the other OAuth URLs. |
| `OAUTH_AUTHORIZATION_URL` | unset | Optional explicit authorization endpoint. |
| `OAUTH_TOKEN_URL` | unset | Optional explicit token endpoint. |
| `OAUTH_REVOCATION_URL` | unset | Optional OAuth revocation endpoint. |
| `OAUTH_REGISTRATION_URL` | unset | Optional OAuth dynamic client registration endpoint. |
| `OAUTH_INTROSPECTION_URL` | unset | Required for OAuth token introspection. |
| `OAUTH_REQUIRED_SCOPES` | empty | Required scopes enforced after auth. |
| `OAUTH_CLIENT_ID` | unset | Optional introspection client ID. |
| `OAUTH_CLIENT_SECRET` | unset | Optional introspection client secret. |
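
Since `ACCESS_TOKENS` accepts comma- or space-separated values, parsing it might look like the sketch below. Both function names are hypothetical, and the check assumes tokens arrive as `Authorization: Bearer <token>`.

```typescript
// Split on any run of commas and whitespace, dropping empty entries.
function parseAccessTokens(raw: string | undefined): Set<string> {
  return new Set((raw ?? "").split(/[\s,]+/).filter((token) => token.length > 0));
}

// Accept only a well-formed Bearer header whose token is in the configured set.
function isBearerAllowed(authorization: string | undefined, tokens: Set<string>): boolean {
  const token = /^Bearer\s+(\S+)$/i.exec(authorization ?? "")?.[1];
  return token !== undefined && tokens.has(token);
}
```

A `Set` lookup is not a constant-time comparison; a production check would usually compare with something like `crypto.timingSafeEqual`.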
| Variable | Default | Notes |
|---|---|---|
| `SERVER_TLS_KEY_FILE` | unset | Enable HTTPS when set together with `SERVER_TLS_CERT_FILE`. |
| `SERVER_TLS_CERT_FILE` | unset | TLS certificate path. |
| `SERVER_TLS_CA_FILE` | unset | Optional custom CA bundle. |
| Variable | Default | Notes |
|---|---|---|
| `ALLOW_LOCAL_FETCH` | `false` | Allows loopback and private-network fetch targets. |
| `FETCH_TIMEOUT_MS` | `15000` | Network fetch timeout in milliseconds. |
| `USER_AGENT` | `fetch-url-mcp/<version>` | Override the outbound user agent string. |
| Variable | Default | Notes |
|---|---|---|
| `MAX_INLINE_CONTENT_CHARS` | `0` | `0` means no explicit inline truncation limit. |
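
The "0 means no limit" convention can be sketched as follows. `applyInlineLimit` is a hypothetical helper; the source does not describe how the server chooses its cut point, so this version simply slices at the character limit.

```typescript
// Apply an inline character limit where 0 (or less) disables truncation.
function applyInlineLimit(
  markdown: string,
  maxChars: number,
): { markdown: string; truncated: boolean } {
  if (maxChars <= 0 || markdown.length <= maxChars) {
    return { markdown, truncated: false };
  }
  return { markdown: markdown.slice(0, maxChars), truncated: true };
}
```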
| Variable | Default | Notes |
|---|---|---|
| `TASKS_MAX_TOTAL` | `5000` | Total retained task capacity, including completed/cancelled tasks until they expire. |
| `TASKS_MAX_PER_OWNER` | `1000` | Per-owner retained task cap, clamped to the total cap. |
| `TASKS_STATUS_NOTIFICATIONS` | `false` | Enables status notifications for tasks. |
| `TASKS_REQUIRE_INTERCEPTION` | `true` | Requires interception for task-capable tool execution. |
| Variable | Default | Notes |
|---|---|---|
| `TRANSFORM_CANCEL_ACK_TIMEOUT_MS` | `200` | Cancellation acknowledgement timeout. |
| `TRANSFORM_WORKER_MODE` | `threads` | Worker execution mode. |
| `TRANSFORM_WORKER_MAX_OLD_GENERATION_MB` | unset | Optional worker memory limit. |
| `TRANSFORM_WORKER_MAX_YOUNG_GENERATION_MB` | unset | Optional worker memory limit. |
| `TRANSFORM_WORKER_CODE_RANGE_MB` | unset | Optional worker memory limit. |
| `TRANSFORM_WORKER_STACK_MB` | unset | Optional worker stack size. |
| Variable | Default | Notes |
|---|---|---|
| `FETCH_URL_MCP_EXTRA_NOISE_TOKENS` | empty | Extra noise-removal tokens. |
| `FETCH_URL_MCP_EXTRA_NOISE_SELECTORS` | empty | Extra DOM selectors for noise removal. |
| `FETCH_URL_MCP_LOCALE` | system default | Locale override for extraction heuristics. |
| `MARKDOWN_HEADING_KEYWORDS` | built-in list | Override heading keywords used by cleanup. |
| Variable | Default | Notes |
|---|---|---|
| `LOG_LEVEL` | `info` | `debug`, `info`, `warn`, or `error`. |
| `LOG_FORMAT` | `text` | Set to `json` for structured logs. |
| Method | Path | Auth | Purpose |
|---|---|---|---|
| `GET` | `/health` | no, unless `?verbose=1` on a remote server | Basic health response, with optional diagnostics. |
| `GET` | `/.well-known/oauth-protected-resource` | no | OAuth protected-resource metadata. |
| `GET` | `/.well-known/oauth-protected-resource/mcp` | no | OAuth protected-resource metadata for the MCP endpoint. |
| `POST` | `/mcp` | yes | Session initialization and JSON-RPC requests. |
| `GET` | `/mcp` | yes | Session-bound server-to-client stream handling. |
| `DELETE` | `/mcp` | yes | Session shutdown. |
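
To show what a first `POST /mcp` might look like from a plain HTTP client, here is a hedged sketch that only builds the request. `buildInitializePost` is a hypothetical helper; the `Accept` value follows the MCP Streamable HTTP transport, while the `protocolVersion`, `clientInfo`, and token are placeholders.

```typescript
interface HttpRequestSpec {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch() options for a session-initializing POST /mcp.
function buildInitializePost(baseUrl: string, accessToken: string): HttpRequestSpec {
  return {
    url: new URL("/mcp", baseUrl).toString(),
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP clients accept both plain JSON and an event stream.
        Accept: "application/json, text/event-stream",
        Authorization: `Bearer ${accessToken}`,
      },
      body: JSON.stringify({
        jsonrpc: "2.0",
        id: 1,
        method: "initialize",
        params: {
          protocolVersion: "2025-11-25", // placeholder version
          capabilities: {},
          clientInfo: { name: "example-client", version: "0.0.0" },
        },
      }),
    },
  };
}
```

The result can be passed to `fetch(spec.url, spec.init)`. With the default `HOST` and `PORT`, the base URL would be `http://127.0.0.1:3000`.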
| Control | Status | Notes |
|---|---|---|
| Host and origin validation | implemented | HTTP requests are rejected unless Host and Origin match the allowlist built from loopback, the configured host, and ALLOWED_HOSTS. |
| Authentication | implemented | HTTP mode supports static bearer tokens locally or OAuth token introspection; remote bindings require OAuth. |
| Protocol version checks | implemented | Session-bound MCP HTTP requests validate MCP-Protocol-Version and pin it to the negotiated session version. |
| Rate limiting | implemented | Requests pass through the HTTP rate limiter before route dispatch. Enable SERVER_TRUST_PROXY=true behind a trusted reverse proxy so limits apply to the forwarded client IP instead of the proxy hop. |
| Outbound SSRF protections | implemented | Local/private IPs, metadata endpoints, and .local/.internal hosts are blocked unless ALLOW_LOCAL_FETCH=true. |
| TLS | optional | HTTPS is enabled when both TLS key and certificate files are configured. |
| Stdio logging safety | implemented | Server logs are written to stderr, not stdout, so stdio MCP traffic stays clean. |
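
The outbound SSRF rule can be approximated with a hostname check like the one below. This is a simplified sketch, not the server's logic: `isBlockedTarget` is a hypothetical name, the `localhost` case and the exact address ranges are assumptions, and IPv4-mapped IPv6 addresses are not handled.

```typescript
import { isIP } from "node:net";

// Returns true when a hostname looks like a local or private target.
function isBlockedTarget(hostname: string): boolean {
  // Strip the brackets that URL hostnames use around IPv6 literals.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");

  if (host === "localhost" || host.endsWith(".local") || host.endsWith(".internal")) {
    return true;
  }

  if (isIP(host) === 4) {
    const [a = 0, b = 0] = host.split(".").map(Number);
    return (
      a === 0 ||
      a === 10 ||
      a === 127 ||
      (a === 172 && b >= 16 && b <= 31) ||
      (a === 192 && b === 168) ||
      (a === 169 && b === 254) // link-local, includes the 169.254.169.254 metadata address
    );
  }

  if (isIP(host) === 6) {
    // Loopback, unique-local (fc00::/7), and link-local (fe80::/10, common prefix only).
    return host === "::1" || host.startsWith("fc") || host.startsWith("fd") || host.startsWith("fe80:");
  }

  return false;
}
```

A hostname check alone is not sufficient in practice, because a public name can resolve to a private address; real protection also has to validate the resolved IPs.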
| Command | Description |
|---|---|
| `npm run build` | Clean, compile TypeScript, copy assets. |
| `npm run dev` | Watch mode TypeScript compilation. |
| `npm run dev:run` | Run the server with `--watch` and `.env` support. |
| `npm start` | Start the compiled server. |
| `npm test` | Run the full test suite. |
| `npm run lint` | Lint with ESLint. |
| `npm run lint:fix` | Auto-fix lint issues. |
| `npm run type-check` | Type-check source and tests. |
| `npm run format` | Format with Prettier. |
| `npm run inspector` | Build and launch MCP Inspector. |
All npm scripts
| Script | Command |
|---|---|
| `clean` | `node scripts/tasks.mjs clean` |
| `build` | `node scripts/tasks.mjs build` |
| `copy:assets` | `node scripts/tasks.mjs copy:assets` |
| `prepare` | `npm run build` |
| `dev` | `tsc --watch --preserveWatchOutput` |
| `dev:run` | `node --env-file=.env --watch dist/index.js` |
| `start` | `node dist/index.js` |
| `format` | `prettier --write .` |
| `type-check` | `node scripts/tasks.mjs type-check` |
| `type-check:src` | `node node_modules/typescript/bin/tsc -p tsconfig.json --noEmit` |
| `type-check:tests` | `node node_modules/typescript/bin/tsc -p tsconfig.test.json --noEmit` |
| `type-check:diagnostics` | `tsc --noEmit --extendedDiagnostics` |
| `type-check:trace` | `node -e "require('fs').rmSync('.ts-trace',{recursive:true,force:true})" && tsc --noEmit --generateTrace .ts-trace` |
| `lint` | `eslint .` |
| `lint:tests` | `eslint src/__tests__` |
| `lint:fix` | `eslint . --fix` |
| `test` | `node scripts/tasks.mjs test` |
| `test:fast` | `node --test --import tsx/esm src/__tests__/**/*.test.ts node-tests/**/*.test.ts` |
| `test:coverage` | `node scripts/tasks.mjs test --coverage` |
| `knip` | `knip` |
| `knip:fix` | `knip --fix` |
| `inspector` | `npm run build && npx -y @modelcontextprotocol/inspector node dist/index.js --stdio` |
| `prepublishOnly` | `npm run lint && npm run type-check && npm run build` |
- `npm run prepublishOnly` runs lint, type-check, and build as a single release gate.
- CI workflows are under `.github/workflows/`.
- `Dockerfile` and `docker-compose.yml` are included for containerized runs.
- Published on npm as `@j0hanz/fetch-url-mcp`.
| Symptom | Likely Cause | Fix |
|---|---|---|
| Server output mixes with MCP traffic on stdio | Logs going to stdout | Ensure all logging writes to stderr; the server does this by default. |
| HTTP mode returns `403` | Host/origin mismatch | Add the domain to `ALLOWED_HOSTS` or verify loopback bindings. |
| HTTP mode rate limits every request from your proxy | `SERVER_TRUST_PROXY` not enabled | Enable `SERVER_TRUST_PROXY=true` when the server is behind a trusted reverse proxy. |
| HTTP mode returns `401` | Missing or invalid token | Set `ACCESS_TOKENS` or configure OAuth env vars for remote bindings. |
| Fetch returns private-IP error | SSRF protections blocked the target | Set ALLOW_LOCAL_FETCH=true if the target is intentionally local. |
| `truncated: true` in response | Content exceeded inline limits | Increase `MAX_INLINE_CONTENT_CHARS` or accept truncated output. |
| Transform timeout or worker crash | Large or complex HTML | Tune TRANSFORM_WORKER_MAX_OLD_GENERATION_MB or increase FETCH_TIMEOUT_MS. |
| Client config not working | Wrong config format for the client | Check the matching <details> block above — config keys vary by client. |
| Dependency | Registry |
|---|---|
| @modelcontextprotocol/server | npm |
| @modelcontextprotocol/node | npm |
| @mozilla/readability | npm |
| linkedom | npm |
| node-html-markdown | npm |
| undici | npm |
| zod | npm |
Pull requests welcome. Please make sure these pass before submitting:
- `npm run lint` and `npm run type-check`
- `npm test`
- `npm run format`
MIT License. See LICENSE for details.