DocIngest is the open-source engine for turning documentation sites into searchable, MCP-accessible context for humans and coding agents.
It crawls docs, stores them as clean markdown, indexes them for search, and exposes the same corpus through a web UI, CLI, and MCP server. Use it to build a public docs index, self-host an internal corpus, or give coding agents fresher documentation context.
- ✅ Index documentation sites from the web UI
- ✅ Browse and search indexed docs at docingest.com
- ✅ Open docs by domain, copy markdown, and download stored docs
- ✅ Re-index sources when upstream docs change
- ✅ Query docs from MCP-compatible coding tools
- ✅ Use the package as a lightweight CLI for quick lookup
- 📚 The live main deployment currently serves 1,512 latest documentation sites on docingest.com as of April 24, 2026
- 🗂️ DocIngest stores versioned snapshots per domain, so one docs site can have multiple historical versions behind the scenes
- ℹ️ The Git repository does not commit the full hosted corpus; the deployed service holds the actual indexed docs data
- 🧪 Search/ranking works, but needs deeper tuning
- 🧪 Loading, empty, and success states need more polish
- 🧪 Version-aware storage exists, but the product UX around versions is still early
- ❌ Not yet a mature enterprise docs platform with permissions, collaboration, and admin workflows
- Node.js 18+ or Bun
- Firecrawl, hosted or self-hosted
- Redis for fast autocomplete/search
Redis is optional for tiny local tests, but recommended for anything serious.
git clone https://github.com/Amal-David/docingest.git
cd docingest
npm install
cd server && npm install && cd ..

Create .env in the repo root:
CRAWL_PROVIDER=firecrawl
FIRECRAWL_API_KEY=fc-your-api-key-here
FIRECRAWL_API_URL=https://api.firecrawl.dev/v1
REACT_APP_API_URL=http://localhost:8001/api
REDIS_HOST=localhost
REDIS_PORT=6380

For local Docker with self-hosted Firecrawl:
CRAWL_PROVIDER=firecrawl
FIRECRAWL_API_URL=http://localhost:3002/v1
REACT_APP_API_URL=http://localhost:8001/api
REDIS_HOST=localhost
REDIS_PORT=6380

For setup details, see the setup guides listed later in this README.
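As a quick sanity check on the two `.env` variants above, the sketch below parses `.env`-style text and reports which keys are missing. The key names come from the examples above; the helper names (`parseEnv`, `missingKeys`) and the choice of which keys count as required are assumptions made for illustration, not part of DocIngest.

```typescript
// Hypothetical helpers, not part of DocIngest.
type EnvMap = Record<string, string>;

// Parse KEY=value lines, skipping blanks and `#` comments.
function parseEnv(text: string): EnvMap {
  const env: EnvMap = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    // Split on the first "=" only, so values may contain "=".
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// Assumption: the hosted setup needs every key shown in the first example,
// and the self-hosted setup needs the same keys minus the API key.
const HOSTED_KEYS: string[] = [
  "CRAWL_PROVIDER",
  "FIRECRAWL_API_KEY",
  "FIRECRAWL_API_URL",
  "REACT_APP_API_URL",
  "REDIS_HOST",
  "REDIS_PORT",
];
const SELF_HOSTED_KEYS: string[] = HOSTED_KEYS.filter(
  (key) => key !== "FIRECRAWL_API_KEY",
);

// Return the required keys that are absent or empty.
function missingKeys(env: EnvMap, required: readonly string[]): string[] {
  return required.filter((key) => !env[key]);
}

// The self-hosted example from this README.
const selfHostedExample = [
  "CRAWL_PROVIDER=firecrawl",
  "FIRECRAWL_API_URL=http://localhost:3002/v1",
  "REACT_APP_API_URL=http://localhost:8001/api",
  "REDIS_HOST=localhost",
  "REDIS_PORT=6380",
].join("\n");

const missingForSelfHosted = missingKeys(parseEnv(selfHostedExample), SELF_HOSTED_KEYS);
const missingForHosted = missingKeys(parseEnv(selfHostedExample), HOSTED_KEYS);
```

Under these assumptions, the self-hosted file passes the self-hosted check, and checking it against the hosted key list flags only `FIRECRAWL_API_KEY`.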
Choose the local services you want:
Run everything locally:
docker compose --profile firecrawl --profile tools up -d

Run only Redis:
docker compose up -d redis

Run Redis and Firecrawl without the Redis UI:
docker compose --profile firecrawl up -d

Run Redis with the Redis UI:
docker compose --profile tools up -d

Run the app locally:
npm run dev

If port 8001 is already busy, use the alternate local API port:
npm run dev:local

Then open http://localhost:8000.
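The four Docker run modes listed earlier follow one pattern: each optional service maps to a compose profile, and with no profile selected only the `redis` service is started. The sketch below encodes that mapping. The option names (`firecrawl`, `redisUi`) and the function name are hypothetical; the profiles and resulting commands are the ones shown in this README.

```typescript
// Hypothetical helper, not part of DocIngest.
interface LocalServices {
  firecrawl: boolean; // self-hosted Firecrawl (compose profile "firecrawl")
  redisUi: boolean; // Redis UI (compose profile "tools")
}

// Build the docker compose command for the selected local services.
function composeCommand(services: LocalServices): string {
  const profiles: string[] = [];
  if (services.firecrawl) profiles.push("firecrawl");
  if (services.redisUi) profiles.push("tools");

  // With no profile selected, start only the redis service.
  if (profiles.length === 0) return "docker compose up -d redis";

  const flags = profiles.map((profile) => `--profile ${profile}`).join(" ");
  return `docker compose ${flags} up -d`;
}

const everything = composeCommand({ firecrawl: true, redisUi: true });
const redisOnly = composeCommand({ firecrawl: false, redisUi: false });
```

Here `everything` matches the "run everything" command and `redisOnly` matches the Redis-only command from the list of run modes.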
After indexing docs, build the Redis search index:
cd server
npm run build-index

Add DocIngest to Claude Code:
claude mcp add docingest -- npx -y @docingest/mcp-server

Use the same package as a CLI:
npx @docingest/mcp-server find react
npx @docingest/mcp-server read react.dev --topic hooks --max-tokens 5000
npx @docingest/mcp-server search "server components" --limit 5

MCP tools:
- find-docs finds a library or docs domain
- read-docs fetches focused documentation content
- query-docs searches across indexed docs
For editor-specific config, see the MCP server README.
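For editors that read a JSON MCP config rather than a CLI command, the entry usually looks like the sketch below. It assumes the `mcpServers` map shape that many MCP clients accept and reuses the command and arguments from the `claude mcp add` line above. The config file name and location vary by editor, so check the MCP server README before relying on this shape.

```typescript
// Assumed shape: a `mcpServers` map keyed by server name.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Same command and args as the `claude mcp add` example in this README.
const docingestConfig: McpConfig = {
  mcpServers: {
    docingest: {
      command: "npx",
      args: ["-y", "@docingest/mcp-server"],
    },
  },
};

// Pretty-printed JSON, ready to paste into an editor's MCP config file.
const configJson: string = JSON.stringify(docingestConfig, null, 2);
```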
Use these when you need more than the happy path:
- Redis setup for local/self-hosted Redis, indexing, and verification
- Firecrawl setup for hosted or self-hosted crawling
- Docker run modes for all-in-one or partial local services
- Nginx setup for production reverse proxy configuration
- Performance notes for speedups and next optimization work
- Reference for storage, API, deployment shape, and repo details
- React + TypeScript + Tailwind CSS
- Node.js + Express + TypeScript
- Firecrawl for crawling
- Redis for autocomplete, full-text search, and cached docs
- File-based markdown storage
Contributions are welcome, especially around crawling quality, search/ranking, MCP ergonomics, docs UX, and self-hosting.
MIT


