MCP Tools
WikiNow exposes 21 tools via FastMCP. The host AI (Claude Code, Codex, Cursor, Copilot) calls these as native tool calls during conversation.
The tools aren't meant to be used in isolation. A typical ingest session chains 8-10 tool calls:
ingest_url("https://...") --> raw content returned to the AI
read("index.md") --> AI sees existing wiki state
write("sources/paper.md") --> AI writes source summary
index_article(...) --> page becomes searchable
write("concepts/topic.md") --> AI updates a concept page
index_article(...) --> re-indexed with new content
... repeat for more pages ...
append_log("Ingested: ...") --> session logged
mark_compiled("paper.md") --> source marked as processed
A query session is simpler:
search("attention mechanism") --> ranked results
read("concepts/attention.md") --> AI reads the page
... AI synthesizes answer in conversation ...
write("queries/attention.md") --> valuable answer filed back
index_article(...) --> filed answer becomes searchable
append_log("Query: ...") --> session logged
Fetches a URL and saves the content to raw/. YouTube URLs are auto-detected and routed through yt-dlp for English subtitle extraction (with a Whisper fallback if no subtitles exist). All other URLs go through Jina Reader, which handles JavaScript-heavy pages and PDF URLs.
Returns the full content so the AI can immediately start compiling. If the content hash matches something already ingested, returns "Already ingested (duplicate content). Skipped." instead.
Uses the configured Jina API key if set (for higher rate limits).
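The duplicate check described above can be sketched as follows. This is only an illustration: the choice of SHA-256, the in-memory `seen_hashes` set, and the function names are assumptions, not WikiNow's actual implementation.

```python
import hashlib

DUPLICATE_MESSAGE = "Already ingested (duplicate content). Skipped."


def content_hash(text: str) -> str:
    """Return a hex digest of the content (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest(text: str, seen_hashes: set[str]) -> str:
    """Return the content on first sight, or the duplicate message otherwise."""
    digest = content_hash(text)
    if digest in seen_hashes:
        return DUPLICATE_MESSAGE
    # Remember the hash so the same content is skipped next time.
    seen_hashes.add(digest)
    return text
```

Hashing the content rather than the URL means the same article reached through two different links would still be skipped.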
Saves text content directly to raw/. Useful when the user pastes content in conversation -- meeting notes, copied articles, research summaries. The name becomes the filename after slugification.
Same dedup check as ingest_url.
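As a rough idea of what slugification might look like, here is a minimal sketch. The exact rules (lowercasing, collapsing non-alphanumeric runs into a single dash, the `.md` extension, the `untitled` fallback) are assumptions for illustration only.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    # Hypothetical fallback for names with no usable characters.
    return slug or "untitled"


def raw_filename(name: str) -> str:
    """Build a filename for raw/ (the .md extension is an assumption)."""
    return f"{slugify(name)}.md"
```

Under these assumed rules, a name like "Meeting Notes: Q2 Planning!" would be stored as `meeting-notes-q2-planning.md`.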
Ingests a local file. Detects type by extension:
| Extension | What happens |
|---|---|
| .pdf | Text extracted page-by-page via pymupdf |
| .epub | Chapters extracted via ebooklib + BeautifulSoup |
| .mp3 .wav .m4a .ogg .flac .webm | Transcribed via Whisper turbo model (English only, requires ffmpeg) |
| Anything else | Read as plain text |
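The dispatch in the table above could be sketched like this. The function name, the returned labels, and the case-insensitive matching are assumptions; the real extractors (pymupdf, ebooklib, Whisper) are deliberately left out.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def detect_kind(path: str) -> str:
    """Map a file path to the kind of extraction it would need."""
    # Case-insensitive matching is an assumption, not confirmed behavior.
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".epub":
        return "epub"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    # Everything else falls through to plain-text reading.
    return "text"
```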
Reads a wiki article. Path is relative to wiki/ -- e.g., read("concepts/attention.md").
Returns the full file content. If the file is missing, returns "Article not found: ..."; if path traversal is detected (../ or a prefix bypass), returns "Invalid path: ...".
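One way to guard against both `../` traversal and the prefix bypass is to resolve the path and compare it against the wiki root component by component. The sketch below assumes Python 3.9+ (for `Path.is_relative_to`); the function name and the use of `ValueError` are hypothetical.

```python
from pathlib import Path


def resolve_wiki_path(wiki_root: str, relative: str) -> Path:
    """Resolve a wiki-relative path, rejecting anything outside the root."""
    root = Path(wiki_root).resolve()
    candidate = (root / relative).resolve()
    # is_relative_to compares whole path components, so a sibling such as
    # "wiki-evil" does not pass the way a naive startswith() check would.
    if not candidate.is_relative_to(root):
        raise ValueError(f"Invalid path: {relative}")
    return candidate
```

A plain string comparison like `str(candidate).startswith(str(root))` would accept `../wiki-evil/x.md`, which is the prefix bypass case.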
Creates or overwrites a wiki article. Creates parent directories if needed.
Returns "Written: concepts/attention.md". Same path traversal protection as read.
The AI calls this after every write() to make the page searchable and tracked.
- tags -- list of strings: ["ml", "attention", "architecture"]
- confidence -- one of: "high", "medium", "low", "conflict"
- links -- paths this page links to: ["concepts/transformer.md", "sources/paper.md"]
- created / updated -- ISO date strings, default to the current UTC time if empty
What it does behind the scenes: upserts the article row, rebuilds the FTS5 search entry from the file content (with frontmatter stripped), replaces all tag associations, replaces all link associations, commits.
Returns "Indexed: concepts/attention.md (id=7)".
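The upsert-then-replace pattern described above might look roughly like this with `sqlite3`. The table and column names are invented for the example, and the FTS5 rebuild and link associations are omitted for brevity; treat it as a sketch of the pattern, not the actual schema.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE articles (id INTEGER PRIMARY KEY, path TEXT UNIQUE, updated TEXT);
        CREATE TABLE article_tags (article_id INTEGER, tag TEXT);
        """
    )
    return db


def index_article(db: sqlite3.Connection, path: str, tags: list[str], updated: str) -> str:
    # Upsert the article row, keyed by its path.
    db.execute(
        "INSERT INTO articles(path, updated) VALUES (?, ?) "
        "ON CONFLICT(path) DO UPDATE SET updated = excluded.updated",
        (path, updated),
    )
    (article_id,) = db.execute(
        "SELECT id FROM articles WHERE path = ?", (path,)
    ).fetchone()
    # Replace all tag associations rather than merging with the old ones.
    db.execute("DELETE FROM article_tags WHERE article_id = ?", (article_id,))
    db.executemany(
        "INSERT INTO article_tags(article_id, tag) VALUES (?, ?)",
        [(article_id, tag) for tag in tags],
    )
    db.commit()
    return f"Indexed: {path} (id={article_id})"
```

Deleting and re-inserting the associations keeps the index in step with the latest write, so tags removed from a page do not linger.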
Indexes a raw source in the database. The ingest tools call this automatically, but it's exposed for manual use.
Marks a raw source as compiled. The AI calls this as the final step after processing a source. Changes the source from "pending" to "compiled" in get_project_stats() and lint().
Full-text search with porter stemming. "transformers" matches "transformer", "running" matches "run".
Returns:
[
{
"path": "concepts/transformer.md",
"title": "Transformer Architecture",
"summary": "Self-attention mechanism for sequence modeling",
"confidence": "high",
"rank": -2.5
}
]

Lower rank = better match. Returns [] for no matches. Special characters in queries are safely escaped.
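To illustrate how porter stemming, negative ranks, and query escaping fit together, here is a small self-contained sketch using SQLite's FTS5 extension. It assumes the local SQLite build ships with FTS5; the table layout and function names are hypothetical and much simpler than a real wiki index.

```python
import sqlite3


def quote_terms(query: str) -> str:
    """Wrap each term in double quotes so FTS5 reads it as a literal string."""
    terms = query.split()
    return " ".join('"' + term.replace('"', '""') + '"' for term in terms)


def build_index(pages: dict[str, str]) -> sqlite3.Connection:
    """Build an in-memory FTS5 table with the porter tokenizer."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE VIRTUAL TABLE pages USING fts5(path UNINDEXED, body, tokenize='porter')"
    )
    db.executemany(
        "INSERT INTO pages(path, body) VALUES (?, ?)", list(pages.items())
    )
    return db


def search(db: sqlite3.Connection, query: str) -> list[dict]:
    match = quote_terms(query)
    if not match:
        return []
    # FTS5's rank is a negated BM25 score, so ascending order puts the
    # best matches first.
    rows = db.execute(
        "SELECT path, rank FROM pages WHERE pages MATCH ? ORDER BY rank",
        (match,),
    ).fetchall()
    return [{"path": path, "rank": rank} for path, rank in rows]
```

Quoting every term is a blunt but safe form of escaping: operators such as `OR`, parentheses, and colons lose their special meaning and are matched as ordinary text.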
Web search via Ollama API. Requires OLLAMA_API_KEY env var.
Returns [{"title": "...", "url": "...", "content": "..."}].
The schema instructs the AI to try this first (preserves the AI's native search quota), then fall back to built-in search if it fails.
All wiki pages sorted by most recently updated:
[{"path": "concepts/ai.md", "title": "AI", "summary": "...", "confidence": "high", "created": "2026-04-25", "updated": "2026-04-27"}]

All raw sources sorted by most recently ingested:

[{"path": "paper.md", "source_url": "https://...", "content_hash": "abc123...", "compiled": 1, "ingested_at": "2026-04-27T..."}]

All tags sorted by count:

[{"tag": "ml", "count": 5}, {"tag": "attention", "count": 3}]

{"articles": 12, "raw_sources": 8, "raw_compiled": 6, "raw_pending": 2, "links": 47, "tags": 15, "contradictions": 1}

Articles where confidence is "conflict". Same shape as list_all_articles() but filtered.
Returns the raw content of wiki/gaps.md. Returns "No gaps file found." if the file doesn't exist.
{
"health_score": 85,
"orphan_pages": [{"path": "concepts/tokenization.md", "title": "...", ...}],
"dead_links": [{"target_path": "concepts/missing.md", "source_path": "concepts/nlp.md"}],
"uncompiled_sources": [{"path": "new-paper.pdf", ...}]
}

Health score: 100 minus 5 per issue. The AI reads the results and fixes what it can -- adding wikilinks to orphans, creating missing pages, compiling pending sources.
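The scoring rule can be written out as a short function. This sketch assumes that every entry in the three lists counts as one issue and that the score is floored at 0; neither detail is confirmed by the documentation.

```python
ISSUE_KEYS = ("orphan_pages", "dead_links", "uncompiled_sources")


def health_score(report: dict) -> int:
    """Compute 100 minus 5 per issue across the three issue lists."""
    issues = sum(len(report.get(key, [])) for key in ISSUE_KEYS)
    # Flooring at 0 is an assumption; the docs only state the subtraction.
    return max(0, 100 - 5 * issues)
```

With one orphan page, one dead link, and one uncompiled source, as in the sample output, this gives 100 - 15 = 85.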
Appends ## [2026-04-27] entry text to wiki/log.md. The AI logs every ingest and query.
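A minimal sketch of that log format is shown below. The function names, the optional `today` parameter, and the use of the UTC date are assumptions made for the example.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def format_log_entry(text: str, today: Optional[str] = None) -> str:
    """Format one log line as '## [YYYY-MM-DD] text'."""
    # Using the UTC date is an assumption about how the date is chosen.
    day = today or datetime.now(timezone.utc).date().isoformat()
    return f"## [{day}] {text}\n"


def append_log(log_path: str, text: str, today: Optional[str] = None) -> None:
    """Append a formatted entry to the log file, creating it if needed."""
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_log_entry(text, today))
```

Because each entry starts with a `##` heading, the log stays readable as ordinary markdown and can be scanned by date.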
Finds ## <section> in CLAUDE.md and replaces its content. If the section doesn't exist, appends it. This is how the schema accumulates project-specific conventions over time.
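The replace-or-append behavior can be sketched on plain strings as follows. The function name and the exact whitespace handling are assumptions; the sketch treats a section as everything from its `## ` heading up to the next `## ` heading or the end of the file.

```python
import re


def upsert_section(document: str, section: str, content: str) -> str:
    """Replace the body of '## <section>', or append the section if absent."""
    heading = f"## {section}"
    block = f"{heading}\n{content.strip()}\n"
    pattern = re.compile(
        rf"^{re.escape(heading)}[ \t]*\n.*?(?=^## |\Z)",
        re.DOTALL | re.MULTILINE,
    )
    if pattern.search(document):
        # A function replacement avoids backslash interpretation in content.
        updated = pattern.sub(lambda _match: block + "\n", document, count=1)
        return updated.rstrip("\n") + "\n"
    prefix = document.rstrip("\n") + "\n\n" if document.strip() else ""
    return prefix + block
```

Deeper headings such as `###` remain part of the enclosing section, so replacing a section also replaces its subsections.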
Re-reads a raw source from disk and returns its content. Useful when the AI needs to reprocess a source with updated schema instructions.
Returns "Source not found: ..." if the file doesn't exist.
Concatenates overview, index, and all article pages into a single markdown file. Returns the output path.