
MCP Tools

ramacharanreddy-k edited this page Apr 27, 2026 · 2 revisions

WikiNow exposes 21 tools via FastMCP. The host AI (Claude Code, Codex, Cursor, Copilot) calls these as native tool calls during conversation.


How Tools Chain Together

The tools aren't meant to be used in isolation. A typical ingest session chains 8-10 tool calls:

ingest_url("https://...")     --> raw content returned to the AI
read("index.md")             --> AI sees existing wiki state
write("sources/paper.md")    --> AI writes source summary
index_article(...)            --> page becomes searchable
write("concepts/topic.md")   --> AI updates a concept page
index_article(...)            --> re-indexed with new content
  ... repeat for more pages ...
append_log("Ingested: ...")   --> session logged
mark_compiled("paper.md")    --> source marked as processed

A query session is simpler:

search("attention mechanism") --> ranked results
read("concepts/attention.md") --> AI reads the page
  ... AI synthesizes answer in conversation ...
write("queries/attention.md") --> valuable answer filed back
index_article(...)            --> filed answer becomes searchable
append_log("Query: ...")      --> session logged

Ingest (3 tools)

ingest_url(url: str) -> str

Fetches a URL and saves the content to raw/. YouTube URLs are detected automatically and routed through yt-dlp for English subtitle extraction (with a Whisper fallback if no subtitles exist). All other URLs go through Jina Reader, which handles JavaScript-heavy pages and PDF URLs.

Returns the full content so the AI can immediately start compiling. If the content hash matches something already ingested, returns "Already ingested (duplicate content). Skipped." instead.

Uses the configured Jina API key if set (for higher rate limits).
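The duplicate check described above can be sketched as a hash lookup. This is a minimal illustration, not WikiNow's actual code: the choice of SHA-256, the in-memory `seen_hashes` set, and the helper names `content_hash` and `ingest_once` are all assumptions made for the example.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a hex digest of the text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def ingest_once(text: str, seen_hashes: set) -> str:
    """Return the content on first sight, or the skip message on a repeat."""
    digest = content_hash(text)
    if digest in seen_hashes:
        return "Already ingested (duplicate content). Skipped."
    seen_hashes.add(digest)
    return text


seen = set()
first = ingest_once("Attention is all you need.", seen)
second = ingest_once("Attention is all you need.", seen)
```

Here `first` holds the original text and `second` holds the skip message, because the second call hashes to a digest that is already recorded.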

ingest_text(name: str, content: str) -> str

Saves text content directly to raw/. Useful when the user pastes content in conversation -- meeting notes, copied articles, research summaries. The name becomes the filename after slugification.

Same dedup check as ingest_url.
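To give a feel for how a name might turn into a filename, here is one common way to slugify a string. The exact rules WikiNow applies are not documented on this page, so the `slugify` helper, the ASCII folding, the "untitled" fallback, and the `.md` suffix are all assumptions.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens."""
    # Fold accented characters to plain ASCII where possible.
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    # Replace every run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", folded.lower()).strip("-")
    # Assumed fallback for names with no usable characters.
    return slug or "untitled"


filename = slugify("Meeting Notes: Q2 Planning!") + ".md"
```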

ingest_file(path: str) -> str

Ingests a local file. Detects type by extension:

| Extension | What happens |
| --- | --- |
| .pdf | Text extracted page-by-page via pymupdf |
| .epub | Chapters extracted via ebooklib + BeautifulSoup |
| .mp3 .wav .m4a .ogg .flac .webm | Transcribed via Whisper turbo model (English only, requires ffmpeg) |
| Anything else | Read as plain text |
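The extension routing in the table can be pictured as a small dispatch function. This sketch only classifies the file; the actual extraction libraries are not called. The function name `detect_kind` and the case-insensitive comparison are assumptions.

```python
from pathlib import Path

# Audio extensions listed in the table above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".webm"}


def detect_kind(path: str) -> str:
    """Classify a file by its extension, falling back to plain text."""
    suffix = Path(path).suffix.lower()  # case-insensitive match is an assumption
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".epub":
        return "epub"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    return "text"


kinds = [detect_kind(p) for p in ("paper.pdf", "book.epub", "talk.MP3", "notes.md")]
```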

Read / Write (5 tools)

read(path: str) -> str

Reads a wiki article. Path is relative to wiki/ -- e.g., read("concepts/attention.md").

Returns the full file content. Returns "Article not found: ..." if the file is missing, or "Invalid path: ..." if path traversal is detected (../ or a prefix bypass).
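One way to guard against both kinds of traversal is to resolve the requested path and check that it still sits under the wiki root, comparing whole path components rather than string prefixes. This is a sketch under assumptions: the helper name `resolve_wiki_path` and the demo root directory are hypothetical, and WikiNow's own check may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_wiki_path(wiki_root: Path, relative: str) -> Optional[Path]:
    """Return the resolved path if it stays inside wiki_root, else None."""
    root = wiki_root.resolve()
    candidate = (root / relative).resolve()
    # is_relative_to compares path components, so a sibling directory such as
    # "wiki-evil" is not mistaken for something inside "wiki".
    if candidate == root or not candidate.is_relative_to(root):
        return None
    return candidate


root = Path("/tmp/wikinow-demo/wiki")  # hypothetical location
ok = resolve_wiki_path(root, "concepts/attention.md")
blocked_parent = resolve_wiki_path(root, "../secrets.md")
blocked_prefix = resolve_wiki_path(root, "../wiki-evil/page.md")
```

A plain `str.startswith` check on the resolved string would accept the `wiki-evil` case, which is why the component-wise comparison is used here. `Path.is_relative_to` requires Python 3.9 or newer.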

write(path: str, content: str) -> str

Creates or overwrites a wiki article. Creates parent directories if needed.

Returns "Written: concepts/attention.md". Same path traversal protection as read.

index_article(path, title, summary, tags, confidence, links, created, updated) -> str

The AI calls this after every write() to make the page searchable and tracked.

  • tags -- list of strings: ["ml", "attention", "architecture"]
  • confidence -- one of: "high", "medium", "low", "conflict"
  • links -- paths this page links to: ["concepts/transformer.md", "sources/paper.md"]
  • created / updated -- ISO date strings; each defaults to the current UTC time if empty

What it does behind the scenes: upserts the article row, rebuilds the FTS5 search entry from the file content (with frontmatter stripped), replaces all tag associations, replaces all link associations, commits.

Returns "Indexed: concepts/attention.md (id=7)".
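The upsert-and-replace pattern described above might look roughly like this in SQLite. The table layout (`articles`, `article_tags`), the helper names, and the frontmatter format (a leading block between `---` lines) are assumptions for illustration; the FTS5 rebuild and link handling are left out to keep the sketch short.

```python
import sqlite3


def strip_frontmatter(text: str) -> str:
    """Drop a leading block delimited by '---' lines, if one is present."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[i + 1:]).lstrip("\n")
    return text


def index_article(db: sqlite3.Connection, path: str, title: str, tags: list) -> str:
    """Upsert the article row and replace its tag associations atomically."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO articles(path, title) VALUES (?, ?) "
            "ON CONFLICT(path) DO UPDATE SET title = excluded.title",
            (path, title),
        )
        article_id = db.execute(
            "SELECT id FROM articles WHERE path = ?", (path,)
        ).fetchone()[0]
        # Replace, rather than merge, the existing tag set.
        db.execute("DELETE FROM article_tags WHERE article_id = ?", (article_id,))
        db.executemany(
            "INSERT INTO article_tags(article_id, tag) VALUES (?, ?)",
            [(article_id, tag) for tag in tags],
        )
    return f"Indexed: {path} (id={article_id})"


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, path TEXT UNIQUE, title TEXT)")
db.execute("CREATE TABLE article_tags (article_id INTEGER, tag TEXT)")

first = index_article(db, "concepts/attention.md", "Attention", ["ml", "attention"])
second = index_article(db, "concepts/attention.md", "Attention Mechanism", ["ml"])
tag_count = db.execute("SELECT COUNT(*) FROM article_tags").fetchone()[0]
body = strip_frontmatter("---\ntitle: Attention\n---\n\nBody text\n")
```

Re-indexing the same path keeps the row id and leaves only the latest tag set, which matches the "replaces all tag associations" behaviour described above.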

index_raw(path, source_url, content_hash) -> str

Indexes a raw source in the database. The ingest tools call this automatically, but it's exposed for manual use.

mark_compiled(raw_path: str) -> str

Marks a raw source as compiled. The AI calls this as the final step after processing a source. The source then counts as "compiled" rather than "pending" in the output of get_project_stats() and lint().


Search (2 tools)

search(query: str, max_results: int = 10) -> list[dict]

Full-text search with porter stemming. "transformers" matches "transformer", "running" matches "run".

Returns:

[
  {
    "path": "concepts/transformer.md",
    "title": "Transformer Architecture",
    "summary": "Self-attention mechanism for sequence modeling",
    "confidence": "high",
    "rank": -2.5
  }
]

Lower rank = better match (rank values are negative, so the most negative result comes first). Returns [] for no matches. Special characters in queries are safely escaped.
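The stemming, ranking, and escaping behaviour can be reproduced with SQLite's FTS5 extension. The sketch below assumes your Python build ships SQLite with FTS5 enabled; the table name `docs`, the sample rows, and the quoting strategy in `escape_fts_query` are assumptions, not WikiNow's schema.

```python
import sqlite3


def escape_fts_query(query: str) -> str:
    """Quote each term so FTS5 operators and punctuation are read literally."""
    # Skip terms with no letters or digits; they would produce empty phrases.
    terms = [t for t in query.split() if any(c.isalnum() for c in t)]
    return " ".join('"' + t.replace('"', '""') + '"' for t in terms)


db = sqlite3.connect(":memory:")
# The porter tokenizer stems both indexed text and query terms.
db.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body, tokenize='porter unicode61')")
db.executemany(
    "INSERT INTO docs(path, body) VALUES (?, ?)",
    [
        ("concepts/transformer.md", "The transformer uses self-attention."),
        ("concepts/cnn.md", "Convolutional networks use local filters."),
        ("concepts/rnn.md", "Recurrent networks process tokens in order."),
    ],
)


def search(query: str, max_results: int = 10) -> list:
    """Return matches ordered by FTS5 rank, best (lowest) first."""
    match = escape_fts_query(query)
    if not match:
        return []
    rows = db.execute(
        "SELECT path, rank FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (match, max_results),
    ).fetchall()
    return [{"path": path, "rank": rank} for path, rank in rows]


stemmed = search("transformers")
hyphenated = search("self-attention")
```

Without the quoting step, a query such as `self-attention` would be parsed as FTS5 query syntax rather than as literal text.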

search_web(query: str, max_results: int = 5) -> list[dict]

Web search via Ollama API. Requires OLLAMA_API_KEY env var.

Returns [{"title": "...", "url": "...", "content": "..."}].

The schema instructs the AI to try this tool first (which preserves the AI's native search quota) and to fall back to its built-in search if the call fails.


List / Stats (6 tools)

list_all_articles() -> list[dict]

All wiki pages sorted by most recently updated:

[{"path": "concepts/ai.md", "title": "AI", "summary": "...", "confidence": "high", "created": "2026-04-25", "updated": "2026-04-27"}]

list_all_raw() -> list[dict]

All raw sources sorted by most recently ingested:

[{"path": "paper.md", "source_url": "https://...", "content_hash": "abc123...", "compiled": 1, "ingested_at": "2026-04-27T..."}]

list_all_tags() -> list[dict]

All tags sorted by count:

[{"tag": "ml", "count": 5}, {"tag": "attention", "count": 3}]

get_project_stats() -> dict

{"articles": 12, "raw_sources": 8, "raw_compiled": 6, "raw_pending": 2, "links": 47, "tags": 15, "contradictions": 1}

get_all_contradictions() -> list[dict]

Articles where confidence is "conflict". Same shape as list_all_articles() but filtered.

get_gaps() -> str

Returns the raw content of wiki/gaps.md. Returns "No gaps file found." if the file doesn't exist.


Maintenance (5 tools)

lint() -> dict

{
  "health_score": 85,
  "orphan_pages": [{"path": "concepts/tokenization.md", "title": "...", ...}],
  "dead_links": [{"target_path": "concepts/missing.md", "source_path": "concepts/nlp.md"}],
  "uncompiled_sources": [{"path": "new-paper.pdf", ...}]
}

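Health score: 100 minus 5 points per issue, so a report with three issues (one orphan page, one dead link, one uncompiled source) scores 85. The AI reads the results and fixes what it can -- adding wikilinks to orphans, creating missing pages, compiling pending sources.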
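The scoring rule is simple arithmetic, sketched here. The function name `health_score` and the clamp at zero are assumptions; the page does not say what happens when there are more than twenty issues.

```python
def health_score(report: dict) -> int:
    """Compute 100 minus 5 per issue across the three issue lists."""
    issues = (
        len(report["orphan_pages"])
        + len(report["dead_links"])
        + len(report["uncompiled_sources"])
    )
    # Clamping at 0 is an assumption, not documented behaviour.
    return max(0, 100 - 5 * issues)


report = {
    "orphan_pages": [{"path": "concepts/tokenization.md"}],
    "dead_links": [{"target_path": "concepts/missing.md", "source_path": "concepts/nlp.md"}],
    "uncompiled_sources": [{"path": "new-paper.pdf"}],
}
score = health_score(report)
```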

append_log(entry: str) -> str

Appends a dated heading of the form ## [2026-04-27] entry text to wiki/log.md. The AI logs every ingest and query.
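A minimal sketch of the append step follows. The use of a UTC date, the blank line after each entry, the return value, and the helper signature (taking the log path explicitly) are assumptions for the example.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_log(log_path: Path, entry: str) -> str:
    """Append a dated '## [YYYY-MM-DD] entry' line and return it."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")  # UTC is assumed
    line = f"## [{stamp}] {entry}"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" creates the file if needed and never truncates earlier entries.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n\n")
    return line


with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "wiki" / "log.md"
    append_log(log_file, "Ingested: paper.md")
    append_log(log_file, "Query: attention mechanism")
    logged = log_file.read_text(encoding="utf-8")
```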

update_schema(section: str, content: str) -> str

Finds ## <section> in CLAUDE.md and replaces its content. If the section doesn't exist, appends it. This is how the schema accumulates project-specific conventions over time.
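The replace-or-append behaviour can be sketched with a regular expression that captures everything from the section heading up to the next `## ` heading. The function name `update_section` and the whitespace normalisation are assumptions; the sketch works on a string rather than on CLAUDE.md itself.

```python
import re


def update_section(document: str, section: str, content: str) -> str:
    """Replace the body of '## <section>', or append the section if absent."""
    heading = f"## {section}"
    # Match the heading line plus everything up to the next level-2 heading
    # or the end of the document. Deeper headings (###) stay inside the section.
    pattern = re.compile(
        rf"^{re.escape(heading)}[ \t]*\n.*?(?=^## |\Z)",
        re.MULTILINE | re.DOTALL,
    )
    block = f"{heading}\n\n{content.strip()}\n\n"
    if pattern.search(document):
        # A function replacement avoids backslash escapes in content being interpreted.
        updated = pattern.sub(lambda _match: block, document, count=1)
    else:
        updated = document.rstrip("\n") + "\n\n" + block
    return updated.rstrip("\n") + "\n"


schema = "# Schema\n\n## Naming\n\nold rule\n\n## Tags\n\nuse lowercase\n"
replaced = update_section(schema, "Naming", "new rule")
appended = update_section(schema, "Sources", "cite everything")
```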

re_ingest(source: str) -> str

Re-reads a raw source from disk and returns its content. Useful when the AI needs to reprocess a source with updated schema instructions.

Returns "Source not found: ..." if the file doesn't exist.

export() -> str

Concatenates overview, index, and all article pages into a single markdown file. Returns the output path.
