| name | graphanything |
|---|---|
| description | Turn anything into a navigable knowledge graph. 10 schema presets, 8 extractors (markdown / json-yaml / openapi / fstree / chatlog / LLM-entity / VLM-stub / noop), human-in-the-loop review. |
| trigger | /graphanything |
Build a knowledge graph from arbitrary inputs — markdown vaults, OpenAPI specs, contracts, meeting notes, chat logs, filesystem trees — by picking a schema preset, sampling, reviewing, and running. The graph comes out with full provenance: every node and edge knows who extracted it, when, from which file, and (for LLM extractions) what evidence span justified it.
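As a rough illustration of what such a provenance stamp might look like, the sketch below models it as a Python dataclass. The field names are the ones this document lists for `graph.json` (`extractor_id`, `extractor_version`, `extraction_time`, `source_hash`, `rationale`, `evidence_span`). The class name, the field types, the lowercase extractor ids, and every sample value are assumptions for illustration, not GraphAnything's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Provenance:
    """Hypothetical model of the provenance stamp on a node or edge."""
    extractor_id: str
    extractor_version: str
    extraction_time: str                  # assumed to be an ISO-8601 timestamp
    source_hash: str                      # assumed to identify the source file contents
    rationale: Optional[str] = None       # LLM-extracted edges only
    evidence_span: Optional[str] = None   # LLM-extracted edges only

    def is_llm_backed(self) -> bool:
        # In this sketch an item counts as LLM-backed if it carries an evidence span.
        return self.evidence_span is not None


# Illustrative values only; none of these come from a real run.
rule_based = Provenance("markdown", "0.1.0", "2024-01-01T00:00:00Z", "abc123")
llm_based = Provenance(
    "llm-entity", "0.1.0", "2024-01-01T00:00:00Z", "abc123",
    rationale="The note states that Alice leads the project.",
    evidence_span="Alice leads Project X",
)
```
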
LLM-driven extraction goes through GraphAnything's built-in OpenAI-compatible client, which talks to any chat.completions-shaped endpoint (vLLM serve, llama.cpp, Ollama, LM Studio, OpenAI, …). Configure it via `GA_API_BASE` / `GA_MODEL` / `GA_API_KEY`; the legacy `OPENAI_*` and `API_KEY` / `API_BASE` / `SUMMARY_MODEL_NAME` variables are also accepted.
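
A minimal sketch of how such a fallback lookup could work is shown below. It assumes that the `GA_*` variables take precedence over the legacy names, which this document does not state, and it leaves out the `OPENAI_*` fallbacks because their exact names are not listed here. The function names `resolve_llm_config` and `llm_available` are hypothetical.

```python
import os
from typing import Mapping, Optional

# Preferred variable first, then the legacy name (assumed precedence).
# OPENAI_* fallbacks are omitted: their exact names are not given in this document.
_FALLBACKS = {
    "api_base": ("GA_API_BASE", "API_BASE"),
    "model": ("GA_MODEL", "SUMMARY_MODEL_NAME"),
    "api_key": ("GA_API_KEY", "API_KEY"),
}


def resolve_llm_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the first non-empty value for each setting, or None if unset."""
    env = os.environ if env is None else env
    config = {}
    for setting, names in _FALLBACKS.items():
        config[setting] = next((env[n] for n in names if env.get(n)), None)
    return config


def llm_available(config: dict) -> bool:
    # Assumption: an endpoint and a model are the minimum needed,
    # since some local servers do not require an API key.
    return bool(config["api_base"] and config["model"])
```

A caller could check `llm_available(resolve_llm_config())` before attempting an LLM-gated operation and fall back to rule-based extractors when it returns `False`.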
Trigger on any of:
- "graph this vault / spec / folder"
- "build a knowledge graph of …"
- "turn these files into a graph"
- "extract entities and relations from …"
- The user types `/graphanything …`
The Skill exposes 17 MCP tools that map onto the same Session state machine the CLI uses. The basic loop is:
    open_session → propose_schema → sample → review
                   (or pick preset)   ↓        ↓
                               refine_schema  run → graph.json
                                               ↓
                                   update / versions / diff
Tools (17 total):
- `graphanything_open_session(inputs, preset?, extractor?)`: start a session. Returns `session_id`.
- `graphanything_list_presets()`: 10 built-in presets to choose from.
- `graphanything_list_extractors()`: 8 extractors (rule + LLM + VLM stub).
- `graphanything_propose_schema(session_id, n=3, llm=False)`: fill in an empty schema.
- `graphanything_refine_schema(session_id, instruction, llm=False)`: apply an instruction such as `add Foo entity`, `rename A to B`, …
- `graphanything_sample(session_id, n=5)`: extract from N inputs into pending. Returns a preview.
- `graphanything_review(session_id, actions[])`: accept_all / accept / reject / merge / rule.
- `graphanything_run(session_id, out_dir?)`: full extraction → `graph.json` + version snapshot.
- `graphanything_status(session_id)`: counts + schema + running cost.
- `graphanything_ask(session_id, question, llm=False)`: natural-language query over the graph.
- `graphanything_explain(session_id, target)`: full provenance for one node / edge.
- `graphanything_update(session_id, out_dir?)`: re-extract only inputs whose `source_hash` changed.
- `graphanything_versions(out_root?)`: list snapshots written by run / update.
- `graphanything_diff(v_old, v_new)`: diff two snapshots (added / removed / modified).
- `graphanything_federate(graphs, out, fuzzy?, llm?)`: merge multiple graphs into one universe.
- `graphanything_eval(session_id, llm=False, judge_n=20)`: coverage / dedup / per-extractor / sampled LLM-judge.
- `graphanything_render(session_id, fmt="mermaid")`: 9 formats: mermaid / html / svg / cypher / graphml / ascii / json / canvas / timeline.

- Ask the user ONE question to disambiguate scope, and only if it is ambiguous: "Should I graph as `<preset_a>` or `<preset_b>`?", picking from `graphanything_list_presets`. Skip this if the user already named a preset or it is obvious from the inputs (`*.md` vault → `obsidian-vault`, `openapi.yaml` → `openapi`, `*.jsonl` chat → `chat-log`).
- Call `graphanything_open_session(inputs=…, preset=…)`.
- If the schema is empty AND no preset matched, call `graphanything_propose_schema(session_id, llm=True)`. Otherwise skip this step, since the preset already filled the schema.
- Call `graphanything_sample(session_id, n=5)` to see a preview.
- Show the preview to the user (use `graphanything_render` with `fmt="mermaid"` for ≤60 nodes; otherwise just summarise types/counts). Ask if anything in the schema needs refining.
- If yes → `graphanything_refine_schema(...)`, then re-sample. If no → `graphanything_review(session_id, actions=[{"op":"accept_all"}])`, then `graphanything_run(session_id, out_dir=...)`.
- After `run`, paste the resulting `out_path` and a mermaid render of the final graph back to the user.
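
The steps above can be sketched as a small driver. This is only an illustration: `call_tool` is a hypothetical callable standing in for whatever mechanism invokes the MCP tools, the `session_id` response key is assumed, and the "schema is empty AND no preset matched" check is simplified to "no preset was given".

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical signature: (tool_name, arguments) -> response dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_workflow(call_tool: ToolCall, inputs: List[str],
                 preset: Optional[str], out_dir: str,
                 refinement: Optional[str] = None) -> Dict[str, Any]:
    """Drive one session through open -> sample -> review -> run."""
    args: Dict[str, Any] = {"inputs": inputs}
    if preset:
        args["preset"] = preset
    session = call_tool("graphanything_open_session", args)
    sid = session["session_id"]  # assumed response key

    if not preset:
        # Simplification: no preset given, so ask the LLM to propose a schema.
        call_tool("graphanything_propose_schema",
                  {"session_id": sid, "llm": True})

    call_tool("graphanything_sample", {"session_id": sid, "n": 5})

    if refinement:
        # The user asked for a schema change: refine, then re-sample.
        call_tool("graphanything_refine_schema",
                  {"session_id": sid, "instruction": refinement})
        call_tool("graphanything_sample", {"session_id": sid, "n": 5})

    call_tool("graphanything_review",
              {"session_id": sid, "actions": [{"op": "accept_all"}]})
    return call_tool("graphanything_run",
                     {"session_id": sid, "out_dir": out_dir})
```

In practice the preview and the refinement question go through the user between the sample and review calls; the sketch collapses that interaction into the optional `refinement` argument.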
Use `graphanything_list_presets` to see live state. The built-ins as of 0.1.0:

| Preset | Use when |
|---|---|
| `obsidian-vault` | A folder of `.md` notes (Obsidian / Notion exports) |
| `openapi` | An OpenAPI / Swagger spec |
| `papers` | Generic LLM-driven paper extraction |
| `codebase` | Source-code repo (LLM-driven) |
| `contracts` | Legal contracts (parties / clauses / amounts / governing law) |
| `pr-review` | A PR review thread (files / functions / reviewers / concerns) |
| `meeting` | Meeting notes / transcripts |
| `chat-log` | Slack / Claude Code `.jsonl` / Discord transcripts |
| `db-schema` | SQL DDL / migrations / ORM models |
| `fstree` | Plain filesystem tree exploration |
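
When the inputs make the preset obvious, the choice can be guessed from filenames. The sketch below encodes only the three hints this document mentions (`*.md` → `obsidian-vault`, `openapi.yaml` → `openapi`, `*.jsonl` → `chat-log`). The function name `guess_preset` and the rule "return nothing unless all inputs agree" are assumptions for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def guess_preset(paths: Iterable[str]) -> Optional[str]:
    """Return a preset name if every input points to the same one, else None."""
    guesses = set()
    for raw in paths:
        path = PurePosixPath(raw)
        name, suffix = path.name.lower(), path.suffix.lower()
        if name == "openapi.yaml":
            guesses.add("openapi")
        elif suffix == ".md":
            guesses.add("obsidian-vault")
        elif suffix == ".jsonl":
            guesses.add("chat-log")
        else:
            guesses.add(None)  # unknown input type
    if len(guesses) == 1:
        return guesses.pop()  # may be None if every input was unknown
    return None  # no inputs, or mixed inputs: ask the user instead
```

A `None` result is the case where asking the user the single disambiguating question makes sense.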

- Provenance: every node/edge in `graph.json` is stamped with `extractor_id`, `extractor_version`, `extraction_time`, `source_hash`. LLM-extracted edges additionally carry `rationale` and `evidence_span`. Use `graphanything_explain` to show this when the user asks "where did this come from?".
- LLM endpoint: LLM-gated calls go through any OpenAI-compatible chat-completions endpoint (`GA_API_BASE` / `GA_MODEL` / `GA_API_KEY`). If the env vars aren't set, LLM ops error out cleanly; fall back to rule-based presets / extractors.
- Cost control: `open_session` accepts `budget={max_tokens, max_dollars, max_api_calls}`. Honour it when the user mentions a cap.
- Big inputs: don't render >60 nodes as mermaid. Call `render` with `fmt="ascii"` or `fmt="html"` (returns a file path) instead. Pass `budget_tokens=N` to PageRank-prune large graphs to a target size.
- Multiple sessions: each call to `open_session` creates a new one. The `session_id` is the only handle you need; pass it to every other tool.
- Updating: when source files change, `graphanything_update` re-extracts only the changed ones and writes a new version snapshot; `graphanything_diff <v_old> <v_new>` shows what's different.
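
To make the incremental update concrete, here is a minimal sketch of hash-based change detection. It assumes `source_hash` is a SHA-256 digest of the file contents, which this document does not specify, and the helper names `content_hash` and `changed_inputs` are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_hash(data: bytes) -> str:
    # Assumption: source_hash is a SHA-256 digest of the file contents.
    return hashlib.sha256(data).hexdigest()


def changed_inputs(previous: Dict[str, str],
                   current: Dict[str, bytes]) -> List[str]:
    """List inputs that are new or whose hash differs from the last run.

    `previous` maps path -> hash recorded at the last run;
    `current` maps path -> current file contents.
    """
    return sorted(
        path for path, data in current.items()
        if previous.get(path) != content_hash(data)
    )
```

Only the paths returned by `changed_inputs` would need re-extraction; unchanged inputs keep their existing nodes and edges.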