name: graphanything
description: Turn anything into a navigable knowledge graph. 10 schema presets, 8 extractors (markdown / json-yaml / openapi / fstree / chatlog / LLM-entity / VLM-stub / noop), human-in-the-loop review.
trigger: /graphanything

/graphanything

Build a knowledge graph from arbitrary inputs — markdown vaults, OpenAPI specs, contracts, meeting notes, chat logs, filesystem trees — by picking a schema preset, sampling, reviewing, and running. The graph comes out with full provenance: every node and edge knows who extracted it, when, from which file, and (for LLM extractions) what evidence span justified it.

LLM-driven extraction goes through GraphAnything's built-in OpenAI-compatible client, which talks to any chat.completions-shaped endpoint (vLLM serve, llama.cpp, Ollama, LM Studio, OpenAI, …). Configure it via GA_API_BASE / GA_MODEL / GA_API_KEY; the legacy OPENAI_*, API_KEY, API_BASE, and SUMMARY_MODEL_NAME variables are also accepted.
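As a rough illustration of the configuration described above, the sketch below resolves the three settings from environment variables. It is not GraphAnything's actual client code. The lookup order (GA_* first, then the explicitly named legacy variables) is an assumption, the OPENAI_* family is left out because the text does not spell out its members, and `resolve_llm_config` and `llm_available` are hypothetical helper names.

```python
import os
from typing import Mapping, Optional

# Primary names and the spelled-out legacy names come from the text above.
# Checking the GA_* name first is an assumption about precedence.
_ENV_NAMES = {
    "api_base": ("GA_API_BASE", "API_BASE"),
    "model": ("GA_MODEL", "SUMMARY_MODEL_NAME"),
    "api_key": ("GA_API_KEY", "API_KEY"),
}


def resolve_llm_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the first non-empty value found for each setting, or None."""
    env = os.environ if env is None else env
    config = {}
    for setting, names in _ENV_NAMES.items():
        config[setting] = next((env[n] for n in names if env.get(n)), None)
    return config


def llm_available(config: dict) -> bool:
    """Assumption: an endpoint and a model are enough; local servers may need no key."""
    return bool(config["api_base"] and config["model"])
```

A caller could check `llm_available(resolve_llm_config())` before choosing `llm=True`, and stay with rule-based extractors when it returns False.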

When to invoke

Trigger on any of:

  • "graph this vault / spec / folder"
  • "build a knowledge graph of …"
  • "turn these files into a graph"
  • "extract entities and relations from …"
  • The user types /graphanything …

What graphanything is for

The Skill exposes 17 MCP tools that map onto the same Session state machine the CLI uses. The basic loop is:

open_session    →  propose_schema    →  sample        →  review
                       (or pick preset)        ↓               ↓
                                            refine_schema   run → graph.json
                                                              ↓
                                                       update / versions / diff

Tools (17 total):

  • graphanything_open_session(inputs, preset?, extractor?) — start. Returns session_id.
  • graphanything_list_presets() — 10 built-in presets to choose from.
  • graphanything_list_extractors() — 8 extractors (rule + LLM + VLM stub).
  • graphanything_propose_schema(session_id, n=3, llm=False) — fill in an empty schema.
  • graphanything_refine_schema(session_id, instruction, llm=False) — "add Foo entity", "rename A to B", …
  • graphanything_sample(session_id, n=5) — extract from N inputs into pending. Returns preview.
  • graphanything_review(session_id, actions[]) — accept_all / accept / reject / merge / rule.
  • graphanything_run(session_id, out_dir?) — full extraction → graph.json + version snapshot.
  • graphanything_status(session_id) — counts + schema + running cost.
  • graphanything_ask(session_id, question, llm=False) — natural-language query over the graph.
  • graphanything_explain(session_id, target) — full provenance for one node / edge.
  • graphanything_update(session_id, out_dir?) — re-extract only inputs whose source_hash changed.
  • graphanything_versions(out_root?) — list snapshots written by run / update.
  • graphanything_diff(v_old, v_new) — diff two snapshots (added / removed / modified).
  • graphanything_federate(graphs, out, fuzzy?, llm?) — merge multiple graphs into one universe.
  • graphanything_eval(session_id, llm=False, judge_n=20) — coverage / dedup / per-extractor / sampled LLM-judge.
  • graphanything_render(session_id, fmt="mermaid") — 9 formats: mermaid / html / svg / cypher / graphml / ascii / json / canvas / timeline.
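To make the added / removed / modified vocabulary of graphanything_diff concrete, here is a minimal sketch of that classification over two plain dictionaries keyed by node id. The real snapshot format is not described here, so the dictionary shape and the `diff_snapshots` name are assumptions made purely for illustration.

```python
from typing import Any, Dict, List


def diff_snapshots(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, List[str]]:
    """Classify keys as added, removed, or modified between two mappings."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    modified = sorted(k for k in new if k in old and old[k] != new[k])
    return {"added": added, "removed": removed, "modified": modified}


# Hypothetical node tables standing in for two version snapshots.
old_nodes = {"a": {"type": "Note"}, "b": {"type": "Note"}}
new_nodes = {"b": {"type": "Tag"}, "c": {"type": "Note"}}
result = diff_snapshots(old_nodes, new_nodes)
```

Here `result` lists "c" as added, "a" as removed, and "b" as modified, because its payload changed while its id stayed the same.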

What you must do when invoked

  1. Ask the user ONE question to disambiguate scope, only if it's ambiguous: "Should I graph as <preset_a> or <preset_b>?" — pick from graphanything_list_presets. Skip if the user already named a preset or it's obvious from the inputs (*.md vault → obsidian-vault, openapi.yaml → openapi, *.jsonl chat → chat-log).

  2. Call graphanything_open_session(inputs=…, preset=…).

  3. If the schema is empty AND no preset matched: call graphanything_propose_schema(session_id, llm=True). Otherwise skip — the preset already filled the schema.

  4. Call graphanything_sample(session_id, n=5) to see a preview.

  5. Show the preview to the user (use graphanything_render with fmt="mermaid" for ≤60 nodes; otherwise just summarise types/counts). Ask if anything in the schema needs refining.

  6. If yes → graphanything_refine_schema(...), then re-sample. If no → graphanything_review(session_id, actions=[{"op":"accept_all"}]), then graphanything_run(session_id, out_dir=...).

  7. After run, paste the resulting out_path and a mermaid render of the final graph back to the user.
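The steps above can be sketched as a small driver. This is an illustration under stated assumptions, not the Skill's implementation: `call` stands for any hypothetical function that invokes an MCP tool by name with keyword arguments and returns a dict, the `session_id` result key is assumed from the tool descriptions, and "no preset given" is used as a simplified stand-in for the "schema is empty AND no preset matched" condition in step 3.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical transport: call(tool_name, **kwargs) -> dict.
ToolCaller = Callable[..., Dict[str, Any]]

MERMAID_NODE_LIMIT = 60  # threshold taken from step 5


def preview_format(node_count: int) -> Optional[str]:
    """Return "mermaid" for small previews, None to signal a text summary."""
    return "mermaid" if node_count <= MERMAID_NODE_LIMIT else None


def run_workflow(call: ToolCaller, inputs: List[str], preset: Optional[str],
                 out_dir: str, refinements: Optional[List[str]] = None) -> Dict[str, Any]:
    """Walk steps 2-6: open, propose if needed, sample, refine, accept, run."""
    session = call("graphanything_open_session", inputs=inputs, preset=preset)
    sid = session["session_id"]  # assumed result key

    # Step 3 (simplified): only propose a schema when no preset filled it.
    if preset is None:
        call("graphanything_propose_schema", session_id=sid, llm=True)

    call("graphanything_sample", session_id=sid, n=5)

    # Step 6: each refinement is followed by a fresh sample.
    for instruction in refinements or []:
        call("graphanything_refine_schema", session_id=sid, instruction=instruction)
        call("graphanything_sample", session_id=sid, n=5)

    call("graphanything_review", session_id=sid, actions=[{"op": "accept_all"}])
    return call("graphanything_run", session_id=sid, out_dir=out_dir)
```

In a real conversation the refinement instructions come from the user after they have seen the preview, so the loop would be interleaved with questions rather than supplied up front.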

Available schema presets

Use graphanything_list_presets to see live state. The built-ins as of 0.1.0:

| Preset | Use when |
| --- | --- |
| obsidian-vault | A folder of .md notes (Obsidian / Notion exports) |
| openapi | An OpenAPI / Swagger spec |
| papers | Generic LLM-driven paper extraction |
| codebase | Source-code repo (LLM-driven) |
| contracts | Legal contracts (parties / clauses / amounts / governing law) |
| pr-review | A PR review thread (files / functions / reviewers / concerns) |
| meeting | Meeting notes / transcripts |
| chat-log | Slack / Claude Code .jsonl / Discord transcripts |
| db-schema | SQL DDL / migrations / ORM models |
| fstree | Plain filesystem tree exploration |
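The "obvious from the inputs" shortcut mentioned in the invocation steps could look roughly like the sketch below. It covers only the three mappings the text names (*.md, openapi.yaml, *.jsonl) and returns None whenever the inputs disagree or are unrecognised, which is the cue to ask the user. `guess_preset` is a hypothetical helper, not part of the tool set.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def guess_preset(paths: Iterable[str]) -> Optional[str]:
    """Return a preset name when every input points the same way, else None."""
    guesses = set()
    for raw in paths:
        path = PurePosixPath(raw)
        name, suffix = path.name.lower(), path.suffix.lower()
        if name == "openapi.yaml":
            guesses.add("openapi")
        elif suffix == ".md":
            guesses.add("obsidian-vault")
        elif suffix == ".jsonl":
            guesses.add("chat-log")
        else:
            guesses.add(None)  # unrecognised input: no confident guess
    if len(guesses) == 1:
        return next(iter(guesses))
    return None  # empty or mixed inputs: ask the user instead
```

Returning None for mixed inputs is a deliberately conservative choice: a wrong preset costs a full sample-and-review cycle, while a single clarifying question is cheap.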

Notes for Claude

  • Provenance: every node/edge in graph.json is stamped with extractor_id, extractor_version, extraction_time, source_hash. LLM-extracted edges additionally carry rationale and evidence_span. Use graphanything_explain to show this when the user asks "where did this come from?".

  • LLM endpoint: LLM-gated calls go through any OpenAI-compatible chat-completions endpoint (GA_API_BASE / GA_MODEL / GA_API_KEY). If the env vars aren't set, LLM ops error out cleanly — fall back to rule-based presets / extractors.

  • Cost control: open_session accepts budget={max_tokens, max_dollars, max_api_calls}. Honour it when the user mentions a cap.

  • Big inputs: don't render >60 nodes as mermaid — call render with fmt="ascii" or fmt="html" (returns a file path) instead. Pass budget_tokens=N to PageRank-prune large graphs to a target size.

  • Multiple sessions: each call to open_session creates a new one. The session_id is the only handle you need — pass it to every other tool.

  • Updating: when source files change, graphanything_update re-extracts only the changed ones and writes a new version snapshot; graphanything_diff(v_old, v_new) then shows what changed between the two snapshots.
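The incremental update described in the last note depends on comparing each input's current hash with the source_hash recorded at extraction time. A minimal sketch of that comparison follows. SHA-256 over the raw file bytes is an assumption (the text does not say how source_hash is computed), and `content_hash` and `changed_inputs` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def content_hash(path: Path) -> str:
    """Hash file bytes; SHA-256 is an assumption, not the tool's documented choice."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_inputs(paths: Iterable[Path], known: Dict[str, str]) -> List[Path]:
    """Return paths whose current hash differs from, or is missing in, `known`."""
    return [p for p in paths if known.get(str(p)) != content_hash(p)]
```

Files that were never seen before have no entry in `known`, so they count as changed and would be extracted along with the edited ones.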