🤖 GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum

📢 News

Date	Update
🚀 2026.03	GraphWalker achieves SOTA on CWQ (79.6 EM) and WebQSP (91.5 EM) among all agentic KGQA methods
🔗 2026.03	Implemented SFT with LLaMA-Factory and RL training with the Slime framework

🧭 The Core Concept

Existing KGQA paradigms are limited by static reasoning scopes and a lack of autonomous, global-scale exploration, leading to poor generalization on complex structures. GraphWalker overcomes these by establishing a robust exploration and error-recovery prior through a two-stage synthetic curriculum, unlocking a superior reasoning ceiling via RL on global knowledge graphs.

✨ Key Innovations

1 · Synthetic Trajectory Synthesis

Component	Role
GraphSynth	Covers complex topological structures — Composition, Conjunction — to teach structured graph navigation
GraphRoll	Equips the agent with self-correction and backtracking, conditioned on live environment feedback

2 · Agentic KG Interaction Loop

At each turn $t$, the agent emits a fully structured response:

┌──────────────────┬──────────────────────────────────────────────────┐
│  <think>         │  Internal reasoning & strategy planning          │
│  <kg-query>      │  Precise tool calls: get_relations, get_triples  │
│  <information>   │  Real-time feedback from the Virtuoso KG server  │
│  <answer>        │  Final grounded prediction                       │
└──────────────────┴──────────────────────────────────────────────────┘

🔭 System Pipeline

Full Pipeline: Data Construction → Two-Stage SFT → RL Optimization

📊 Main Experimental Results

GraphWalker achieves SOTA across all agentic KGQA methods. Bold = best, underline = second-best.

Method	Backbone	CWQ EM	CWQ F1	WebQSP EM	WebQSP F1
🔹 Vanilla LLMs (IO Prompt)
IO Prompt	Qwen2.5-3B-Instruct	22.0	17.7	44.6	30.3
IO Prompt	Qwen2.5-7B-Instruct	25.7	20.7	50.9	33.2
IO Prompt	GPT-4o-mini	45.5	33.6	47.1	39.3
IO Prompt	DeepSeek-V3.2	50.1	43.5	63.8	55.7
🔹 Agentic KGQA Methods
RoG	LLaMA-2-7B-Instruct	62.6	56.2	85.7	70.8
ToG	GPT-4	69.5	—	81.9	—
ToG-2.0	GPT-3.5	68.9	65.8	77.8	74.5
GoG	GPT-4	75.2	—	84.4	—
KBQA-o1	LLaMA3.1-8B-Instruct	—	—	75.8	82.1
KG-Agent	LLaMA2-7B-Instruct	72.2	69.8	83.3	81.0
†KG-R1	Qwen2.5-3B-Instruct	66.8	61.7	82.1	78.9
🔸 GraphWalker (Our Method)
†Vanilla Agent	Qwen2.5-7B-Instruct	40.7	33.2	68.4	66.1
†Vanilla Agent	GPT-4o-mini	63.4	60.3	79.6	70.6
†Vanilla Agent	DeepSeek-V3.2	69.8	63.5	76.7	71.8
GraphWalker-7B-SFT	Qwen2.5-7B-Instruct	68.3	63.2	82.0	79.1
GraphWalker-3B-SFT-RL	Qwen2.5-3B-Instruct	70.9	65.2	83.5	81.7
GraphWalker-8B-SFT-RL	LLaMA3.1-8B-Instruct	78.5	69.6	88.2	84.5
GraphWalker-7B-SFT-RL	Qwen2.5-7B-Instruct	79.6	74.2	91.5	88.6

† denotes models evaluated under our framework with full global KG access.

🛠️ Installation & Setup

1 · Environment

git clone https://github.com/GraphWalker/GraphWalker.git
cd GraphWalker
conda create -n graphwalker python=3.9 -y
conda activate graphwalker
pip install -r requirements.txt

2 · Virtuoso KG Server

Follow virtuoso-opensource/README.md to load Freebase and verify your SPARQL_ENDPOINT is accessible before running any evaluation.

🚀 Quick Start

1 · Evaluation (vLLM)

vllm serve "/path/to/GraphWalker-7B" \
    --host 0.0.0.0 --port 22240 \
    --served-model-name graphwalker-7b \
    --gpu-memory-utilization 0.9 --dtype auto \
    --chat-template "/path/to/chat_template.jinja"

bash run_eval_remote_vllm.sh

💡 Download the pretrained model from 🤗 HuggingFace

2 · Data Generation

# Step 1 — Random walk path generation
python kgqa_agent/scripts/random_walk.py \
    --config kgqa_agent/configs/random_walk/3-5hop/random_walk_10k.yaml

# Step 2 — QA / Info synthesis / Trajectory generation
bash pipeline_scripts/qa_gen.sh
bash pipeline_scripts/info_syn.sh
bash pipeline_scripts/traj_gen.sh

3 · Training

SFT — Use LLaMA-Factory for stage-wise fine-tuning.

RL (GRPO) — Run via the Slime framework:

cd slime/examples/graphwalker
bash examples/graphwalker/run_qwen2.5_7B_sft.sh

📦 Data Formats

Trajectory Format

{
  "raw_question": "What state was Barack Obama born in?",
  "steps": [
    {
      "step_index": 0,
      "think": "I need to find where Barack Obama was born...",
      "action": "get_relations(\"Barack Obama\")",
      "information": ["people.person.place_of_birth", "..."]
    }
  ]
}

📄 Citation

If you find GraphWalker useful in your research, please consider citing:

@misc{xu2026graphwalkeragenticknowledgegraph,
      title={GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum}, 
      author={Shuwen Xu and Yao Xu and Jiaxiang Liu and Chenhao Yuan and Wenshuo Peng and Jun Zhao and Kang Liu},
      year={2026},
      eprint={2603.28533},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.28533}, 
}

📖 Preprint available on arXiv:2603.28533

_{Made with ❤️ ·
Paper ·
Model}

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
assets		assets
datasets		datasets
scripts		scripts
slime		slime
virtuoso-opensource		virtuoso-opensource
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum

📢 News

📑 Table of Contents

🧭 The Core Concept

✨ Key Innovations

1 · Synthetic Trajectory Synthesis

2 · Agentic KG Interaction Loop

🔭 System Pipeline

📊 Main Experimental Results

🛠️ Installation & Setup

1 · Environment

2 · Virtuoso KG Server

🚀 Quick Start

1 · Evaluation (vLLM)

2 · Data Generation

3 · Training

📦 Data Formats

Trajectory Format

📄 Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

🤖 GraphWalker: Agentic Knowledge Graph Question Answering via Synthetic Trajectory Curriculum

📢 News

📑 Table of Contents

🧭 The Core Concept

✨ Key Innovations

1 · Synthetic Trajectory Synthesis

2 · Agentic KG Interaction Loop

🔭 System Pipeline

📊 Main Experimental Results

🛠️ Installation & Setup

1 · Environment

2 · Virtuoso KG Server

🚀 Quick Start

1 · Evaluation (vLLM)

2 · Data Generation

3 · Training

📦 Data Formats

Trajectory Format

📄 Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages