A Retrieval-Augmented Generation system that replaces vector search with a structured knowledge graph, enabling relationship traversal, multi-hop reasoning, and precise aggregation over documents.
Traditional RAG retrieves document chunks by vector similarity and hands them to an LLM. This breaks down in several ways:
- Lost Relationships — Chunking destroys the connections between entities. Questions like "How is X connected to Y?" cannot be answered when the relationship lives across chunk boundaries.
- No Multi-Hop Reasoning — Cannot chain facts. If A relates to B and B relates to C, a vector search for A never surfaces C.
- Vague Matching — Semantic similarity fails when query phrasing differs from document phrasing. "Handle large files efficiently" may never match text about "Node.js Streams".
- Cannot Aggregate — Ask "How many hotels have a swimming pool?" across 100 FAQ documents and RAG retrieves a few similar chunks — it cannot count, filter, or reason globally. It guesses.
Knowledge Graph RAG solves this by extracting a structured graph from the document, storing it in Neo4j, and querying it with precise Cypher — enabling exact relationship traversal, multi-hop reasoning, and aggregation.
Neo4j AuraDB showing extracted entities and relationships.
Full graph visualisation via Cypher (MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 40) — entities as nodes, relationships as directed edges.
The document is loaded and chunked, then passed to LLMGraphTransformer (GPT-4.1-mini).
The LLM dynamically discovers entity types and relationship types from the text — no schema is predefined.
Entities and relationships are then stored in Neo4j, with a consistent base label (__Entity__) to ensure uniform querying and traversal.
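The extraction step above can be sketched as follows. This is a minimal wiring sketch, not the exact notebook code: the file path and chunking parameters are assumptions, and it requires `OPENAI_API_KEY` plus Neo4j credentials in the environment.

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_neo4j import Neo4jGraph

# Load and chunk the source document (path and sizes are placeholders).
loader = PyPDFLoader("document.pdf")
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(loader.load())

# The LLM discovers entity and relationship types itself -- no schema is passed.
llm = ChatOpenAI(model="gpt-4.1-mini", temperature=0)
transformer = LLMGraphTransformer(llm=llm)
graph_documents = transformer.convert_to_graph_documents(chunks)

# Store in Neo4j; baseEntityLabel=True adds the shared __Entity__ label
# so every node can be queried and traversed uniformly.
graph = Neo4jGraph()  # reads NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD
graph.add_graph_documents(
    graph_documents,
    baseEntityLabel=True,
    include_source=True,  # link each entity back to its source chunk
)
```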
When a user query is received:
- A full-text search is used to identify relevant entities from the graph
- These entities act as anchors for retrieval
- The system explores their connected nodes and relationships (typically 1–2 hops away)
- This produces structured relationship triples in the form:
source --[RELATION]--> target
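Rendering traversal results into that triple form can be done with a small helper like the hypothetical one below; the row field names assume a Cypher `RETURN` clause aliasing them as `source`, `relation`, and `target`.

```python
def format_triples(rows):
    """Render graph traversal rows as 'source --[RELATION]--> target' strings.

    Each row is assumed to be a dict shaped by a Cypher query such as:
        MATCH (a:__Entity__)-[r]-(b)
        WHERE a.id IN $anchors
        RETURN a.id AS source, type(r) AS relation, b.id AS target
    """
    return [f"{r['source']} --[{r['relation']}]--> {r['target']}" for r in rows]

rows = [{"source": "Node.js", "relation": "USES", "target": "Streams"}]
print(format_triples(rows)[0])  # Node.js --[USES]--> Streams
```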
For queries that require more complex logic (such as counts, comparisons, or filtering), an LLM generates a Cypher query.
This query is strictly constrained to use only the verified entities identified earlier, ensuring accuracy and preventing hallucination.
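One way to enforce that constraint is to inject the verified entity ids directly into the Cypher-generation prompt. The template wording and helper name below are illustrative assumptions, not the project's exact prompt.

```python
CYPHER_PROMPT = """You are generating a Cypher query for a Neo4j graph.
All entity nodes carry the label __Entity__ with an `id` property.
Use ONLY these verified entity ids -- do not invent others: {entities}

Question: {question}
Cypher:"""


def build_cypher_prompt(question, verified_entities):
    """Fill the template with the grounded entity list from full-text search."""
    return CYPHER_PROMPT.format(
        entities=", ".join(verified_entities), question=question
    )
```

Because the LLM only ever sees entity ids that were confirmed to exist in the graph, it cannot anchor the query on a hallucinated node.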
All retrieved information is merged into a clean, deduplicated set of graph relationships.
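The merge step can be as simple as an order-preserving deduplication over the triple strings coming from both retrieval paths (a sketch; the function name is an assumption):

```python
def merge_relationships(*sources):
    """Merge triples from multiple retrieval paths (traversal, Cypher),
    dropping duplicates while preserving first-seen order."""
    seen, merged = set(), []
    for triples in sources:
        for t in triples:
            if t not in seen:
                seen.add(t)
                merged.append(t)
    return merged
```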
The collected graph relationships are passed to an LLM as structured context.
The LLM:
- synthesizes a final answer
- relies only on the provided graph data
- avoids introducing any external or unsupported information
This results in answers that are:
- grounded
- explainable
- logically consistent
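A synthesis prompt with those properties might look like the sketch below (the exact wording is an assumption; the key point is that the instructions forbid knowledge outside the supplied triples):

```python
SYNTHESIS_PROMPT = """Answer the question using ONLY the graph facts below.
If the facts are insufficient, say so; do not add outside knowledge.

Graph facts:
{triples}

Question: {question}
Answer:"""


def build_synthesis_prompt(question, triples):
    """Pack the deduplicated triples into the grounding prompt."""
    return SYNTHESIS_PROMPT.format(
        triples="\n".join(triples), question=question
    )
```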
Pure KG-RAG answers relationship questions precisely but struggles with questions that need raw document prose — definitions, examples, explanations that don't compress cleanly into triples.
The hybrid version adds a vector layer over raw document chunks alongside the graph. Every query now combines:
- structured graph relationships
- semantically relevant document passages
The synthesis LLM receives both and produces a more complete answer — combining structured reasoning with natural language context.
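The combination step can be sketched as below. The vector search itself is abstracted behind `retrieve_passages` (a hypothetical callable — in this stack it would wrap something like `Neo4jVector.similarity_search` over the raw chunks), so the function shows only how the two context layers are joined.

```python
def hybrid_context(question, graph_triples, retrieve_passages, k=4):
    """Build the synthesis context from both retrieval layers.

    `retrieve_passages(question, k)` is expected to return the k most
    semantically similar raw-document passages for the question.
    """
    passages = retrieve_passages(question, k)
    return (
        "Graph facts:\n" + "\n".join(graph_triples)
        + "\n\nDocument passages:\n" + "\n\n".join(passages)
    )
```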
Benefits
- Answers relationship, multi-hop, and aggregation questions that pure vector search cannot handle
- Entity grounding is exact — the LLM operates only on verified graph nodes
- No dependency on embeddings for core reasoning
Tradeoffs
- Expensive indexing — graph construction requires LLM processing over all chunks
- Infrastructure overhead — requires Neo4j alongside vector storage
- Overkill for simple Q&A — adds latency and cost where simple retrieval is enough
- Slower updates — new data requires graph extraction, not just re-embedding
| Component | Technology |
|---|---|
| LLM | OpenAI GPT-4.1-mini |
| Embeddings | OpenAI text-embedding-3-small |
| Graph Database | Neo4j AuraDB (cloud) |
| Entity Grounding | Neo4j Lucene Full-Text Index |
| Vector Store | Neo4j Vector Index (document chunks) |
| Graph Extraction | LangChain LLMGraphTransformer |
| Framework | LangChain, langchain-neo4j, langchain-experimental |
pip install langchain langchain-experimental langchain-openai \
    langchain-neo4j langchain-text-splitters pypdf python-dotenv

Create a .env file in the project root:
OPENAI_API_KEY=your-openai-api-key
NEO4J_PASSWORD=your-neo4j-password
Set your Neo4j connection details in the notebook:
NEO4J_URI = "neo4j+s://<your-instance>.databases.neo4j.io"
NEO4J_USERNAME = "<your-username>"
NEO4J_DATABASE = "<your-database>"
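With those values set, the connection can be smoke-tested like this (a sketch; `Neo4jGraph` accepts explicit connection arguments, and the password is read from the `.env` file via `python-dotenv`):

```python
import os
from dotenv import load_dotenv
from langchain_neo4j import Neo4jGraph

load_dotenv()  # loads OPENAI_API_KEY and NEO4J_PASSWORD from .env

graph = Neo4jGraph(
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=os.environ["NEO4J_PASSWORD"],
    database=NEO4J_DATABASE,
)
graph.query("RETURN 1 AS ok")  # raises if the instance is unreachable
```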

