Agentic RAG — LangGraph + FAISS

A lightweight Agentic RAG (Retrieval-Augmented Generation) system built from scratch with LangGraph and FAISS.

Instead of a fixed retrieve → generate pipeline, this agent reasons about its own retrieval:

User Query
    ↓
Query Rewriter   — makes vague questions retrieval-friendly
    ↓
Retriever        — fetches top-k chunks from FAISS
    ↓
Relevance Grader — keeps only useful chunks
    ↓ (relevant found)         ↓ (none found, retry allowed)
Generator                  Query Rewriter  ←─ loop
    ↓
Final Answer

Features

| Feature | Detail |
|---|---|
| Self-correcting retrieval | If the first retrieval returns irrelevant chunks, the agent rewrites the query and tries again (max 2 retries) |
| Conversation memory | History is passed to every node so follow-up questions work naturally |
| Relevance grading | An LLM grades each chunk before it reaches the generator, so no noisy context gets through |
| Local embeddings | `all-MiniLM-L6-v2` via HuggingFace; no embedding API calls needed |
| Swappable LLM | Works with OpenAI, Groq (free), or Ollama (fully local) |
| Gradio UI | One-command chat interface at `http://localhost:7860` |
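To make the relevance-grading feature concrete, here is a minimal sketch of the filtering step. The names `filter_relevant` and `keyword_grader` are hypothetical and not taken from this repository; the keyword check is only a stand-in for the LLM grader so the example runs without an API key.

```python
from typing import Callable, List


def filter_relevant(
    question: str,
    chunks: List[str],
    grade: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the chunks that the grader marks as relevant to the question."""
    return [chunk for chunk in chunks if grade(question, chunk)]


def keyword_grader(question: str, chunk: str) -> bool:
    """Stand-in for the LLM grader (assumption): a chunk counts as relevant
    if it contains any question word longer than three characters."""
    words = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    text = chunk.lower()
    return any(word in text for word in words)


chunks = [
    "FAISS is a library for vector similarity search.",
    "Gradio builds simple web UIs for Python functions.",
]
kept = filter_relevant("What does FAISS do?", chunks, keyword_grader)
print(kept)
```

In the real agent the `grade` callable would wrap an LLM call; if `kept` comes back empty, that is the signal for the retry loop to rewrite the query.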

Quick Start

1. Clone and install

git clone https://github.com/<your-username>/agentic-rag-agent.git
cd agentic-rag-agent

python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Set your API key

cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (or Groq/Ollama — see .env.example)

3. Add your documents

Drop .pdf, .txt, or .md files into data/sample_docs/. Two sample documents about RAG and LangGraph are already included for testing.

4. Build the vector index

python ingest.py

This chunks your documents, embeds them, and saves a FAISS index to data/faiss_index/. You only need to run it once; after adding new documents, run it again with --force to rebuild the index.
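The chunking step can be pictured as a sliding window with overlap. The sketch below is an illustration only: the function name `chunk_tokens` is hypothetical, and treating pre-split strings as "tokens" is an assumption, since this README does not say how ingest.py tokenizes text.

```python
from typing import List


def chunk_tokens(
    tokens: List[str], chunk_size: int = 512, overlap: int = 64
) -> List[List[str]]:
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks


tokens = [f"t{i}" for i in range(10)]
chunks = chunk_tokens(tokens, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

With the small values used here, each chunk repeats the last token of the previous one, which is what keeps a sentence that straddles a boundary retrievable from either side.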

5. Start the app

python app.py

Open http://localhost:7860 and start chatting.

Project Structure

agentic-rag-agent/
│
├── app.py                  # Gradio chat UI — entry point
├── ingest.py               # CLI script to build the FAISS index
├── requirements.txt
├── .env.example            # Copy to .env and fill in your keys
│
├── src/
│   ├── rag_agent.py        # LangGraph graph definition + node functions
│   ├── retriever.py        # FAISS ingestion & retrieval
│   ├── state.py            # RAGState TypedDict
│   └── prompts.py          # All prompt templates in one place
│
└── data/
    ├── sample_docs/        # Put your source documents here
    │   ├── intro_to_rag.md
    │   └── langgraph_guide.md
    └── faiss_index/        # Auto-generated by ingest.py (git-ignored)

LLM Provider Options

Edit .env to switch providers — no code changes needed.

OpenAI (default)

OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

Groq (free tier, fast)

OPENAI_API_KEY=gsk_...
OPENAI_BASE_URL=https://api.groq.com/openai/v1
LLM_MODEL=llama-3.3-70b-versatile

Ollama (fully local, no API key)

OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
LLM_MODEL=qwen2.5:7b
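All three providers are reached through the same three variables, which is why no code changes are needed. A rough sketch of how that could be read in code follows; `llm_settings` is a hypothetical helper, not a function from this repository, and the returned dictionary is simply the kind of keyword set an OpenAI-compatible client would typically accept.

```python
import os
from typing import Dict, Mapping, Optional


def llm_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect provider settings from environment-style variables."""
    env = os.environ if env is None else env
    settings = {
        "model": env.get("LLM_MODEL", "gpt-4o-mini"),
        "api_key": env.get("OPENAI_API_KEY", ""),
    }
    # Only Groq and Ollama override the endpoint; OpenAI uses the client default.
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        settings["base_url"] = base_url
    return settings


ollama = llm_settings(
    {
        "OPENAI_API_KEY": "ollama",
        "OPENAI_BASE_URL": "http://localhost:11434/v1",
        "LLM_MODEL": "qwen2.5:7b",
    }
)
print(ollama)
```

Passing a plain mapping instead of reading `os.environ` directly makes the helper easy to try out without touching your real .env file.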

How the LangGraph Agent Works

The agent is defined in src/rag_agent.py as a StateGraph.

             ┌───────────────┐
  START ───► │ query_rewriter│ ◄──── (retry loop)
             └──────┬────────┘
                    │
             ┌──────▼───────┐
             │   retriever  │
             └──────┬───────┘
                    │
             ┌──────▼───────┐
             │ relevance    │
             │   check      │
             └──────┬───────┘
                    │
         ┌──────────┴──────────┐
   relevant?                not relevant?
         │                         │
  ┌──────▼───────┐          (rewrites < 2)
  │  generator   │                 │
  └──────┬───────┘         back to rewriter
         │
        END

Each node is a plain Python function that receives the shared RAGState and returns a partial update. Conditional routing (route_after_relevance) decides whether to generate or retry.
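The wiring described above could look roughly like the sketch below. It is not the code in src/rag_agent.py: the state fields (`relevant_chunks`, `retries`, and so on), the `build_graph` helper, and the route labels are assumptions made for illustration. The fallback to the generator once retries are exhausted is also an assumption, since the diagram does not show that branch. LangGraph is imported lazily so the routing logic can be run on its own.

```python
from typing import List, TypedDict

MAX_RETRIES = 2  # matches the "max 2 retries" limit described in this README


class RAGState(TypedDict, total=False):
    """Shared state passed between nodes (field names are hypothetical)."""
    question: str
    rewritten_query: str
    relevant_chunks: List[str]
    retries: int  # assumed to be incremented by whichever node handles a retry
    answer: str


def route_after_relevance(state: RAGState) -> str:
    """Return the name of the next node: generate, or loop back to rewrite."""
    if state.get("relevant_chunks"):
        return "generator"
    if state.get("retries", 0) < MAX_RETRIES:
        return "query_rewriter"
    # Assumption: with retries exhausted, let the generator produce a fallback.
    return "generator"


def build_graph(query_rewriter, retriever, relevance_check, generator):
    """Wire four node functions into a compiled graph.

    Each argument is a plain function taking RAGState and returning a
    partial update, as described above.
    """
    from langgraph.graph import END, START, StateGraph  # lazy import

    graph = StateGraph(RAGState)
    graph.add_node("query_rewriter", query_rewriter)
    graph.add_node("retriever", retriever)
    graph.add_node("relevance_check", relevance_check)
    graph.add_node("generator", generator)

    graph.add_edge(START, "query_rewriter")
    graph.add_edge("query_rewriter", "retriever")
    graph.add_edge("retriever", "relevance_check")
    graph.add_conditional_edges(
        "relevance_check",
        route_after_relevance,
        {"generator": "generator", "query_rewriter": "query_rewriter"},
    )
    graph.add_edge("generator", END)
    return graph.compile()


print(route_after_relevance({"relevant_chunks": [], "retries": 1}))
```

Keeping the routing decision in a small pure function means the retry policy can be unit-tested without building the graph or calling an LLM.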

Configuration

All tunable parameters live in .env:

| Variable | Default | Description |
|---|---|---|
| `LLM_MODEL` | `gpt-4o-mini` | LLM model name |
| `EMBED_MODEL` | `all-MiniLM-L6-v2` | HuggingFace embedding model |
| `TOP_K` | `4` | Number of chunks to retrieve |
| `CHUNK_SIZE` | `512` | Tokens per chunk |
| `CHUNK_OVERLAP` | `64` | Overlap between consecutive chunks |
| `DATA_DIR` | `data/sample_docs` | Directory of source documents |
| `INDEX_DIR` | `data/faiss_index` | Where the FAISS index is saved |
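As an illustration of how these variables and their defaults might be consumed, here is a small sketch. The `Settings` dataclass and `load_settings` function are hypothetical names, and the sketch assumes the values from .env have already been loaded into the process environment by whatever mechanism the project uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    llm_model: str
    embed_model: str
    top_k: int
    chunk_size: int
    chunk_overlap: int
    data_dir: str
    index_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the tunable parameters, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        embed_model=env.get("EMBED_MODEL", "all-MiniLM-L6-v2"),
        top_k=int(env.get("TOP_K", "4")),
        chunk_size=int(env.get("CHUNK_SIZE", "512")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "64")),
        data_dir=env.get("DATA_DIR", "data/sample_docs"),
        index_dir=env.get("INDEX_DIR", "data/faiss_index"),
    )


defaults = load_settings({})
custom = load_settings({"TOP_K": "8", "CHUNK_SIZE": "256"})
print(defaults)
print(custom.top_k, custom.chunk_size)
```

Converting the numeric values to `int` at load time surfaces a malformed .env entry immediately rather than deep inside retrieval.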

Tech Stack

LangGraph — graph-based agent orchestration
LangChain — LLM abstraction, document loaders
FAISS — local vector similarity search
HuggingFace Sentence Transformers — local embeddings
Gradio — chat UI

License

MIT — free to use, modify, and distribute.

Built by Vamshi as part of a hands-on AI engineering portfolio.
