VamshiKrish2825/Agentic-rag-agent

Agentic RAG — LangGraph + FAISS

A lightweight Agentic RAG (Retrieval-Augmented Generation) system built from scratch with LangGraph and FAISS.

Instead of a fixed retrieve → generate pipeline, this agent reasons about its own retrieval:

User Query
    ↓
Query Rewriter   — makes vague questions retrieval-friendly
    ↓
Retriever        — fetches top-k chunks from FAISS
    ↓
Relevance Grader — keeps only useful chunks
    ↓ (relevant found)         ↓ (none found, retry allowed)
Generator                  Query Rewriter  ←─ loop
    ↓
Final Answer
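
The flow above can be sketched in plain Python, without LangGraph, to show the control logic. This is an illustrative sketch under stated assumptions, not the repository's code: the four callables are hypothetical stand-ins for the LLM and FAISS calls, and `MAX_REWRITES = 2` mirrors the retry limit listed under Features. Generating an answer once retries are exhausted is also an assumption.

```python
from typing import Callable, List

MAX_REWRITES = 2  # assumption: mirrors the "max 2 retries" limit in this README


def run_agent(
    question: str,
    rewrite: Callable[[str, int], str],
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Rewrite -> retrieve -> grade, looping back to the rewriter on a miss."""
    rewrites = 0
    while True:
        query = rewrite(question, rewrites)
        chunks = retrieve(query)
        relevant = [c for c in chunks if is_relevant(question, c)]
        if relevant or rewrites >= MAX_REWRITES:
            # Generate from whatever survived grading (possibly nothing).
            return generate(question, relevant)
        rewrites += 1


# Toy stand-ins (hypothetical) so the sketch runs without an LLM or an index.
DOCS = {
    "rag": "RAG combines retrieval with generation.",
    "langgraph": "LangGraph models agents as state graphs.",
}


def toy_rewrite(question: str, attempt: int) -> str:
    # The first attempt passes the question through; retries use a keyword.
    return question if attempt == 0 else "rag"


def toy_retrieve(query: str) -> List[str]:
    return [text for key, text in DOCS.items() if key in query.lower()]


def toy_is_relevant(question: str, chunk: str) -> bool:
    return "retrieval" in chunk


def toy_generate(question: str, chunks: List[str]) -> str:
    return chunks[0] if chunks else "I could not find relevant context."


answer = run_agent("what is it?", toy_rewrite, toy_retrieve,
                   toy_is_relevant, toy_generate)
print(answer)  # RAG combines retrieval with generation.
```

The vague first query retrieves nothing, so the loop asks the rewriter for a second attempt, which then finds a chunk that passes grading.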

Features

| Feature | Detail |
| --- | --- |
| Self-correcting retrieval | If the first retrieval returns irrelevant chunks, the agent rewrites the query and tries again (max 2 retries) |
| Conversation memory | History is passed to every node so follow-up questions work naturally |
| Relevance grading | An LLM grades each chunk before it reaches the generator, keeping noisy context out |
| Local embeddings | all-MiniLM-L6-v2 via HuggingFace; no embedding API calls needed |
| Swappable LLM | Works with OpenAI, Groq (free), or Ollama (fully local) |
| Gradio UI | One-command chat interface at http://localhost:7860 |

Quick Start

1. Clone and install

git clone https://github.com/<your-username>/agentic-rag-agent.git
cd agentic-rag-agent

python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Set your API key

cp .env.example .env
# Edit .env and add your OPENAI_API_KEY (or Groq/Ollama — see .env.example)

3. Add your documents

Drop .pdf, .txt, or .md files into data/sample_docs/. Two sample documents about RAG and LangGraph are already included for testing.

4. Build the vector index

python ingest.py

This chunks your documents, embeds them, and saves a FAISS index to data/faiss_index/. You only need to run it once; after adding new documents, run it again with --force to rebuild the index.
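
To illustrate how the CHUNK_SIZE and CHUNK_OVERLAP settings (see Configuration) interact during chunking, here is a minimal sliding-window chunker. It is a sketch, not the code in ingest.py: the function name `chunk_tokens` is hypothetical, and a list of placeholder words stands in for real tokenizer output.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 512, overlap: int = 64) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # how far the window advances each time
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Placeholder "tokens"; a real pipeline would use a tokenizer (assumption).
words = [f"w{i}" for i in range(1000)]
chunks = chunk_tokens(words)
print(len(chunks), [len(c) for c in chunks])  # 3 [512, 512, 104]
```

With the default values the window advances 448 tokens at a time, so each chunk repeats the last 64 tokens of the previous one. That overlap helps keep sentences that straddle a boundary retrievable from at least one chunk.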

5. Start the app

python app.py

Open http://localhost:7860 and start chatting.


Project Structure

agentic-rag-agent/
│
├── app.py                  # Gradio chat UI — entry point
├── ingest.py               # CLI script to build the FAISS index
├── requirements.txt
├── .env.example            # Copy to .env and fill in your keys
│
├── src/
│   ├── rag_agent.py        # LangGraph graph definition + node functions
│   ├── retriever.py        # FAISS ingestion & retrieval
│   ├── state.py            # RAGState TypedDict
│   └── prompts.py          # All prompt templates in one place
│
└── data/
    ├── sample_docs/        # Put your source documents here
    │   ├── intro_to_rag.md
    │   └── langgraph_guide.md
    └── faiss_index/        # Auto-generated by ingest.py (git-ignored)

LLM Provider Options

Edit .env to switch providers — no code changes needed.

OpenAI (default)

OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini

Groq (free tier, fast)

OPENAI_API_KEY=gsk_...
OPENAI_BASE_URL=https://api.groq.com/openai/v1
LLM_MODEL=llama-3.3-70b-versatile

Ollama (fully local, no API key)

OPENAI_API_KEY=ollama
OPENAI_BASE_URL=http://localhost:11434/v1
LLM_MODEL=qwen2.5:7b
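
All three configurations use the same variables because Groq and Ollama expose OpenAI-compatible endpoints. The sketch below shows one way an application could turn those variables into client settings. It is an assumption about the approach, not the repository's code: the function name `llm_settings` is hypothetical, and the README does not show how the project reads these values.

```python
import os
from typing import Dict, Mapping, Optional


def llm_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect OpenAI-compatible client settings from environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set (use 'ollama' for Ollama)")
    settings = {"api_key": api_key, "model": env.get("LLM_MODEL", "gpt-4o-mini")}
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        # Only set for Groq or Ollama; OpenAI uses the client's default URL.
        settings["base_url"] = base_url
    return settings


groq = llm_settings({
    "OPENAI_API_KEY": "gsk_example",
    "OPENAI_BASE_URL": "https://api.groq.com/openai/v1",
    "LLM_MODEL": "llama-3.3-70b-versatile",
})
print(groq["base_url"], groq["model"])
```

Passing the environment as a mapping keeps the function easy to test; called with no argument it reads the process environment.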

How the LangGraph Agent Works

The agent is defined in src/rag_agent.py as a StateGraph.

             ┌──────────────┐
  START ───► │ query_rewriter│ ◄──── (retry loop)
             └──────┬───────┘
                    │
             ┌──────▼───────┐
             │   retriever  │
             └──────┬───────┘
                    │
             ┌──────▼───────┐
             │ relevance    │
             │   check      │
             └──────┬───────┘
                    │
         ┌──────────┴──────────┐
   relevant?                not relevant?
         │                         │
  ┌──────▼───────┐          (rewrites < 2)
  │  generator   │                 │
  └──────┬───────┘         back to rewriter
         │
        END

Each node is a plain Python function that receives the shared RAGState and returns a partial update. Conditional routing (route_after_relevance) decides whether to generate or retry.
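
A minimal sketch of the state, a partial update, and the routing decision follows. It uses only the standard library so it runs without LangGraph. The field names are hypothetical (see src/state.py for the real definition), and falling through to the generator once retries are exhausted is an assumption, since the diagram only shows the retry branch.

```python
from typing import Any, Dict, List, TypedDict

MAX_REWRITES = 2  # assumption: matches "(rewrites < 2)" in the diagram


class RAGState(TypedDict, total=False):
    # Hypothetical fields for illustration only.
    question: str
    rewritten_query: str
    relevant_chunks: List[str]
    rewrites: int
    answer: str


def route_after_relevance(state: RAGState) -> str:
    """Return the name of the next node based on the grading result."""
    if state.get("relevant_chunks"):
        return "generator"
    if state.get("rewrites", 0) < MAX_REWRITES:
        return "query_rewriter"
    return "generator"  # assumption: retries exhausted, answer with what we have


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into the shared state without mutating it."""
    merged = dict(state)
    merged.update(update)
    return merged


state: Dict[str, Any] = {"question": "What is RAG?", "rewrites": 0, "relevant_chunks": []}
first = route_after_relevance(state)   # nothing relevant yet, retries remain
state = apply_update(state, {"rewrites": 2})
second = route_after_relevance(state)  # retry budget used up
print(first, second)  # query_rewriter generator
```

In the actual graph, LangGraph performs the merge step itself, and a routing function like this one would be registered on the StateGraph with `add_conditional_edges`.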


Configuration

All tunable parameters live in .env:

| Variable | Default | Description |
| --- | --- | --- |
| LLM_MODEL | gpt-4o-mini | LLM model name |
| EMBED_MODEL | all-MiniLM-L6-v2 | HuggingFace embedding model |
| TOP_K | 4 | Number of chunks to retrieve |
| CHUNK_SIZE | 512 | Tokens per chunk |
| CHUNK_OVERLAP | 64 | Overlap between consecutive chunks |
| DATA_DIR | data/sample_docs | Directory of source documents |
| INDEX_DIR | data/faiss_index | Where the FAISS index is saved |
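
As a rough illustration of how these variables and defaults could be loaded with type conversion, consider the sketch below. It assumes the .env values are already present in the process environment (for example via python-dotenv); the `Settings` class and `load_settings` function are hypothetical and not taken from the repository.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the configuration table in this README.
    llm_model: str = "gpt-4o-mini"
    embed_model: str = "all-MiniLM-L6-v2"
    top_k: int = 4
    chunk_size: int = 512
    chunk_overlap: int = 64
    data_dir: str = "data/sample_docs"
    index_dir: str = "data/faiss_index"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from variables such as TOP_K, using defaults when unset."""
    env = os.environ if env is None else env
    values: Dict[str, Any] = {}
    for name, default in vars(Settings()).items():
        raw = env.get(name.upper())
        # Convert the string from the environment to the default's type.
        values[name] = default if raw is None else type(default)(raw)
    settings = Settings(**values)
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings


custom = load_settings({"TOP_K": "8", "CHUNK_SIZE": "256"})
print(custom.top_k, custom.chunk_size, custom.chunk_overlap)  # 8 256 64
```

Validating the overlap against the chunk size at load time surfaces a bad combination before ingestion starts.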

Tech Stack


License

MIT — free to use, modify, and distribute.


Built by Vamshi as part of a hands-on AI engineering portfolio.
