📄 AI PDF Assistant using RAG and ChromaDB

An AI-powered PDF question-answering system built using Retrieval-Augmented Generation (RAG), ChromaDB, Sentence Transformers, and Streamlit.

This application allows users to upload any PDF and ask questions based on its content.

📸 Application Demo

🚀 Features

📂 Upload any PDF file
✂️ Automatic text extraction from PDF
🧠 Semantic chunk embedding using Sentence Transformers
🗄 Vector storage using ChromaDB
🔎 Similarity-based retrieval
💬 Question answering from uploaded document
🌐 Streamlit web interface

🛠 Tech Stack

Python
Streamlit
ChromaDB
Sentence Transformers
PyPDF

📁 Project Structure

RAG_Project
│
├── app.py            # Streamlit UI
├── pdf_utils.py      # PDF text extraction
├── main.py           # Basic RAG script
├── requirements.txt
├── README.md
├── .gitignore
└── data/

⚙️ Installation

Clone the repository

git clone https://github.com/Varshakaleeswaran/RAG_Project.git

cd RAG_Project

Create a virtual environment

python -m venv venv

venv\Scripts\activate # Windows

source venv/bin/activate # macOS/Linux

Install dependencies

pip install -r requirements.txt

▶️ Run the Application

streamlit run app.py

Open browser at:

http://localhost:8501

🧠 How It Works

User uploads a PDF
Text is extracted using PyPDF
Text is split into chunks
Each chunk is converted into embeddings
Embeddings are stored in ChromaDB
User asks a question
Query is embedded and matched with similar chunks
The most relevant chunks are returned as the answer
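
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the chunk, embed, store, and retrieve flow, not the code in app.py or main.py. To keep it runnable with only the standard library, it swaps in a toy word-count embedding where the project uses Sentence Transformers and a hypothetical `InMemoryStore` class where the project uses ChromaDB. The function names, chunk size, and overlap are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping word-based chunks (sizes are illustrative)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a Sentence Transformers model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryStore:
    """Hypothetical in-memory stand-in for a ChromaDB collection."""

    def __init__(self):
        self._items = []  # list of (chunk_text, embedding) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question, top_k=2):
        """Return the top_k stored chunks most similar to the question."""
        question_vector = embed(question)
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(question_vector, item[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:top_k]]


if __name__ == "__main__":
    store = InMemoryStore()
    store.add([
        "ChromaDB stores the embeddings of each chunk.",
        "Streamlit renders the web interface.",
        "PyPDF extracts text from the uploaded PDF.",
    ])
    print(store.query("Which library extracts text from the PDF?", top_k=1))
```

A word-count vector only matches exact tokens, so it misses paraphrases that a sentence embedding model would catch. The overall shape is the same, though: embed the chunks once when the PDF is uploaded, embed each question when it is asked, and rank the stored chunks by similarity.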

🎯 Use Cases

📚 Academic Notes Assistant
📄 Research Paper Analyzer
🤖 Technical Documentation Assistant
📑 Legal Document Search
📊 Company Policy QA System

📌 Future Improvements

Persistent ChromaDB storage
OpenAI/GPT-based answer generation
Chat history memory
Multi-PDF support
Deployment on Streamlit Cloud

👩‍💻 Author

Varsha Kaleeswaran

⭐ If you like this project

Give it a star on GitHub!
