A RAG (Retrieval-Augmented Generation) app that lets you upload PDF files and ask natural language questions about their content using Google Gemini.
PDF → Extract Text → Split into Chunks → Generate Embeddings → Store in FAISS → User asks Question → Retrieve Relevant Chunks → Gemini generates Answer
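The chunk, retrieve, and prompt steps of the flow above can be illustrated with a small standard-library sketch. It is not the code in app.py: word-count vectors stand in for Gemini embeddings, a brute-force cosine search stands in for FAISS, and every function name and default value here is a hypothetical chosen for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for Gemini embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (assumed stand-in for FAISS)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    docs = [
        "FAISS stores vectors for similarity search",
        "Streamlit renders the upload sidebar",
        "Gemini generates the final answer",
    ]
    top = retrieve("Which component stores vectors?", docs, k=1)
    print(build_prompt("Which component stores vectors?", top))
```

The overlap between neighbouring chunks means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the usual reason for overlapping windows in retrieval pipelines.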
- Python
- Streamlit
- LangChain
- FAISS
- Google Gemini API
- PyPDF2
rag-document-qa/
├── app.py
├── requirements.txt
├── .env
├── .gitignore
└── README.md
git clone https://github.com/Vishalkrishna3434/rag-document-qa.git
cd rag-document-qa
pip install -r requirements.txt
Create a .env file in the project folder:
GOOGLE_API_KEY=your_api_key_here
Get your free API key from: https://aistudio.google.com/app/apikey
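To confirm the key is picked up before launching the app, a check along these lines may help. It is a standard-library sketch only: how app.py actually loads the .env file is not shown here, and `read_env_file` and `get_api_key` are hypothetical helper names.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Return GOOGLE_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or read_env_file(path).get("GOOGLE_API_KEY")
    if not key or key == "your_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env")
    return key
```

Calling `get_api_key()` from the project folder raises a `RuntimeError` if the key is missing or still set to the placeholder value.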
streamlit run app.py
- Upload one or more PDF files from the sidebar
- Click Submit & Process and wait for "Done"
- Type your question in the text box
- Get your answer
| Purpose | Model |
|---|---|
| Embeddings | models/gemini-embedding-001 |
| Answer Generation | gemini-2.5-flash |
- Python 3.9 – 3.12
- Google Gemini API key (free at AI Studio)
- Research paper Q&A
- Notes summarization
- Document search
- Study assistant
- Legal / technical document analysis
- Never commit your .env file
- faiss_index/ is generated locally; do not push it to GitHub
- If your API key is accidentally pushed, regenerate it immediately at AI Studio
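One way to guard against committing these paths is to check that .gitignore lists them. The sketch below assumes the entries are written literally as `.env` and `faiss_index/` (or `faiss_index`); it does not evaluate glob patterns, and `missing_ignores` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# Paths this README says must stay out of version control.
REQUIRED_IGNORES = (".env", "faiss_index/")


def missing_ignores(gitignore_path=".gitignore", required=REQUIRED_IGNORES):
    """Return the required entries that are absent from the .gitignore file."""
    path = Path(gitignore_path)
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    # Keep non-empty, non-comment lines; literal matches only, no glob handling.
    present = {ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith("#")}
    return [
        entry for entry in required
        if entry not in present and entry.rstrip("/") not in present
    ]
```

An empty list from `missing_ignores()` means both entries are present; anything else names what still needs to be added.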
- Requires internet connection
- Large PDFs may take a few seconds to process
- The FAISS index is lost if the faiss_index/ folder is deleted; upload and process the PDFs again to rebuild it
- Chat history across questions
- Support for DOCX and TXT files
- Cloud deployment
- Persistent vector database