A full-stack RAG (Retrieval-Augmented Generation) document assistant powered by FastAPI, React/Vite, Qdrant, and Google Gemini.
Upload PDF documents and ask natural-language questions — the AI retrieves relevant context from your documents and generates grounded answers.
React (Vite) ←→ FastAPI ←→ Qdrant (vector DB)
                   ↕
           Google Gemini API
          (embeddings + chat)
Pipeline:
- PDF → text extraction (PyMuPDF)
- Text → overlapping chunks
- Chunks → Gemini embeddings → Qdrant
- Query → embed → Qdrant semantic search → top-k chunks
- Chunks + query → Gemini LLM → grounded answer
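Two of the steps above, overlapping chunking and top-k retrieval, can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the function names `chunk_text`, `cosine_similarity`, and `top_k` are hypothetical, the defaults simply mirror `CHUNK_SIZE=800`, `CHUNK_OVERLAP=150`, and `TOP_K=5`, and cosine similarity is assumed as the distance metric (in the real app Qdrant performs the search).

```python
import math


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 150) -> list[str]:
    """Split text into character windows; neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 5) -> list[int]:
    """Return the indices of the k chunk vectors most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

With the default settings the window advances 650 characters at a time, so a 2,000-character document yields three chunks, and each chunk repeats the last 150 characters of the previous one.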
project/
├── backend/
│   ├── main.py                  # FastAPI app entry point
│   ├── config.py                # Settings (pydantic-settings)
│   ├── requirements.txt
│   ├── Dockerfile
│   ├── .env.example
│   ├── routers/
│   │   ├── documents.py         # /upload-pdf, /documents, /document/{id}
│   │   └── chat.py              # /ask
│   └── services/
│       ├── pdf_service.py       # Extraction + chunking
│       ├── embedding_service.py # Gemini embeddings
│       ├── vector_service.py    # Qdrant CRUD
│       └── rag_service.py       # RAG pipeline
├── frontend/
│   ├── src/
│   │   ├── App.jsx
│   │   ├── index.css
│   │   ├── components/
│   │   │   ├── ChatPanel.jsx
│   │   │   ├── UploadPanel.jsx
│   │   │   ├── DocumentList.jsx
│   │   │   └── Toast.jsx
│   │   └── services/api.js
│   ├── package.json
│   └── vite.config.js
└── docker-compose.yml
- Python 3.11+
- Node.js 18+
- Docker Desktop (for Qdrant)
- Google Gemini API key (created in Google AI Studio)
- Clone / open the project folder.
- Create backend `.env`:
    cp backend/.env.example backend/.env
    # Edit backend/.env and set GEMINI_API_KEY=your_key_here
- Start backend + Qdrant:
    docker-compose up --build
  - API: http://localhost:8000
  - API docs: http://localhost:8000/docs
  - Qdrant dashboard: http://localhost:6333/dashboard
- Start the frontend:
    cd frontend
    npm install
    npm run dev
# 1. Start Qdrant via Docker
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

# 2. Set up Python env
cd backend
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
pip install -r requirements.txt

# 3. Create .env
# Windows
copy .env.example .env
# macOS/Linux
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

# 4. Run backend
uvicorn main:app --reload --port 8000

# 5. Start the frontend (in a second terminal)
cd frontend
npm install
npm run dev

| Variable | Default | Description |
|---|---|---|
| `GEMINI_API_KEY` | (required) | Google Gemini API key |
| `QDRANT_HOST` | `localhost` | Qdrant host |
| `QDRANT_PORT` | `6333` | Qdrant port |
| `COLLECTION_NAME` | `knowledge_base` | Qdrant collection name |
| `CHUNK_SIZE` | `800` | Characters per chunk |
| `CHUNK_OVERLAP` | `150` | Character overlap between chunks |
| `TOP_K` | `5` | Number of chunks to retrieve per query |
| `EMBEDDING_MODEL` | `models/text-embedding-004` | Gemini embedding model |
| `CHAT_MODEL` | `gemini-1.5-flash` | Gemini chat model |
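To show how these variables and defaults fit together, here is a standard-library sketch of a settings loader. The project's `config.py` uses pydantic-settings, so this is only an approximation for illustration: the `Settings` field names and the `load_settings` helper are assumptions, while the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # Field names are hypothetical; defaults mirror the table above.
    gemini_api_key: str
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    collection_name: str = "knowledge_base"
    chunk_size: int = 800
    chunk_overlap: int = 150
    top_k: int = 5
    embedding_model: str = "models/text-embedding-004"
    chat_model: str = "gemini-1.5-flash"


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "")
    if not api_key:
        # GEMINI_API_KEY is the only variable without a default.
        raise RuntimeError("GEMINI_API_KEY is required")
    return Settings(
        gemini_api_key=api_key,
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        collection_name=env.get("COLLECTION_NAME", "knowledge_base"),
        chunk_size=int(env.get("CHUNK_SIZE", "800")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "150")),
        top_k=int(env.get("TOP_K", "5")),
        embedding_model=env.get("EMBEDDING_MODEL", "models/text-embedding-004"),
        chat_model=env.get("CHAT_MODEL", "gemini-1.5-flash"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise, for example `load_settings({"GEMINI_API_KEY": "your_key_here", "QDRANT_HOST": "qdrant"})`.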
| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/upload-pdf` | Upload and process a PDF |
| `POST` | `/ask` | Ask a question (RAG) |
| `GET` | `/documents` | List all uploaded documents |
| `DELETE` | `/document/{doc_id}` | Delete a document's vectors |
| `GET` | `/health` | Health check |
Interactive API docs: http://localhost:8000/docs
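As a rough illustration of calling the API from a script, the sketch below builds a `POST /ask` request with the standard library. The request body shape is an assumption: this document does not specify the schema, so the `question` field is hypothetical and should be checked against the interactive docs. The helper names `build_ask_request` and `ask` are also illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare (but do not send) a POST /ask request."""
    # ASSUMPTION: the endpoint accepts JSON with a "question" field.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response (requires a running backend)."""
    with urllib.request.urlopen(build_ask_request(question, base_url), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect before anything goes over the network.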
- Open http://localhost:5173
- Drag & drop a PDF onto the upload zone (or click to browse)
- Click Upload & Process: the PDF is chunked, embedded, and stored in Qdrant
- Type your question in the chat box and press Enter
- The AI retrieves relevant chunks and returns a grounded answer with source citations
- Manage your documents in the sidebar and delete them when no longer needed
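The "grounded answer with source citations" step can be pictured as assembling the retrieved chunks into a single prompt for the chat model. The sketch below is one plausible way to do that, not the project's actual prompt: the `build_prompt` name, the instruction wording, and the chunk shape (a dict with `source` and `text` keys) are all assumptions.

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string."""
    # ASSUMPTION: each chunk is a dict with "source" and "text" keys.
    context = "\n\n".join(
        f"[{i}] ({chunk['source']}) {chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks gives the model a compact way to point back at its sources, which the UI could then map to document names.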
- Qdrant data is persisted in a Docker volume (`qdrant_storage`), so your documents survive restarts.
- When using Docker Compose, `QDRANT_HOST` is automatically set to `qdrant` (the service name). In local dev, it remains `localhost`.