Production-ready FastAPI backend for document Q&A using RAG (Retrieval-Augmented Generation) with Gemini AI.
- Document Upload: PDF and DOCX support with validation
- Semantic Search: FAISS vector database for fast similarity search
- AI-Powered Q&A: Gemini 2.5 Flash for accurate, grounded answers
- Source Citations: Returns page numbers and snippets
- Error Handling: Comprehensive validation and error responses
- CORS Enabled: Ready for frontend integration
- Framework: FastAPI + Uvicorn
- AI: Google Gemini 2.0 Flash (via google-genai SDK)
- Vector DB: FAISS for embedding storage
- Embeddings: Sentence Transformers (all-MiniLM-L6-v2)
- Document Parsing: PyPDF2, python-docx
cd backend
pip install -r requirements.txt

Create a .env file in the backend directory:

GEMINI_API_KEY=your_gemini_api_key_here

Get your API key from: https://aistudio.google.com/apikey

uvicorn app.main:app --reload --env-file .env

The API will be available at: http://localhost:8000
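
Once the server is running, the `/query/` endpoint documented below can be called from Python using only the standard library. The following is a minimal sketch, not part of the backend itself: the helper names `build_query_request` and `ask` are hypothetical, and the base URL assumes the default Uvicorn address shown above. A document must be uploaded first, otherwise there is nothing to search.

```python
import json
from urllib import request

# Assumption: the server runs on the default address given in this README.
BASE_URL = "http://localhost:8000"


def build_query_request(question: str, base_url: str = BASE_URL) -> request.Request:
    """Build a POST /query/ request carrying the question as a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/query/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, base_url: str = BASE_URL) -> dict:
    """Send the question to the backend and return the decoded JSON response."""
    req = build_query_request(question, base_url)
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `ask("What is the refund policy?")` against a running server should return a dictionary with the `answer`, `confidence`, and `sources` keys shown in the `/query/` response example.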

GET /

Response:
{
  "status": "Backend running successfully",
  "team": "HackNexus",
  "service": "AI Document Assistant"
}

POST /upload/
Content-Type: multipart/form-data

Request:
file: PDF or DOCX file (max 20MB)
Response:
{
  "message": "Document uploaded and indexed successfully",
  "total_chunks": 42
}

POST /query/
Content-Type: application/json

Request:
{
  "question": "What is the refund policy?"
}

Response:
{
  "answer": "Customers can request a refund within 30 days of purchase.",
  "confidence": 0.82,
  "sources": [
    {
      "page": 3,
      "snippet": "Refunds accepted within 30 days of purchase..."
    }
  ]
}

backend/
├── app/
│   ├── main.py                  # FastAPI app with CORS
│   ├── config.py                # Environment & settings
│   ├── models/
│   │   └── schemas.py           # Pydantic models
│   ├── routes/
│   │   ├── upload.py            # Document upload endpoint
│   │   └── query.py             # Q&A endpoint
│   ├── services/
│   │   ├── document_loader.py   # PDF/DOCX parsing
│   │   ├── vector_store.py      # FAISS operations
│   │   └── qa_engine.py         # Gemini RAG pipeline
│   └── utils/
│       └── references.py        # Helper functions
├── data/
│   ├── uploads/                 # Temporary file storage
│   └── vector_db/               # FAISS index storage
├── .env                         # Environment variables
├── requirements.txt             # Python dependencies
└── README.md                    # This file
- Allows frontend connections from multiple origins
- Handles preflight requests
- Returns proper headers even on errors
- Validates file types and sizes
- Extracts text page-by-page
- Creates overlapping chunks for better context
- Generates 384-dim embeddings
- Stores in FAISS for fast retrieval
- Persists to disk automatically
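
The overlapping-chunk step above can be illustrated with a small standalone sketch. This is an assumption about how such chunking might look, not the code in `document_loader.py`: the function names, the character-based windows, and the default sizes (500 characters with 100 of overlap) are all hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Consecutive chunks share `overlap` characters, so a sentence cut at a
    chunk boundary still appears whole in one of the two neighbours.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_pages(pages: list[str], chunk_size: int = 500, overlap: int = 100) -> list[dict]:
    """Chunk each page separately so every chunk keeps its page number."""
    records = []
    for page_number, page_text in enumerate(pages, start=1):
        for chunk in chunk_text(page_text, chunk_size, overlap):
            records.append({"page": page_number, "text": chunk})
    return records
```

Chunking page by page, rather than over the concatenated document, is what lets each retrieved chunk be cited with a single page number later on.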
- Question → Embedding
- Similarity search → Top 3 chunks
- Build context with page numbers
- Gemini generates grounded answer
- Return with confidence & sources
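
The retrieval and context-building steps above can be sketched in plain Python. This is an illustrative stand-in, not the actual pipeline: the real backend uses FAISS and Sentence Transformers, whereas this sketch uses a brute-force cosine similarity over tiny hand-written vectors. The function names and the `[Page N]` context format are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec: list[float], chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_context(chunks: list[dict]) -> str:
    """Join chunks into one prompt context, tagging each with its page number."""
    return "\n\n".join(f"[Page {c['page']}] {c['text']}" for c in chunks)
```

Keeping the page number next to each chunk in the context is what allows the model's answer to be returned together with page-level sources.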
Make sure your frontend URL is listed in main.py CORS middleware.
- Check your API key in `.env`
- Verify internet connection
- Check quota limits at Google AI Studio
- Check file size (<20MB)
- Verify file extension (.pdf or .docx)
- Check backend logs for errors
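
The size and extension checks above can be reproduced locally before uploading. The sketch below is an assumption about what such validation might look like, not the code in `upload.py`: `validate_upload` is a hypothetical name, and the 20 MB limit is interpreted here as 20 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: "20MB" in this README means 20 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024
ALLOWED_EXTENSIONS = {".pdf", ".docx"}


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would be rejected; return None otherwise."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"Unsupported file type {extension!r}; expected .pdf or .docx"
        )
    if size_bytes <= 0:
        raise ValueError("File is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"File is {size_bytes} bytes; the limit is {MAX_UPLOAD_BYTES} bytes"
        )
```

For a file on disk, `validate_upload(path.name, path.stat().st_size)` with a `pathlib.Path` gives a quick pre-flight check.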
- Upload a document first
- Check if FAISS index was created in `data/vector_db/`
| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes | Google Gemini API key |
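
Because `GEMINI_API_KEY` is required, it can help to fail fast at startup when it is missing. The following is a minimal sketch of that idea, not the contents of `config.py`: `load_gemini_api_key` is a hypothetical helper, and rejecting the placeholder value from the `.env` example is an added assumption.

```python
import os
from typing import Mapping, Optional


def load_gemini_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Gemini API key, or raise if it is missing or a placeholder."""
    # Default to the process environment (populated by `--env-file .env`).
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_gemini_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; add it to the .env file in backend/"
        )
    return key
```

Accepting the environment as a parameter keeps the check easy to exercise in tests without touching the real process environment.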
uvicorn app.main:app --reload --env-file .env --log-level debug

All operations print to the console with emoji indicators for the following statuses:
- Success
- Error
- Warning
- Processing
- Loading
For production, consider the following steps:
- Use proper secrets management
- Add authentication
- Implement rate limiting
- Add request logging
- Use PostgreSQL for metadata
- Deploy with Docker
MIT