This is a modular RAG (Retrieval-Augmented Generation) system structured for Modal deployment.
Rag_Code/
├── main.py                     # Main Modal application (deploy this to Modal)
├── src/                        # Source package
│   ├── __init__.py
│   ├── config.py               # Configuration constants and models
│   ├── models.py               # Request/Response Pydantic models
│   ├── document_processing.py  # Document extraction and processing
│   ├── rag_engine.py           # RAG processing logic
│   └── helpers/                # Helper utilities
│       ├── __init__.py
│       └── utils.py            # Utility classes (logging, timing, email, API client)
├── requirements.txt            # Python dependencies
├── example_usage.py            # Example usage script
└── README.md                   # This file
# Install Modal CLI
pip install modal
# Deploy the application
modal deploy main.py

# Install dependencies
pip install -r requirements.txt
# Run example
python example_usage.py

Once deployed to Modal, you'll get a URL like https://your-username--hackrx-rag-optimized-main.modal.run
First, upload and process your document:
import requests

# Upload document
upload_payload = {
    "document_url": "https://example.com/document.pdf",
    "config": {
        "chunking": {"chunk_size": 1024, "overlap": 250},
        "retrieval": {"semantic_search_top_k": 25, "bm25_search_top_k": 25},
        "generation": {"max_tokens": 400, "temperature": 0.1}
    },
    "email": {
        "enabled": True,
        "to_email": "your-email@example.com"  # Optional: send logs to this email
    }
}

upload_response = requests.post(
    "https://your-username--hackrx-rag-optimized-main.modal.run/hackrx/upload",
    json=upload_payload,
    headers={"Authorization": "Bearer YOUR_SECRET_TOKEN"}
)

session_data = upload_response.json()
session_id = session_data["session_id"]
print(f"Document processed! Session ID: {session_id}")
print(f"Chunks created: {session_data['chunks_count']}")

Then ask questions using the session ID:
# Ask questions
chat_payload = {
    "session_id": session_id,
    "questions": ["What is the main topic?", "What are the key findings?"],
    "config": {
        "chunking": {"chunk_size": 1024, "overlap": 250},
        "retrieval": {"semantic_search_top_k": 25, "bm25_search_top_k": 25},
        "generation": {"max_tokens": 400, "temperature": 0.1}
    },
    "email": {
        "enabled": True,
        "to_email": "your-email@example.com"  # Optional: send logs to this email
    }
}

chat_response = requests.post(
    "https://your-username--hackrx-rag-optimized-main.modal.run/hackrx/chat",
    json=chat_payload,
    headers={"Authorization": "Bearer YOUR_SECRET_TOKEN"}
)

answers = chat_response.json()["answers"]
print("Answers:", answers)

You can still use the original single endpoint for backward compatibility:
import requests

# Legacy flow: send the document URL and the questions in a single request
legacy_payload = {
    "documents": "https://example.com/document.pdf",
    "questions": ["What is the main topic?", "What are the key findings?"],
    "config": {
        "chunking": {"chunk_size": 1024, "overlap": 250},
        "retrieval": {"semantic_search_top_k": 25, "bm25_search_top_k": 25},
        "generation": {"max_tokens": 400, "temperature": 0.1}
    },
    "email": {
        "enabled": True,
        "to_email": "your-email@example.com"  # Optional: send logs to this email
    }
}

response = requests.post(
    "https://your-username--hackrx-rag-optimized-main.modal.run/hackrx/run",
    json=legacy_payload,
    headers={"Authorization": "Bearer YOUR_SECRET_TOKEN"}
)

answers = response.json()["answers"]

- Multi-format Document Support: PDF, DOCX, PowerPoint, Excel, images, text, CSV
- Hybrid Search: Combines semantic search with BM25 keyword search
- Configurable Parameters: All components are configurable via Pydantic models
- Email Logging: Comprehensive logs sent via email
- Performance Optimization: Parallel processing and resource management
- Modal Deployment: Ready for cloud deployment with GPU support
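
The hybrid search feature merges a semantic ranking with a BM25 ranking, and src/config.py has an RRFConfig for Reciprocal Rank Fusion weights. As a rough illustration of how such a fusion can work, here is a minimal weighted RRF sketch. The function name `weighted_rrf`, the smoothing constant `k = 60`, and the chunk IDs are assumptions made for this example; the actual logic in rag_engine.py may differ.

```python
from typing import Dict, List


def weighted_rrf(
    rankings: Dict[str, List[str]],
    weights: Dict[str, float],
    k: int = 60,  # assumed smoothing constant, not taken from this project
) -> List[str]:
    """Fuse several ranked lists of chunk IDs into one list, best first."""
    scores: Dict[str, float] = {}
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            # Each list contributes weight / (k + rank) for every chunk it returns
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical retrieval results from the two searchers
semantic_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]

fused = weighted_rrf(
    {"semantic": semantic_hits, "bm25": bm25_hits},
    {"semantic": 1.0, "bm25": 1.0},
)
print(fused)
```

Chunks that appear in both lists accumulate score from each, so they tend to rise above chunks found by only one searcher.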
All configuration is managed through src/config.py:
- ChunkingConfig: Text chunking parameters
- RetrievalConfig: Search and retrieval parameters
- RRFConfig: Reciprocal Rank Fusion weights
- GenerationConfig: Answer generation parameters
- PerformanceConfig: Performance and concurrency settings
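
To make the chunking parameters concrete, the sketch below shows one way a config object with `chunk_size` and `overlap` could drive a sliding-window splitter. It uses a standard-library dataclass instead of Pydantic so it runs without extra dependencies, and it measures sizes in characters. Both choices, along with the `chunk_text` helper, are assumptions for illustration and are not taken from src/config.py or document_processing.py.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkingConfig:
    """Illustrative stand-in for the project's ChunkingConfig (fields from the request payload)."""
    chunk_size: int = 1024
    overlap: int = 250

    def __post_init__(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.overlap < self.chunk_size:
            raise ValueError("overlap must be in [0, chunk_size)")


def chunk_text(text: str, cfg: ChunkingConfig) -> List[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    step = cfg.chunk_size - cfg.overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + cfg.chunk_size])
        if start + cfg.chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# Build the config from a request-style dict, as in the "chunking" payload section
cfg = ChunkingConfig(**{"chunk_size": 10, "overlap": 4})
chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", cfg)
print(chunks)
```

With the defaults used in the examples above (1024 and 250), each new chunk would start 774 characters after the previous one under this scheme.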
- POST /hackrx/upload: Upload and process document (returns session ID)
- POST /hackrx/chat: Ask questions on uploaded document (uses session ID)
- POST /hackrx/run: Legacy single-endpoint RAG processing
- GET /hackrx/config/default: Get default configuration
- GET /hackrx/health: Health check
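
The endpoints above can be collected into a small lookup table so that client code does not repeat paths and methods. The sketch below only builds request objects with the standard library and does not send them. The `ENDPOINTS` table and `build_request` helper are hypothetical, and attaching the bearer token to the GET endpoints is an assumption, since the source shows the header only on the POST calls.

```python
import json
import urllib.request
from typing import Optional

# Method and path for each endpoint listed above (the short names are made up here)
ENDPOINTS = {
    "upload": ("POST", "/hackrx/upload"),
    "chat": ("POST", "/hackrx/chat"),
    "run": ("POST", "/hackrx/run"),
    "default_config": ("GET", "/hackrx/config/default"),
    "health": ("GET", "/hackrx/health"),
}


def build_request(
    base_url: str,
    name: str,
    token: str,
    payload: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request for one of the named endpoints."""
    method, path = ENDPOINTS[name]
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if method == "POST":
        # POST endpoints take a JSON body, mirroring json=... in the examples
        data = json.dumps(payload or {}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


base = "https://your-username--hackrx-rag-optimized-main.modal.run"
health_req = build_request(base, "health", "YOUR_SECRET_TOKEN")
upload_req = build_request(
    base, "upload", "YOUR_SECRET_TOKEN",
    {"document_url": "https://example.com/document.pdf"},
)
print(health_req.get_method(), health_req.full_url)
```

Passing a request built this way to `urllib.request.urlopen` would send it; the examples in this README use `requests.post` for the same purpose.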
You can now specify email settings in your requests:
"email": {
    "enabled": True,  # Set to False to disable email logs
    "to_email": "your-email@example.com"  # Optional: specify recipient
}
- If enabled is True and to_email is provided, logs will be sent to that email
- If enabled is True but to_email is not provided, logs will be sent to the default email (if configured)
- If enabled is False or not provided, no emails will be sent
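
The three email rules above can be read as a single decision function. The sketch below is one possible encoding of them; the function name `resolve_log_recipient` and the `default_email` parameter are hypothetical and may not match what src/helpers/utils.py does.

```python
from typing import Optional


def resolve_log_recipient(
    email_cfg: Optional[dict],
    default_email: Optional[str] = None,  # assumed to come from server-side configuration
) -> Optional[str]:
    """Return the address that should receive logs, or None if no email should be sent."""
    # Rule 3: a missing "email" section or enabled=False means no email
    if not email_cfg or not email_cfg.get("enabled", False):
        return None
    # Rule 1: an explicit to_email wins; Rule 2: otherwise fall back to the default, if any
    return email_cfg.get("to_email") or default_email


print(resolve_log_recipient({"enabled": True, "to_email": "your-email@example.com"}))
print(resolve_log_recipient({"enabled": True}, default_email="team@example.com"))
print(resolve_log_recipient({"enabled": False, "to_email": "your-email@example.com"}))
```

Note that when enabled is True but neither to_email nor a default address is available, this sketch returns None, so no email is sent.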