A modern Retrieval-Augmented Generation (RAG) application that enables natural language conversations with your PDF documents. Built with SvelteKit, FastAPI, and Ollama for local AI inference.
*(Screenshots: Sign In, Sign Up, Home Dashboard, Document Upload, Document Management, Chat Interface, and Document Processing.)*
To develop an intelligent document interaction system that bridges the gap between static document storage and dynamic information retrieval, enabling users to engage in natural language conversations with their document collections for enhanced knowledge discovery and accessibility.
- Democratize Document Access: Make document content accessible through natural language queries, eliminating the need for manual searching and reading
- Local AI Implementation: Provide a privacy-focused solution using local AI models, ensuring data security and independence from cloud services
- Seamless Integration: Create an intuitive web interface that seamlessly integrates document upload, processing, and querying in a unified workflow
- Scalable Architecture: Build a modular system that can handle multiple document formats and scale with user requirements
- Real-time Interaction: Deliver immediate, contextually relevant responses with source attribution for transparency
Traditional document management systems suffer from several limitations:
- Static Storage: Documents are stored but not made truly accessible for dynamic querying
- Search Limitations: Keyword-based search often misses semantic context and relationships
- Information Silos: Knowledge remains locked within documents, requiring manual extraction
- Privacy Concerns: Cloud-based solutions raise data security and privacy issues
- Technical Barriers: Complex AI systems are often inaccessible to non-technical users
The emergence of Large Language Models (LLMs) and vector databases has created new opportunities for intelligent document interaction. However, most existing solutions rely on cloud services or require complex technical setup.
RAG combines the power of information retrieval with generative AI to create contextually aware responses:
Query → Vector Embedding → Similarity Search → Context Retrieval → LLM Generation → Response
Key Components:
- Document Chunking: Large documents are split into semantic chunks to optimize retrieval and processing
- Vector Embeddings: Text chunks are converted into high-dimensional vectors that capture semantic meaning
- Similarity Search: User queries are matched against document vectors using cosine similarity
- Context Augmentation: Retrieved chunks provide context to the language model for informed responses
- Source Attribution: Responses include references to original document sources for verification
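The retrieval step above can be sketched in a few lines. This is a minimal illustration, not the project's actual code; the three-dimensional vectors stand in for real embedding-model output:

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Rank (chunk_text, vector) pairs by similarity to the query vector
    and return the k most similar chunk texts."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]
```

In the real pipeline, `query_vec` and the chunk vectors would come from the same embedding model, so that distances in the shared vector space are meaningful.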
The system uses SQLite with vector extensions for efficient similarity search:
- Embedding Storage: Document chunks stored as vectors with metadata
- Similarity Metrics: Cosine similarity for semantic matching
- Indexing: Optimized indexing for fast retrieval across large document collections
- User Isolation: Vector spaces partitioned by user for privacy and security
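A dependency-free sketch of this storage layer, assuming embeddings are serialized as JSON and scanned in Python (the actual project may use a SQLite vector extension for the similarity search instead). Note how the `WHERE user_id` clause implements the per-user isolation described above:

```python
import json
import sqlite3
from math import sqrt

def make_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the chunks table if it does not exist."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS chunks (
        id INTEGER PRIMARY KEY, user_id TEXT, text TEXT, embedding TEXT)""")
    return db

def add_chunk(db: sqlite3.Connection, user_id: str, text: str, vec: list[float]) -> None:
    db.execute("INSERT INTO chunks (user_id, text, embedding) VALUES (?, ?, ?)",
               (user_id, text, json.dumps(vec)))

def search(db: sqlite3.Connection, user_id: str, query_vec: list[float], k: int = 3) -> list[str]:
    """Brute-force cosine-similarity search over one user's chunks."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))
    # User isolation: only this user's vectors are ever scanned.
    rows = db.execute("SELECT text, embedding FROM chunks WHERE user_id = ?",
                      (user_id,)).fetchall()
    scored = sorted(((cos(query_vec, json.loads(e)), t) for t, e in rows), reverse=True)
    return [t for _, t in scored[:k]]
```

A linear scan like this is fine for small collections; the indexing mentioned above replaces it with an approximate-nearest-neighbor structure as collections grow.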
Local LLM integration through Ollama provides:
- Privacy: All processing happens locally, no data leaves the system
- Customization: Model selection based on performance and accuracy requirements
- Streaming: Real-time response generation for better user experience
- Cost Efficiency: No API costs or usage limitations
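Ollama's streaming endpoints return newline-delimited JSON objects, each carrying a `response` fragment and a `done` flag. A small parser for that format (the HTTP call itself is omitted; this only shows how the streamed fragments are reassembled):

```python
import json

def collect_stream(ndjson_lines: list[str]) -> str:
    """Concatenate the 'response' fields of Ollama-style streaming
    JSON lines, stopping when a chunk reports done=true."""
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

In the backend, each fragment would typically be forwarded to the browser as it arrives rather than buffered, which is what makes the responses feel real-time.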
- PDF Document Processing - Upload and process PDF files with intelligent text extraction
- Conversational AI - Chat with your documents using local LLMs via Ollama
- Semantic Search - Vector-based similarity search for relevant content retrieval
- User Authentication - Secure JWT-based authentication system
- Real-time Streaming - Live response streaming for better user experience
- Responsive Design - Works seamlessly across desktop, tablet, and mobile
- Source Attribution - Responses include references to source documents
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │    │     Backend     │    │    AI Models    │
│   (SvelteKit)   │───►│    (FastAPI)    │───►│    (Ollama)     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Web Browser   │    │  Vector Store   │    │   Embeddings    │
│   Interface     │    │    (SQLite)     │    │     Engine      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
```
ragai/
├── frontend/                 # SvelteKit frontend application
│   ├── src/
│   │   ├── routes/
│   │   │   ├── +page.svelte
│   │   │   ├── auth/
│   │   │   └── app/
│   │   ├── lib/
│   │   │   └── components/
│   │   └── app.html
│   └── package.json
├── backend/                  # FastAPI backend application
│   ├── main.py               # FastAPI application entry
│   ├── auth.py               # Authentication system
│   ├── pdf_processor.py      # PDF processing logic
│   ├── embeddings.py         # Vector embeddings handling
│   ├── vector_store.py       # Vector database operations
│   ├── uploads/              # Document storage
│   ├── auth.db               # User authentication database
│   └── vector_store.db       # Vector embeddings database
└── README.md
```
- Python 3.8+
- Node.js 16+
- Ollama installed and running
1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/ragai.git
   cd ragai
   ```

2. **Install and start Ollama**

   ```bash
   # Install Ollama (if not already installed)
   curl -fsSL https://ollama.ai/install.sh | sh

   # Start Ollama service
   ollama serve

   # Pull required models
   ollama pull llama2
   ollama pull nomic-embed-text
   ```

3. **Backend Setup**

   ```bash
   cd backend
   pip install -r requirements.txt
   python main.py
   ```

4. **Frontend Setup**

   ```bash
   cd frontend
   npm install
   npm run dev
   ```

5. **Access the application**

   Open http://localhost:5173 in your browser
- `POST /auth/register` - User registration
- `POST /auth/login` - User login
- `GET /auth/me` - Get user profile

- `POST /upload/` - Upload PDF documents
- `GET /documents/` - List user documents
- `DELETE /documents/{doc_id}` - Delete document

- `POST /query/stream/` - Stream RAG responses (with documents)
- `POST /chat/stream/` - Stream chat responses (without documents)
- `GET /models/` - List available AI models
- Upload - User uploads PDF files through the web interface
- Text Extraction - Backend extracts text content using PyMuPDF
- Chunking - Documents are split into semantic chunks
- Embedding - Chunks are converted to vector embeddings
- Storage - Vectors are stored in SQLite database with metadata
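The chunking step (3) is the part most easily shown in isolation. A minimal character-window chunker with overlap; the real processor may chunk on sentence or paragraph boundaries instead, and the size/overlap values here are illustrative:

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows. The overlap keeps
    context that straddles a boundary retrievable from either chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

Each chunk would then be embedded (step 4) and written to the vector store alongside its document ID and page metadata (step 5).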
- User Input - User asks a question in natural language
- Query Embedding - Question is converted to vector representation
- Similarity Search - Most relevant document chunks are retrieved
- Context Assembly - Relevant chunks are prepared as context
- AI Generation - Language model generates response using context
- Streaming Response - Answer is streamed back to user with sources
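Step 4 (context assembly) can be sketched as a prompt builder. The exact wording and source-tag format are hypothetical; the point is that numbered tags let the model cite which chunks it drew on, enabling the source attribution described elsewhere:

```python
def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble retrieved chunks into a context block with numbered
    source tags the model can cite back in its answer."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "Cite sources as [n].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string is what gets sent to the local model in step 5; the chunk-to-tag mapping is kept so the streamed answer can be annotated with real document references.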
- JWT Authentication - Secure token-based authentication
- User Isolation - Documents are user-specific and isolated
- Input Validation - Comprehensive input sanitization
- File Type Validation - Only PDF uploads allowed
- CORS Configuration - Proper cross-origin request handling
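To make the JWT mechanics concrete, here is a stdlib-only HS256 sign/verify sketch. A production backend would normally use a maintained library (e.g. PyJWT) and also check expiry claims; this only shows the signing scheme:

```python
import base64
import hashlib
import hmac
import json

def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_token(payload: dict, secret: str) -> str:
    """Build a signed HS256 JWT: header.payload.signature."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"

def verify_token(token: str, secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    header, body, sig = token.split(".")
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(_b64(expected), sig)
```

The server never stores the token; any request bearing a token whose signature verifies against the server's secret is accepted as that user.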
- Responsive Layout - Adapts to mobile, tablet, and desktop
- Real-time Feedback - Loading states and progress indicators
- Error Handling - Graceful error messages and recovery
- Accessibility - Keyboard navigation and screen reader support
- Performance - Optimized for fast loading and smooth interactions
```bash
# Backend
cd backend
uvicorn main:app --reload

# Frontend
cd frontend
npm run dev
```

```bash
# Frontend
cd frontend
npm run build

# Backend (using gunicorn)
cd backend
gunicorn main:app -k uvicorn.workers.UvicornWorker
```

- Vector Search Optimization - Efficient similarity search algorithms
- Streaming Responses - Non-blocking real-time response delivery
- Caching - Strategic caching of embeddings and responses
- Lazy Loading - Components loaded on demand
- Memory Management - Efficient handling of large documents
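One caching strategy from the list above, sketched with the standard library: memoizing the embedding call so repeated queries and duplicate chunks are embedded only once. The `_embed` stand-in is hypothetical; in the real system it would call the embedding model (e.g. nomic-embed-text via Ollama):

```python
from functools import lru_cache

def _embed(text: str) -> list[float]:
    # Hypothetical stand-in for a real embedding-model call;
    # returns a deterministic toy vector for illustration.
    return [float(len(text)), float(sum(map(ord, text)) % 997)]

@lru_cache(maxsize=4096)
def embed_cached(text: str) -> tuple[float, ...]:
    """Memoized embedding lookup. Returns a tuple because lru_cache
    values should be immutable/hashable for safe reuse."""
    return tuple(_embed(text))
```

Because embedding is usually the dominant per-query cost before generation, even a small in-process cache like this noticeably cuts latency for repeated questions.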
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Add tests if applicable
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Submit a pull request
This project is open source and available under the MIT License.
- Support for additional document formats (DOCX, TXT, etc.)
- Advanced document preprocessing (OCR, table extraction)
- Multi-language support
- Advanced analytics and insights
- Team collaboration features
- API rate limiting and quotas
- Enhanced security features
If you have any questions or run into issues, please open an issue on GitHub.
RAG AI - Transforming document interaction through intelligent conversation. Upload, ask, and discover insights from your documents like never before.