
🧠 DocsInsight Engine: Enterprise RAG System


DocsInsight Engine is a high-performance, private Retrieval-Augmented Generation (RAG) platform. It allows users to upload complex documents and interact with them through a neural search interface powered by local Large Language Models (LLMs).


✨ Key Features

  • 📁 Multi-Format Support: Seamlessly process PDF, DOCX, XLSX, CSV, and TXT files.
  • 🔒 Privacy-Centric: Fully local execution using Ollama. Your sensitive data never leaves your infrastructure.
  • ⚡ Neural Retrieval: Uses ChromaDB for high-speed vector similarity search.
  • 🎨 Modern Interface: A sleek, dark-themed "Glassmorphism" UI with real-time markdown rendering and code highlighting.
  • 🛠️ Source Verification: Every answer includes citations to the uploaded documents, so responses can be checked against their sources and hallucinations are easier to catch.
  • 🐳 One-Command Setup: Ready for production with Docker and Docker Compose.

🏗️ Technical Architecture

Backend Stack

  • Core: Python 3.11 with Flask.
  • Orchestration: LangChain for managing document loaders, splitters, and LLM chains.
  • Vector Database: ChromaDB for persistent document embeddings.
  • LLM/Embeddings: Llama 3 (8B) via Ollama.

Frontend Stack

  • UI: Vanilla JS with CSS Mesh Gradients and Backdrop Filters.
  • Rendering: Marked.js for markdown and Highlight.js for code snippets.
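The vector similarity search that ChromaDB performs for retrieval can be illustrated with a minimal cosine-similarity sketch in plain Python. This is a toy example with hand-made 3-dimensional "embeddings"; the real system uses Ollama-generated embeddings and ChromaDB's own indexing:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, chunks, top_k=2):
    # Rank document chunks by similarity to the query embedding
    # and return the top_k most relevant ones.
    scored = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, c["embedding"]),
        reverse=True,
    )
    return scored[:top_k]

# Toy embeddings standing in for real model output.
chunks = [
    {"text": "Invoices are due in 30 days.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "The office cat is named Bob.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Late payments incur a fee.",   "embedding": [0.8, 0.3, 0.1]},
]
results = retrieve([1.0, 0.0, 0.0], chunks, top_k=2)
print([c["text"] for c in results])
```

A real vector store adds persistence and approximate-nearest-neighbor indexing on top of this basic idea, so searches stay fast as the number of chunks grows.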

🚀 Getting Started

Prerequisites

  • Docker and Docker Compose.
  • Ollama installed and running on your host machine.
  • Pull the required model: ollama pull llama3:8b.

Installation

  1. Clone the repository:
    git clone https://github.com/arfazrll/rag-docsinsight-engine.git
    cd rag-docsinsight-engine
    
  2. Launch with Docker:
    docker-compose up --build
    
  3. Access the App: Open your browser and navigate to http://localhost:5000.

📂 Project Structure

├── backend/
│   ├── app.py          # Flask API Endpoints
│   └── rag_core.py     # RAG Logic, Vector Store & Document Processing
├── web/
│   ├── index.html      # Frontend Structure
│   ├── style.css       # Glassmorphism Styling
│   └── script.js       # Client-side Logic
├── storage/            # Local Persistent Storage (Vector DB & Docs)
├── Dockerfile          # Container Configuration
└── docker-compose.yml  # Service Orchestration


🛠️ Configuration

The system uses environment variables to communicate with the AI engine. In docker-compose.yml:

  • OLLAMA_BASE_URL: Defaults to http://host.docker.internal:11434 so the container can reach the Ollama server on the host.
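In docker-compose.yml this might look like the following sketch (the service name, port mapping, and uppercase variable name are illustrative assumptions based on the description above):

```yaml
services:
  app:
    build: .
    ports:
      - "5000:5000"
    environment:
      # Points the backend at the Ollama server running on the Docker host.
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
```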

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Copyright (c) 2025 S. A. Almazril.


💡 System Insights

  • Scalability: The VectorStoreManager is designed to handle multiple documents simultaneously by filtering searches based on unique file hashes.
  • Performance: Documents are split into 1,000-character chunks with a 200-character overlap, preserving context across chunk boundaries while keeping chunks small enough for efficient use of the model's context window.
  • Security: A .dockerignore and .gitignore prevent sensitive files (e.g. .env credentials) and the local vector database from being committed to Git or baked into Docker images.
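The chunking strategy above can be sketched as a simple fixed-size splitter in plain Python (a simplified stand-in; the actual pipeline uses LangChain's text splitters, which also respect sentence and paragraph boundaries):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    # Each new chunk starts `chunk_size - overlap` characters after
    # the previous one, so consecutive chunks share `overlap`
    # characters of context across the boundary.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "x" * 2500
print([len(c) for c in chunk_text(doc)])  # chunk lengths: [1000, 1000, 900]
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, which keeps retrieved passages coherent.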