Transform your corporate documents into an intelligent conversational engine.
Enterprise Knowledge Assistant (EKA) is a production-grade Retrieval-Augmented Generation (RAG) platform designed for mid-market and enterprise-scale knowledge management. It enables organizations to ingest massive volumes of internal documents (PDFs, SOPs, FAQs, and manuals) and turn them into a searchable, citation-backed AI expert.
- Project Description
- Documentation
- Business Problem
- Key Features
- Architecture Overview
- System Workflow
- Tech Stack
- Project Structure
- Installation Guide
- Environment Variables
- Running Locally
- Docker Setup
- API Endpoints
- RAG Pipeline Explanation
- Retrieval Strategy
- Evaluation Framework
- Screenshots
- Demo Video
- Deployment Guide
- Roadmap
- Performance Metrics
- Security Considerations
- Future Improvements
- Lessons Learned
- Author
EKA serves as a centralized "brain" for enterprise data. Unlike a general-purpose LLM, EKA answers only from your organization's proprietary documents, so responses stay grounded in source material. It combines semantic search with large language models and attaches direct citations to the source documents in every answer.
Detailed engineering documentation can be found in the docs/ directory:
- Technical Design Document (TDD): Deep dive into the system design and components.
- Architecture Overview: High-level diagrams and data flow.
- API Specification: Complete REST API documentation.
- RAG Evaluation Framework: Methodology for measuring RAG performance.
- Deployment Guide: Instructions for local, Docker, and Cloud deployment.
- Product Roadmap: Detailed development plan and future vision.
- Interview & FAQ: 50+ interview questions and answers related to this project.
- Architectural Decision Records (ADR):
In modern enterprises, information is often siloed across multiple platforms (Google Drive, SharePoint, Notion, etc.). This leads to:
- Information Silos: Critical knowledge is locked in fragmented systems.
- Knowledge Loss: Valuable insights disappear when employees leave the company.
- Slow Retrieval: Employees spend up to 20% of their time just looking for information.
- Inconsistent Answers: Different departments provide conflicting information based on outdated documents.
EKA solves this by providing a "Single Source of Truth" that is accessible in seconds through a natural language interface.
- Multi-Format Ingestion: Seamlessly upload and index PDF, DOCX, and TXT files.
- Hybrid Search: Combines semantic vector search with keyword-based retrieval for maximum precision.
- Smart Citations: Every answer comes with verifiable links and references to the exact page and paragraph.
- Enterprise Security: Role-Based Access Control (RBAC) and SOC2-ready architecture.
- Analytics Dashboard: Track popular questions, knowledge gaps, and system performance.
- Re-indexing Pipeline: Automatically updates the knowledge base when documents are modified.
The system follows a modern microservices-inspired architecture designed for scalability and reliability.
graph TD
User((User)) -->|React Frontend| FE[Next.js Application]
FE -->|REST API| BE[FastAPI Backend]
subgraph "Core Services"
BE --> Auth[Auth Service]
BE --> Doc[Document Service]
BE --> Chat[Chat Service]
end
subgraph "RAG Engine"
Doc --> Parser[Text Extraction]
Parser --> Chunk[Chunking Engine]
Chunk --> Embed[Embedding Service]
Embed --> VDB[(pgvector / Vector DB)]
end
subgraph "LLM Layer"
Chat --> Retrieval[Retrieval Service]
Retrieval --> VDB
Retrieval --> Prompt[Prompt Builder]
Prompt --> LLM[OpenAI / Gemini]
LLM --> Response[Response + Citations]
end
BE --> DB[(PostgreSQL)]
BE --> Cache[(Redis)]
Doc --> S3[(S3 Storage)]
EKA uses a decoupled storage architecture:
- Binary Files: Stored in object storage (MinIO for local development, AWS S3 for production).
- Metadata & Status: Managed in PostgreSQL via SQLAlchemy models.
- Vectors: Stored in PostgreSQL with `pgvector`.
The StorageService in backend/app/services/storage.py provides an async abstraction for all storage operations, while DocumentService manages metadata and processing status transitions.
Ensure the following variables are set in your .env file:
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_BUCKET=eka-documents
S3_ENDPOINT=http://localhost:9000

The local MinIO console is available at http://localhost:9001.
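As a rough illustration of how these variables might be consumed, the sketch below reads them from the environment into a frozen dataclass using only the standard library. `StorageSettings` and `load_storage_settings` are hypothetical names, not taken from `backend/app/services/storage.py`, and the fallback values simply mirror the local MinIO values listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageSettings:
    """Hypothetical container for the S3/MinIO variables listed above."""
    access_key: str
    secret_key: str
    bucket: str
    endpoint: str


def load_storage_settings(env=None) -> StorageSettings:
    """Read the S3_* variables, falling back to the local MinIO defaults."""
    env = os.environ if env is None else env
    return StorageSettings(
        access_key=env.get("S3_ACCESS_KEY", "minioadmin"),
        secret_key=env.get("S3_SECRET_KEY", "minioadmin"),
        bucket=env.get("S3_BUCKET", "eka-documents"),
        endpoint=env.get("S3_ENDPOINT", "http://localhost:9000"),
    )


if __name__ == "__main__":
    # Passing a plain dict instead of os.environ keeps the example easy to test.
    settings = load_storage_settings({"S3_BUCKET": "my-bucket"})
    print(settings.bucket, settings.endpoint)
```

In production the defaults would normally be removed so that a missing secret fails loudly instead of silently falling back to the MinIO development credentials.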
- Upload: User uploads a PDF/Document via `POST /api/v1/documents`.
- Storage: System uploads the raw file to S3/MinIO and creates a `Document` record in PostgreSQL with `UPLOADED` status.
- Async Processing: A Celery task (`process_document_task`) is triggered to handle processing in the background:
  - Extraction: Extracts text and metadata (Task-019).
  - Chunking: Splits text into optimized segments (Task-020).
  - Embedding: Generates vectors (Task-021).
  - Storage: Persists chunks/vectors to `pgvector` (Task-022).
- Completion: Document status is updated to `COMPLETED` or `FAILED`.
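The status handling in the ingestion steps above can be pictured as a small state machine. The following is a minimal sketch, not the project's `DocumentService`: the `PROCESSING` state, the `ALLOWED` table, and the `process_document` helper are assumptions introduced for illustration, and the pipeline steps are passed in as plain callables rather than run as a Celery task.

```python
from enum import Enum


class DocumentStatus(str, Enum):
    UPLOADED = "UPLOADED"
    PROCESSING = "PROCESSING"  # assumption: intermediate state not named in this README
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"


# Assumed transition table; terminal states have no outgoing edges.
ALLOWED = {
    DocumentStatus.UPLOADED: {DocumentStatus.PROCESSING},
    DocumentStatus.PROCESSING: {DocumentStatus.COMPLETED, DocumentStatus.FAILED},
    DocumentStatus.COMPLETED: set(),
    DocumentStatus.FAILED: set(),
}


def transition(current, target):
    """Return the target status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def process_document(raw_text, steps):
    """Run the steps in order and return (final_status, result)."""
    status = transition(DocumentStatus.UPLOADED, DocumentStatus.PROCESSING)
    data = raw_text
    try:
        for step in steps:  # e.g. extraction, chunking, embedding, persistence
            data = step(data)
    except Exception:
        # Any failing step marks the whole document as FAILED.
        return transition(status, DocumentStatus.FAILED), None
    return transition(status, DocumentStatus.COMPLETED), data


if __name__ == "__main__":
    steps = [str.strip, lambda text: text.split(". ")]
    print(process_document("  First sentence. Second sentence  ", steps))
```

Keeping the transition table explicit makes it easy to reject impossible updates, such as moving a `FAILED` document straight to `COMPLETED` without reprocessing.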
- Input: User asks a natural language question.
- Retrieval: System converts the query to a vector and finds the top-K most relevant chunks.
- Augmentation: Chunks are injected into a specialized prompt.
- Generation: LLM generates a grounded answer based only on the provided context.
- Verification: System validates citations before presenting the final response.
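The query steps above can be sketched end to end with the standard library alone. This is an illustrative outline under stated assumptions, not EKA's retrieval service: chunks are plain dictionaries with hypothetical `id`, `text`, and `vector` keys, similarity is plain cosine similarity computed in Python rather than in `pgvector`, and citations are assumed to appear in the answer as chunk ids in square brackets.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=5):
    """Retrieval: return the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, retrieved):
    """Augmentation: inject the retrieved chunks into a grounded prompt."""
    context = "\n".join(f"[{c['id']}] {c['text']}" for c in retrieved)
    return (
        "Answer using only the context below and cite chunk ids in square brackets.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def invalid_citations(answer, retrieved):
    """Verification: return cited ids that match no retrieved chunk."""
    allowed = {c["id"] for c in retrieved}
    cited = set(re.findall(r"\[([^\]]+)\]", answer))
    return cited - allowed


if __name__ == "__main__":
    chunks = [
        {"id": "doc1-p1", "text": "Refunds take 5 days.", "vector": [1.0, 0.0]},
        {"id": "doc2-p3", "text": "The office opens at 9.", "vector": [0.0, 1.0]},
    ]
    hits = top_k([0.9, 0.1], chunks, k=1)
    print(build_prompt("How long do refunds take?", hits))
    print(invalid_citations("Refunds take five days [doc1-p1].", hits))
```

An answer whose citation set is not empty after `invalid_citations` would be rejected or regenerated before it reaches the user.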
- Frontend: React, Next.js, Tailwind CSS, Lucide Icons.
- Backend: FastAPI (Python 3.10+), Pydantic, SQLAlchemy.
- Database: PostgreSQL with `pgvector` extension.
- Caching/Task Queue: Redis, Celery.
- AI/ML: OpenAI GPT-4o / Google Gemini Pro, LangChain, Sentence-Transformers.
- Infrastructure: Docker, Docker Compose, Nginx.
- DevOps: GitHub Actions, Prometheus, Grafana.
.
├── backend/                # FastAPI Application
│   ├── app/
│   │   ├── api/            # API Routes
│   │   ├── core/           # Configuration & Security
│   │   ├── models/         # SQLAlchemy Models
│   │   ├── services/       # Business Logic (RAG, Doc Processing)
│   │   ├── schemas/        # Pydantic Schemas
│   │   ├── tasks/          # Celery Background Tasks
│   │   └── worker.py       # Celery Worker Entry Point
│   ├── tests/              # Pytest Suite
│   └── main.py             # Entry Point
├── frontend/               # Next.js Application
...
│   │   ├── components/     # UI Components
│   │   ├── hooks/          # Custom React Hooks
│   │   └── pages/          # Application Routes
├── docs/                   # Documentation (ADR, PRD, API Spec)
├── docker/                 # Docker Configuration
├── tests/                  # Integration & E2E Tests
└── docker-compose.yml      # Orchestration
## Quick Start
Get started with EKA in minutes:
```bash
# 1. Clone the repository
git clone https://github.com/NvlFR/enterprise-knowledge-assistant.git
cd enterprise-knowledge-assistant

# 2. Install dependencies (creates .venv automatically)
make install

# 3. Setup environment variables
cp backend/.env.example backend/.env

# 4. Run the backend
source .venv/bin/activate
uvicorn backend.app.main:app --reload
```
Create a .env file in backend/ and a .env.local file in frontend/.
Backend (.env):
DATABASE_URL=postgresql://user:password@localhost:5432/eka_db
REDIS_URL=redis://localhost:6379/0
OPENAI_API_KEY=sk-your-key-here
GEMINI_API_KEY=your-gemini-key
SECRET_KEY=your-super-secret-key

Frontend (.env.local):
NEXT_PUBLIC_API_URL=http://localhost:8000/api/v1

Run the backend:
cd backend
uvicorn app.main:app --reload

Run the frontend:
cd frontend
npm run dev

For a production-ready local environment:
docker-compose up --build

This will spin up:
- API: http://localhost:8000
- Frontend: http://localhost:3000
- Database: localhost:5432
- Redis: localhost:6379
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/login/access-token` | Authenticate user & get JWT |
| POST | `/api/v1/documents` | Upload new document |
| GET | `/api/v1/documents` | List all indexed documents |
| GET | `/api/v1/documents/{id}` | Get document details |
| DELETE | `/api/v1/documents/{id}` | Delete document |
| POST | `/api/v1/chat` | Send a query to the RAG engine |
| GET | `/api/v1/chat/history` | Retrieve conversation history |
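For orientation, here is a hedged sketch of how a client might build requests for two of the endpoints above using only the standard library. The request shapes are assumptions, not taken from the API specification: the login call is assumed to accept form-encoded `username` and `password` fields (a common pattern for FastAPI OAuth2 password flows), and the chat call is assumed to accept a JSON body with a `query` field plus a Bearer token. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"


def login_request(username, password):
    """Build (but do not send) a login request."""
    # Assumption: form-encoded credentials; check docs/ for the real schema.
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    return urllib.request.Request(f"{BASE_URL}/login/access-token", data=body, method="POST")


def chat_request(token, question):
    """Build (but do not send) a chat request."""
    # Assumption: JSON body with a "query" field and a Bearer token header.
    body = json.dumps({"query": question}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = chat_request("example-token", "What is the leave policy?")
    print(req.get_method(), req.full_url)
```

With the backend running, either request can be sent with `urllib.request.urlopen(req)`; the response format should be taken from the API specification in `docs/`.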
Our RAG implementation goes beyond simple vector search:
- Query Transformation: We use LLMs to rephrase user queries for better retrieval performance.
- Hierarchical Chunking: We maintain the relationship between small chunks and their parent sections to provide context-aware answers.
- Reranking: After the initial vector search, we use a Cross-Encoder Reranker (BGE-Reranker) to select the absolute best context for the LLM.
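To make the hierarchical chunking idea concrete, the sketch below splits a section into overlapping word windows that each remember their parent section, then maps retrieved chunks back to those parents. It is a simplified illustration, not the project's chunking engine: the `Chunk` dataclass, the word-based window, and the function names are assumptions, and the default `overlap_ratio` of 0.15 mirrors the 15% overlap figure quoted elsewhere in this README.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    parent_id: str  # id of the section this chunk was cut from
    text: str


def chunk_section(section_id, text, size=200, overlap_ratio=0.15):
    """Split one section into overlapping word windows that remember their parent."""
    words = text.split()
    # Each window starts `step` words after the previous one, leaving an overlap.
    step = max(1, round(size * (1 - overlap_ratio)))
    chunks = []
    for n, start in enumerate(range(0, len(words), step)):
        window = words[start:start + size]
        chunks.append(Chunk(f"{section_id}#{n}", section_id, " ".join(window)))
        if start + size >= len(words):
            break  # the last window already reaches the end of the section
    return chunks


def expand_to_parents(hits, sections):
    """Map retrieved child chunks back to their de-duplicated parent sections."""
    seen, parents = set(), []
    for hit in hits:
        if hit.parent_id not in seen:
            seen.add(hit.parent_id)
            parents.append(sections[hit.parent_id])
    return parents


if __name__ == "__main__":
    section = " ".join(f"w{i}" for i in range(10))
    for chunk in chunk_section("sec-1", section, size=4, overlap_ratio=0.25):
        print(chunk.chunk_id, "->", chunk.text)
```

Small chunks keep the vector search precise, while expanding to the parent section gives the LLM enough surrounding context to answer.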
- Embeddings: `text-embedding-3-large` (OpenAI) for high-dimensional semantic capture.
- Search Type: Hybrid (Vector + BM25) to handle both semantic meaning and specific keyword matching.
- Top-K: 20 initial candidates reduced to 5 via reranking.
- Overlap: 15% chunk overlap to prevent context loss at boundaries.
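One way to picture the candidate funnel described above is the sketch below. This README does not say how the vector and BM25 result lists are merged, so the example uses reciprocal rank fusion (RRF) as a stand-in technique; the `rerank_score` callable is likewise a placeholder for the cross-encoder reranker, and all function names are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists; ids ranked high in many lists come first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def retrieve(vector_hits, keyword_hits, rerank_score, candidates=20, final=5):
    """Fuse both result lists, keep `candidates`, then rerank down to `final`."""
    fused = reciprocal_rank_fusion([vector_hits, keyword_hits])[:candidates]
    return sorted(fused, key=rerank_score, reverse=True)[:final]


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # ids from semantic search, best first
    keyword_hits = ["b", "d"]       # ids from BM25, best first
    relevance = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}  # stand-in reranker scores
    print(retrieve(vector_hits, keyword_hits, relevance.get, final=2))
```

RRF needs only ranks, not comparable scores, which is convenient when one list comes from cosine similarity and the other from BM25.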
We use the RAGAS (RAG Assessment) framework to measure:
- Faithfulness: Does the answer match the retrieved context?
- Answer Relevance: Does the answer address the user's query?
- Context Precision: Are the retrieved chunks truly relevant?
- Hallucination Rate: Target is < 5% for production readiness.
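The sketch below shows how per-answer scores for these metrics might be aggregated into a release gate. It does not call the RAGAS library; it assumes the scores have already been computed and are supplied as dictionaries. Treating an answer as hallucinated when its faithfulness falls below `faithfulness_floor` is an assumption made for illustration, as are the function and key names.

```python
from statistics import mean


def summarize(samples, faithfulness_floor=0.8, max_hallucination_rate=0.05):
    """Aggregate per-answer scores in [0, 1] and apply the < 5% hallucination target."""
    if not samples:
        raise ValueError("no samples to evaluate")
    # Assumption: an answer counts as hallucinated if its faithfulness is below the floor.
    hallucinated = sum(1 for s in samples if s["faithfulness"] < faithfulness_floor)
    rate = hallucinated / len(samples)
    return {
        "faithfulness": mean(s["faithfulness"] for s in samples),
        "answer_relevance": mean(s["answer_relevance"] for s in samples),
        "context_precision": mean(s["context_precision"] for s in samples),
        "hallucination_rate": rate,
        "production_ready": rate < max_hallucination_rate,
    }


if __name__ == "__main__":
    good = {"faithfulness": 1.0, "answer_relevance": 0.9, "context_precision": 0.8}
    bad = {"faithfulness": 0.4, "answer_relevance": 0.9, "context_precision": 0.8}
    print(summarize([good] * 19 + [bad]))
```

Note that the target is a strict inequality, so a run with exactly 5% hallucinated answers does not pass the gate.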
- Infrastructure: AWS (EKS/ECS) or GCP (GKE).
- CI/CD: GitHub Actions for automated testing and container builds.
- Database: Managed RDS with `pgvector` support.
- Storage: AWS S3 for raw document storage.
- MVP with PDF support & Basic RAG
- Citation tracking
- Hybrid Search integration
- Google Drive & SharePoint Connectors
- Multimodal RAG (Image support in documents)
- Knowledge Graph (GraphRAG) integration
- Avg. Retrieval Time: < 300ms
- Avg. Time to First Token: < 1.2s
- Accuracy (Internal Benchmark): 92%
- Max Document Size: 500MB per file
- Data Privacy: Documents are encrypted at rest (AES-256) and in transit (TLS 1.3).
- Prompt Injection: Robust input sanitization and guardrail layers.
- Audit Logs: Every query and document access is logged for compliance.
- PII Redaction: Automatic detection and masking of sensitive information in logs.
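As an illustration of the log-masking idea, the following sketch attaches a redaction filter to Python's standard `logging` module. It is a deliberately simple example, not EKA's implementation: the two regular expressions only catch common email and phone formats, and the `redact` function and `PiiRedactionFilter` class are hypothetical names.

```python
import logging
import re

# Deliberately simple patterns; real PII detection needs broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Mask email addresses and phone-like number sequences."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


class PiiRedactionFilter(logging.Filter):
    """Rewrite each record so the formatted message contains no raw PII."""

    def filter(self, record):
        # Format the message first so values passed as args are masked too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
    logger = logging.getLogger("eka.audit")
    logger.addFilter(PiiRedactionFilter())
    logger.info("Query from %s, callback %s", "jane.doe@example.com", "+1 (555) 123-4567")
```

Filtering at the logger level means audit entries are masked before any handler writes them to disk or ships them to a log aggregator.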
- Agentic RAG: Implementing autonomous agents that can decide when to search, calculate, or browse.
- Fine-Tuning: Domain-specific embedding fine-tuning for specialized industries (Legal/Medical).
- Collaborative AI: Shared chat sessions for team-based knowledge discovery.
- Chunking Matters: Fixed-size chunking is often insufficient; semantic-aware chunking significantly improves retrieval.
- Reranking is Non-Negotiable: Vector search alone has too much "noise" for sensitive enterprise use cases.
- Citation UX: Users trust the system 3x more when they can see the original PDF snippet alongside the answer.
Noval Faturrahman
- GitHub: @NvlFR
- LinkedIn: novalfaturrahman-ai
- Portfolio: noval.faturrahman.ai
Built with ❤️ by the Enterprise Knowledge Assistant Team.