An intelligent Retrieval-Augmented Generation (RAG) chatbot that answers questions based on your uploaded documents. Features advanced document processing, vector similarity search, and multiple AI language model integrations.
- Document Upload & Processing: PDF, Word, Text, Markdown, CSV, HTML, JSON support
- Intelligent Search: Vector-based semantic similarity search
- AI-Powered Responses: Multiple LLM providers (OpenAI, Claude, Gemini, Cohere, HuggingFace)
- Real-time Chat: WebSocket streaming responses
- Source Attribution: See which documents were used for answers
- Beautiful Interface: Modern, responsive web UI
- Format Support: PDF, DOCX, TXT, MD, CSV, HTML, JSON
- Text Extraction: Advanced parsing for all supported formats
- Chunking: Intelligent text segmentation with overlap
- Embeddings: Vector representations for semantic search
- Multiple Providers: OpenAI GPT, Anthropic Claude, Google Gemini, Cohere, HuggingFace
- Fallback System: Works without API keys using free models
- Streaming: Real-time response generation
- Context-Aware: Uses retrieved document chunks for accurate answers
- Drag & Drop Upload: Easy document management
- Live Chat: Real-time conversation with typing indicators
- Source Preview: Click sources to see relevant content
- Settings Panel: Customize search parameters
- Responsive Design: Works on desktop and mobile
# Clone or download the project
git clone <repository-url>
cd rag-qa-chatbot
# Install dependencies
npm install
# Run setup (creates .env file and sample documents)
npm run setup

Edit the .env file to add your API keys for enhanced AI capabilities:
# OpenAI (GPT models) - Get free credits at platform.openai.com
OPENAI_API_KEY=your_openai_api_key_here
# Anthropic Claude - Get free credits at console.anthropic.com
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# Google Gemini - Free tier available at makersuite.google.com
GOOGLE_API_KEY=your_google_api_key_here
# Cohere - Free tier available at cohere.ai
COHERE_API_KEY=your_cohere_api_key_here
# Hugging Face (for free models) - Sign up at huggingface.co
HUGGINGFACE_API_KEY=your_huggingface_api_key_here

Note: The system works out-of-the-box with free HuggingFace models, no API keys required!
# Production mode
npm start
# Development mode (auto-reload)
npm run dev

Navigate to: http://localhost:3000
- Drag and drop files or click the upload area
- Supported formats: PDF, Word, Text, Markdown, CSV, HTML, JSON
- Files are automatically processed and indexed
- Type questions about your uploaded documents
- Use natural language queries
- Get AI-powered answers with source citations
"What are the main points in this document?"
"Summarize the key findings"
"What recommendations are mentioned?"
"Compare different approaches discussed"
"Find information about [specific topic]"
- Use predefined quick action buttons
- Summarize documents
- Find key insights
- Compare approaches
- Express.js Server: RESTful API and WebSocket support
- Document Processor: Multi-format text extraction
- Vector Store: In-memory similarity search
- Embedding Service: Text vectorization
- LLM Service: Multiple AI provider integration
- Retrieval Service: Context-aware document search
- Modern Web UI: HTML5, CSS3, JavaScript
- Real-time Chat: WebSocket communication
- File Upload: Drag & drop with progress
- Responsive Design: Mobile-friendly interface
- Document Upload → Text extraction and cleaning
- Text Chunking → Intelligent segmentation with overlap
- Embedding Generation → Vector representations
- Vector Storage → In-memory indexing
- Query Processing → Similarity search
- Response Generation → AI-powered answers
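The query-processing step above (similarity search over stored embeddings) can be sketched as follows. This is an illustrative in-memory version, not the project's actual code; the `topK` and `threshold` parameters mirror the `TOP_K_RESULTS` and `SIMILARITY_THRESHOLD` settings.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk, keep those above the threshold,
// and return the topK best matches.
function search(queryVector, store, topK = 5, threshold = 0.7) {
  return store
    .map(entry => ({ ...entry, score: cosineSimilarity(queryVector, entry.vector) }))
    .filter(entry => entry.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Toy 2-D example: only the chunk pointing the same way as the query survives
const store = [
  { text: "chunk A", vector: [1, 0] },
  { text: "chunk B", vector: [0.6, 0.8] },
];
console.log(search([0.6, 0.8], store, 5, 0.7));
```

A production vector database performs the same ranking, but with approximate nearest-neighbor indexes instead of a full scan.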
# Server Configuration
PORT=3000
NODE_ENV=development
# LLM Provider (choose your preferred)
DEFAULT_LLM_PROVIDER=huggingface # openai, anthropic, google, cohere
# Document Processing
CHUNK_SIZE=500 # Text chunk size
CHUNK_OVERLAP=50 # Overlap between chunks
MAX_DOCUMENTS=1000 # Maximum documents to store
# Retrieval Settings
TOP_K_RESULTS=5 # Number of results to retrieve
SIMILARITY_THRESHOLD=0.7 # Minimum similarity score
MAX_CONTEXT_LENGTH=4000 # Maximum context for LLM
# Rate Limiting
RATE_LIMIT_REQUESTS=100 # Requests per window
RATE_LIMIT_WINDOW=900000 # Window in milliseconds (15 min)

| Format | Extension | Description |
|---|---|---|
| PDF | .pdf | Portable Document Format |
| Word | .doc, .docx | Microsoft Word documents |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown formatted text |
| CSV | .csv | Comma-separated values |
| HTML | .html | Web pages |
| JSON | .json | JavaScript Object Notation |
POST /api/chat
Content-Type: application/json
{
"message": "Your question here",
"sessionId": "optional_session_id",
"options": {
"topK": 5,
"threshold": 0.7
}
}

# Upload document
POST /api/documents/upload
Content-Type: multipart/form-data
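Assuming the server is running locally on the default port, the chat endpoint shown earlier can be called from Node roughly as follows. `buildChatRequest` and the base URL are illustrative assumptions, not part of the project:

```javascript
// Sketch of calling POST /api/chat; helper and URL are assumptions.
function buildChatRequest(message, sessionId, options = {}) {
  return {
    url: "http://localhost:3000/api/chat",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Body shape follows the request format shown above
      body: JSON.stringify({ message, sessionId, options }),
    },
  };
}

const { url, init } = buildChatRequest("What are the main points?", "demo", {
  topK: 5,
  threshold: 0.7,
});
// With the server running: const res = await fetch(url, init);
console.log(JSON.parse(init.body));
```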
# Get all documents
GET /api/documents/
# Delete document
DELETE /api/documents/{documentId}
# Search within document
POST /api/documents/{documentId}/search

# System status
GET /api/admin/status
# Change LLM provider
POST /api/admin/llm/provider

// Send chat message
{
"type": "chat",
"data": {
"message": "Your question",
"sessionId": "session_id",
"options": { "topK": 5 }
}
}
// Stream chat (real-time response)
{
"type": "stream_chat",
"data": { /* same as above */ }
}

// Chat response
{
"type": "chat_response",
"data": {
"message": "AI response",
"sources": [...],
"metadata": {...}
}
}
// Streaming tokens
{
"type": "stream_token",
"token": "word "
}

- Edit `public/index.html` for structure
- Modify styles in the `<style>` section
- Uses Tailwind CSS for responsive design
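A minimal client-side dispatcher for the WebSocket message types listed earlier (`chat_response` and `stream_token`) might look like this. `handleMessage` and the toy `ui` object are illustrative assumptions; in a browser you would wire the handler to `socket.onmessage`:

```javascript
// Sketch of handling the server's WebSocket messages; not project code.
function handleMessage(raw, ui) {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case "chat_response":
      ui.showAnswer(msg.data.message, msg.data.sources); // full answer + citations
      break;
    case "stream_token":
      ui.appendToken(msg.token); // append partial text as it streams in
      break;
    default:
      break; // ignore unknown message types
  }
}

// Toy UI that just accumulates text
const ui = {
  text: "",
  showAnswer(message) { this.text = message; },
  appendToken(token) { this.text += token; },
};
handleMessage('{"type":"stream_token","token":"Hello "}', ui);
handleMessage('{"type":"stream_token","token":"world"}', ui);
console.log(ui.text); // "Hello world"
```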
Add new LLM providers in src/services/llmService.js:
async yourProviderGenerate(prompt, options) {
// Implement your provider logic
return {
text: response,
provider: 'your-provider',
model: 'your-model'
};
}

Add new file format support in src/services/documentProcessor.js:
async extractYourFormat(buffer) {
// Implement format-specific extraction
return extractedText;
}

- Files not uploading
  - Check file size (max 50MB)
  - Verify the file format is supported
  - Check the network connection
- AI not responding
  - Verify API keys in `.env`
  - Check provider status
  - Try a different LLM provider
- Search not finding content
  - Adjust the similarity threshold
  - Check that the document was processed
  - Try different search terms
- WebSocket connection issues
  - Check the browser console for errors
  - Verify the server is running
  - Check firewall settings

NODE_ENV=development npm run dev

- Document Chunking: Adjust `CHUNK_SIZE` for your content
- Search Results: Tune `TOP_K_RESULTS` and `SIMILARITY_THRESHOLD`
- Memory Usage: Monitor with the `MAX_DOCUMENTS` setting
- API Costs: Use free providers for development
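To see how `CHUNK_SIZE` and `CHUNK_OVERLAP` interact, here is a minimal sketch of overlap-based chunking. `chunkText` is illustrative, not the project's implementation, and assumes the chunk size is larger than the overlap:

```javascript
// Illustrative fixed-size chunking with overlap between neighbors.
function chunkText(text, chunkSize = 500, overlap = 50) {
  const chunks = [];
  const step = chunkSize - overlap; // each chunk starts this far after the last
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached
  }
  return chunks;
}

// Example: 10-char chunks with a 3-char overlap
console.log(chunkText("abcdefghijklmnop", 10, 3));
// → [ 'abcdefghij', 'hijklmnop' ]
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence split at a chunk boundary still appears intact in at least one chunk.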
- Vector Database: Consider Pinecone, Weaviate for production
- File Storage: Add cloud storage for document persistence
- Caching: Implement Redis for session management
- Load Balancing: Use multiple server instances
# Fork the repository
git clone <your-fork>
cd rag-qa-chatbot
# Install dependencies
npm install
# Start development server
npm run dev
# Make your changes and test
# Submit a pull request

- Additional file format support
- New LLM provider integrations
- Enhanced UI components
- Performance optimizations
- Documentation improvements
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for GPT models
- Anthropic for Claude
- Google for Gemini
- Cohere for language models
- HuggingFace for free AI models
- Tailwind CSS for styling
- Express.js for backend framework
Built with ❤️ for the AI community
Need help? Open an issue or join our discussions!