A production-ready Retrieval-Augmented Generation (RAG) chatbot with full-duplex voice conversations, email notifications, long-term memory, and advanced UI features, completely FREE using the Groq API.
- 100% FREE - Uses Groq's free API (70k tokens/day with Llama 3.1 8B Instant)
- Full Duplex Voice Chat - Speak while AI responds, interrupt anytime, natural conversations
- Email Notifications - Get notified via email for chat events
- Long-Term Memory - Remembers facts, preferences, and conversation context
- Chat Export - Export conversations in JSON, TXT, Markdown, or HTML
- Smart UI - Context-based emojis, code highlighting, typing animation
# Clone and install
git clone https://github.com/Naieem-55/chat-bot
cd chat-bot
pip install -r requirements.txt
# Configure (get FREE key from https://console.groq.com/keys)
cp .env.example .env
# Edit .env and add: HUGGINGFACE_API_KEY=your_groq_api_key
# Run
uvicorn src.api.main:app --host 0.0.0.0 --port 8002 --reload
Open frontend/index.html in your browser and start chatting!
- RAG Pipeline: Retrieval-Augmented Generation for context-aware responses
- FAISS Vector Store: Fast semantic search with FAISS
- BM25 Hybrid Search: Combines semantic + keyword search
- Query Reformulation: Improves search accuracy
- Session Management: Multi-turn conversations with history
- Document Ingestion: Upload and index PDF, TXT, HTML, MD files
- Hallucination Detection: Flags potentially inaccurate responses
- Full Duplex Voice: Real-time 2-way voice conversations
- Voice Input/Output: Speak queries and hear responses
- Chat Export: Export in 4 formats (JSON, TXT, MD, HTML)
- Pin Conversations: Keep important chats at the top
- Context Emojis: Auto-adds relevant emojis
- Code Highlighting: Beautiful code blocks with copy button
User Query → Voice/Text Input → Embedding → Hybrid Search (Vector + BM25)
↓
Context Retrieval + Memory
↓
Groq API (FREE Llama 3.1)
↓
Response + Voice Output + Notifications
See ARCHITECTURE.md for detailed system design.
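The hybrid search step in the diagram merges a vector ranking with a BM25 ranking. This README does not say how the two are combined, so the sketch below uses reciprocal rank fusion (RRF) as one common option. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the document IDs are illustrative assumptions, not the project's actual code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score means a better match.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers, best match first.
vector_hits = ["doc_returns", "doc_shipping", "doc_payments"]
bm25_hits = ["doc_payments", "doc_returns", "doc_faq"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
# ['doc_returns', 'doc_payments', 'doc_shipping', 'doc_faq']
```

Documents found by both retrievers rise to the top, which is the usual motivation for pairing semantic and keyword search.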
chatbot/
├── src/
│ ├── api/main.py # FastAPI application
│ ├── data_ingestion/ # Document loading & processing
│ ├── vector_store/ # FAISS vector store
│ ├── retrieval/ # Hybrid retrieval (BM25 + semantic)
│ ├── llm/ # Groq/Claude API clients
│ ├── session/ # Session management
│ ├── memory/ # Long-term memory system
│ ├── feedback/ # Feedback & hallucination detection
│ └── notifications/ # Email & webhook system
├── frontend/
│ ├── index.html # Main chat UI
│ ├── style.css # Dark theme styling
│ └── chat.js # Chat logic
├── data/
│ ├── documents/ # Your documents
│ ├── vector_store/ # FAISS index
│ └── memory/ # Memory storage
└── requirements.txt
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Health check |
| `/session/create` | POST | Create new session |
| `/chat` | POST | Send message |
| `/sessions/list` | GET | List all sessions |
| `/session/{id}/history` | GET | Get conversation history |
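As a rough illustration of calling the `/chat` endpoint from Python, the sketch below builds the request with the standard library only. The `message` and `session_id` fields mirror the curl example in this README and the port matches the uvicorn command. The helper names `build_chat_request` and `send`, and the assumption that the reply body is JSON, are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8002"  # port taken from the uvicorn command


def build_chat_request(message, session_id, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /chat endpoint."""
    payload = json.dumps({"message": message, "session_id": session_id})
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request; assumes the server replies with a JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request("Hello", "test-session")
print(request.get_method(), request.full_url)
# Call send(request) only while the backend is running.
```

Keeping request construction separate from sending makes the payload easy to inspect without a live server.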
| Endpoint | Method | Description |
|---|---|---|
| `/memory/{id}/facts` | GET | Get stored facts |
| `/feedback` | POST | Submit feedback |
| `/feedback/stats` | GET | Get feedback statistics |
| Endpoint | Method | Description |
|---|---|---|
| `/documents/upload` | POST | Upload document |
| `/documents/list` | GET | List documents |
| `/webhooks/email/configure` | POST | Configure email |
| `/webhooks/create` | POST | Create webhook |
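A server-side check in front of `/documents/upload` might look like the sketch below. The allowed extensions follow the formats this README lists for ingestion (PDF, TXT, HTML, MD). The 10 MiB cap and the names `validate_upload` and `MAX_UPLOAD_BYTES` are assumptions, and the check covers only file name and size, not MIME type or content.

```python
from pathlib import Path

# Extensions follow the README's supported formats (PDF, TXT, HTML, MD).
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".html", ".md"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB cap, not from the project


def validate_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload."""
    # Keep only the final path component so "../x.pdf" is treated as "x.pdf".
    safe_name = Path(filename).name
    extension = Path(safe_name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"unsupported file type: {extension or 'none'}"
    if size_bytes <= 0:
        return False, "file is empty"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "file exceeds size limit"
    return True, "ok"


print(validate_upload("guide.pdf", 1024))
print(validate_upload("run.exe", 1024))
```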
Key options in .env:
| Variable | Description | Default |
|---|---|---|
| `HUGGINGFACE_API_KEY` | Groq API key (FREE) | Required |
| `TOP_K_DOCUMENTS` | Documents to retrieve | 5 |
| `CHUNK_SIZE` | Text chunk size | 500 |
| `EMBEDDING_MODEL` | Sentence transformer | BAAI/bge-base-en-v1.5 |
| `SMTP_SERVER` | Email server | smtp.gmail.com |
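One way these options could be read at startup is sketched below, using the variable names and defaults from the table. The `Settings` dataclass and `load_settings` helper are hypothetical and may not match how the project actually loads its configuration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    top_k_documents: int
    chunk_size: int
    embedding_model: str
    smtp_server: str


def load_settings(env=None):
    """Read settings from an environment mapping, applying the table's defaults."""
    env = os.environ if env is None else env
    api_key = env.get("HUGGINGFACE_API_KEY", "").strip()
    if not api_key:
        # The only variable the table marks as required.
        raise ValueError("HUGGINGFACE_API_KEY is required")
    return Settings(
        api_key=api_key,
        top_k_documents=int(env.get("TOP_K_DOCUMENTS", "5")),
        chunk_size=int(env.get("CHUNK_SIZE", "500")),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-base-en-v1.5"),
        smtp_server=env.get("SMTP_SERVER", "smtp.gmail.com"),
    )


settings = load_settings(
    {"HUGGINGFACE_API_KEY": "your_groq_api_key", "TOP_K_DOCUMENTS": "3"}
)
print(settings.top_k_documents, settings.chunk_size)  # 3 500
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test without touching the real environment.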
- Store API keys in environment variables (never commit `.env`)
- Validate all user inputs server-side
- Sanitize file uploads (limit size, check MIME types)
- Use HTTPS in production
- Groq API: 70k tokens/day (free tier)
- Implement request rate limiting (recommended: 10 req/min per user)
- Session IDs are UUID v4 (cryptographically secure)
- Sessions stored server-side only
- Consider adding user authentication for production
- Enable HTTPS only
- Add user authentication
- Implement rate limiting middleware
- Configure CORS properly
- Set up request logging
- Enable CSRF protection
- Use Redis for session storage
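The rate-limiting recommendation above (10 requests per minute per user) can be sketched as a sliding-window limiter. This is an in-memory, single-process illustration only. The class name `SlidingWindowLimiter` and its parameters are assumptions, and wiring it into FastAPI middleware is left out.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each user."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # user_id -> timestamps of recent requests

    def allow(self, user_id):
        now = self.clock()
        hits = self._hits[user_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
results = [limiter.allow("user-1") for _ in range(11)]
print(results.count(True), results[-1])  # 10 False
```

State held in a Python dict is lost on restart and is not shared between worker processes, so a multi-worker deployment would need a shared store instead.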
# Test health endpoint
curl http://localhost:8002/health
# Test chat
curl -X POST http://localhost:8002/chat \
-H "Content-Type: application/json" \
-d '{"message": "Hello", "session_id": "test-session"}'
- "How do I track my order?"
- "What is your return policy?"
- "What payment methods do you accept?"
- Click the phone button
- Say: "What payment methods do you accept?"
- Try interrupting mid-response
- Continue conversation naturally
| Metric | Value |
|---|---|
| Vector search | <50ms |
| BM25 search | <20ms |
| LLM generation | 1-3 seconds |
| Total response | 1.5-3.5 seconds |
- Reduce `TOP_K_DOCUMENTS` for faster responses
- Use smaller chunk sizes for quicker retrieval
- Enable BM25 for better keyword matching
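To make the chunk-size tip concrete, the sketch below splits text into overlapping windows. This README does not say whether `CHUNK_SIZE` counts characters or tokens, so the sketch assumes characters. The `overlap` parameter and the `chunk_text` name are likewise assumptions rather than the project's real chunker.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into character windows of `chunk_size` that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once this window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "x" * 1200
print(len(chunk_text(sample)))                  # 3
print(len(chunk_text(sample, chunk_size=250)))  # 6
```

Halving the chunk size roughly doubles the number of indexed chunks, so each retrieved chunk is shorter but the index holds more entries.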
| Issue | Solution |
|---|---|
| Backend won't start | Check if port 8002 is in use: `netstat -ano \| findstr :8002` |
| Voice not working | Use Chrome/Edge; enable microphone permissions |
| Email not sending | Use Gmail App Password, not regular password |
| Memory not persisting | Check data/memory/ directory permissions |
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-feature`)
- Write tests for new functionality
- Commit your changes (`git commit -m 'Add new feature'`)
- Push to the branch (`git push origin feature/new-feature`)
- Open a Pull Request
# Install dev dependencies
pip install -r requirements.txt
pip install pytest pytest-asyncio httpx
# Run tests
pytest tests/
- Follow PEP 8 guidelines
- Add docstrings to functions
- Type hints are encouraged
- Voice chat requires Chrome/Edge (Firefox has limited support)
- Groq free tier: 70k tokens/day limit
- Memory storage is file-based (consider Redis for production)
- No built-in user authentication
MIT License - free for commercial and personal use.
- Groq for providing a FREE LLM API
- FAISS team for vector search
- Sentence Transformers community