ayebkalil/ai-tutoring-system

AI Tutoring System

An intelligent tutoring system powered by Retrieval-Augmented Generation (RAG) that provides Q&A functionality and automatic quiz generation from PDF documents.

Features

  • 📚 Document Management: Upload and manage PDF documents
  • 💬 Q&A System: Ask questions about uploaded documents using RAG with real-time HTMX updates
  • 📝 QCM Generation: Automatically generate multiple-choice questions (MCQ) from documents
  • 📊 White Tests: Create comprehensive mock exams ("white tests") covering multiple topics
  • 🌐 Bilingual Support: Supports both English and French
  • 🚀 GPU Optimized: 4-bit quantization for efficient model loading
  • 🎨 Modern UI: Glass morphism design with gradient backgrounds and smooth animations
  • 📱 Responsive Design: Works seamlessly on desktop and mobile devices

Architecture

  • Backend: FastAPI with RAG pipeline

    • Mistral-7B-Instruct for Q&A
    • Phi-3-mini for QCM generation
    • BGE-M3 for embeddings
    • FAISS for vector storage
  • Frontend: Django web application
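The retrieve-then-generate flow behind this architecture can be sketched in plain Python. This is a toy illustration only: the real backend uses BGE-M3 embeddings, a FAISS index, and Mistral-7B, while here they are replaced with hand-made vectors, a brute-force cosine search, and a prompt string; all names are illustrative, not taken from the codebase.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, top_k=2):
    """Stand-in for a FAISS similarity search over chunk embeddings."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return [item["text"] for item in scored[:top_k]]

def answer(query, query_vec, index):
    """Stand-in for prompting the Q&A model with the retrieved context."""
    context = "\n".join(retrieve(query_vec, index))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Tiny fake "index": each entry pairs a document chunk with its embedding.
index = [
    {"text": "Gradient descent minimizes a loss.", "vec": [1.0, 0.0]},
    {"text": "Paris is the capital of France.",    "vec": [0.0, 1.0]},
]
prompt = answer("What does gradient descent do?", [0.9, 0.1], index)
```

In the real pipeline, the prompt assembled from the top-ranked chunks is what gets passed to the Q&A model, and a reranker (BGE-reranker-v2-m3) can reorder candidates before generation.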

Prerequisites

  • Python 3.11+
  • CUDA-capable GPU (recommended, but CPU works too)
  • 16GB+ RAM (32GB recommended)
  • At least 20GB free disk space for models

Quick Start

Run the automated setup script:

Windows:

setup.bat

This will create all virtual environments, install dependencies, and run initial migrations.

Recent Improvements

  • Automated Setup: Added setup.bat script for Windows to automate environment creation and dependency installation
  • Fixed Dependencies: Added missing spacy package to backend requirements
  • Enhanced Error Handling: Improved exception handling across all API endpoints
  • Basic Testing: Added initial test suite for API endpoints
  • Modern UI/UX: Complete frontend redesign with:
    • Glass morphism effects and gradient backgrounds
    • HTMX for real-time chat updates without page refreshes
    • Enhanced loading indicators and animations
    • Responsive design with modern styling
    • Improved accessibility and user feedback

Installation

Manual Setup

If you prefer manual setup, follow these steps:

Backend Setup

  1. Clone the repository and enter the backend folder:
git clone https://github.com/ayebkalil/ai-tutoring-system.git
cd ai-tutoring-system/backend
  2. Create a virtual environment:
python -m venv venv_backend
source venv_backend/bin/activate  # On Windows: venv_backend\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure the environment:
cp .env.example .env
# Edit .env with your preferred settings
  5. Download models. The models are downloaded from HuggingFace automatically on first run:
  • mistralai/Mistral-7B-Instruct-v0.3 (~14GB)
  • microsoft/Phi-3-mini-4k-instruct (~7GB)
  • BAAI/bge-m3 (~600MB)
  • BAAI/bge-reranker-v2-m3 (~400MB)
  6. Create the data folder:
mkdir data
# Add your PDF files to the data folder
  7. Run the backend:
uvicorn app.main:app --reload --host 0.0.0.0 --port 8001

The API will be available at http://localhost:8001

API Documentation: http://localhost:8001/docs

Frontend Setup

  1. Navigate to the frontend:
cd frontend
  2. Create a virtual environment:
python -m venv venv_frontend
source venv_frontend/bin/activate  # On Windows: venv_frontend\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure the backend URL: update core/utils.py with your backend API URL if it differs from the default.
  5. Run migrations:
python manage.py migrate
  6. Create a superuser (optional):
python manage.py createsuperuser
  7. Run the frontend:
python manage.py runserver

Frontend will be available at http://localhost:8000

Configuration

Environment Variables

Key configuration options in backend/.env:

  • DATA_FOLDER: Folder containing PDF documents (default: data)
  • MAX_FILE_SIZE_MB: Maximum file upload size (default: 10)
  • USE_4BIT: Use 4-bit quantization (default: true)
  • DEVICE: Device selection - auto, cuda, or cpu (default: auto)
  • ALLOWED_ORIGINS: CORS allowed origins (comma-separated)
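To show how these variables are typically consumed, here is a small sketch of boolean and list parsing with the standard library. The variable names come from the list above, but the parsing logic is an assumption, not the project's actual config.py.

```python
import os

def env_bool(name: str, default: bool) -> bool:
    """Parse a boolean-ish environment variable such as USE_4BIT=true."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}

def resolve_device(default: str = "auto") -> str:
    """DEVICE=auto would normally pick CUDA when available; 'cpu' stands in here."""
    device = os.getenv("DEVICE", default)
    if device == "auto":
        return "cpu"  # real code would check torch.cuda.is_available()
    return device

def allowed_origins() -> list[str]:
    """Split the comma-separated ALLOWED_ORIGINS value into a list for CORS."""
    raw = os.getenv("ALLOWED_ORIGINS", "")
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```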

Model Configuration

You can change models in .env:

  • QA_MODEL: Model for Q&A (default: mistralai/Mistral-7B-Instruct-v0.3)
  • QCM_MODEL: Model for quiz generation (default: microsoft/Phi-3-mini-4k-instruct)
  • EMBEDDING_MODEL: Embedding model (default: BAAI/bge-m3)
  • RERANKER_MODEL: Reranking model (default: BAAI/bge-reranker-v2-m3)
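Putting the options above together, a backend/.env might look like the following. The values shown are the documented defaults, except ALLOWED_ORIGINS, which assumes the frontend runs on port 8000; treat this as a sketch rather than a verbatim copy of .env.example.

```ini
# Data and uploads
DATA_FOLDER=data
MAX_FILE_SIZE_MB=10

# Hardware
USE_4BIT=true
DEVICE=auto

# CORS (assumes the Django frontend on port 8000)
ALLOWED_ORIGINS=http://localhost:8000

# Models
QA_MODEL=mistralai/Mistral-7B-Instruct-v0.3
QCM_MODEL=microsoft/Phi-3-mini-4k-instruct
EMBEDDING_MODEL=BAAI/bge-m3
RERANKER_MODEL=BAAI/bge-reranker-v2-m3
```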

API Endpoints

Q&A

  • POST /qa/ask - Ask a question about documents

Quiz Generation

  • POST /qcm/generate - Generate QCM for a topic
  • POST /qcm/white-test - Generate white test for multiple topics

Documents

  • POST /documents/upload - Upload PDF documents
  • GET /documents/list - List uploaded documents
  • DELETE /documents/delete/{filename} - Delete a document

Health

  • GET /health - Check system health and model status

See the full API documentation at the /docs endpoint.

Usage Examples

Ask a Question

curl -X POST "http://localhost:8001/qa/ask" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is machine learning?"}'
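The same request can be made from Python with only the standard library. The endpoint and JSON payload match the curl example above; the helper name is illustrative, and actually sending the request of course requires the backend to be running on port 8001.

```python
import json
import urllib.request

def qa_request(query: str, base_url: str = "http://localhost:8001") -> urllib.request.Request:
    """Build the POST /qa/ask request; pass it to urllib.request.urlopen to send."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/qa/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = qa_request("What is machine learning?")
# with urllib.request.urlopen(req) as resp:   # requires the backend to be running
#     print(json.loads(resp.read()))
```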

Generate QCM

curl -X POST "http://localhost:8001/qcm/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "Machine Learning",
    "num_questions": 5,
    "lang": "en"
  }'

Upload Document

curl -X POST "http://localhost:8001/documents/upload" \
  -F "files=@document.pdf"

Project Structure

ai_tutoring_system/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI app entry point
│   │   ├── config.py            # Configuration management
│   │   ├── routers/             # API route handlers
│   │   │   ├── ask.py          # Q&A endpoints
│   │   │   ├── qcm.py          # Quiz generation endpoints
│   │   │   ├── documents.py    # Document management
│   │   │   └── health.py       # Health checks
│   │   ├── schemas/            # Pydantic models
│   │   └── utils/
│   │       └── rag_pipeline.py # Core RAG logic
│   ├── data/                   # PDF documents storage
│   ├── requirements.txt        # Python dependencies
│   └── .env.example           # Environment template
├── frontend/
│   ├── core/                   # Django app
│   ├── templates/              # HTML templates
│   └── manage.py
└── README.md                   # This file

Troubleshooting

Out of Memory Errors

  • Enable 4-bit quantization: Set USE_4BIT=true in .env
  • Use CPU: Set DEVICE=cpu in .env
  • Reduce batch sizes in config.py

Model Download Issues

  • Check internet connection
  • Verify HuggingFace access (some models may require authentication)
  • Models are cached, so download happens only once

Vectorstore Build Fails

  • Ensure PDF files are valid and not corrupted
  • Check that DATA_FOLDER exists and is readable
  • Verify sufficient disk space
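A quick way to spot obviously broken files before rebuilding the vectorstore is to check each file's magic bytes. This is a heuristic sketch, not part of the project: a valid PDF starts with b'%PDF-', so anything failing this check is certainly not a usable PDF (though passing it does not guarantee the file parses cleanly).

```python
from pathlib import Path

def looks_like_pdf(path: Path) -> bool:
    """Heuristic check: a valid PDF file starts with the b'%PDF-' magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(5) == b"%PDF-"
    except OSError:
        return False

def find_suspect_files(folder: str = "data") -> list[Path]:
    """Return .pdf files in the data folder that do not look like real PDFs."""
    return [p for p in sorted(Path(folder).glob("*.pdf")) if not looks_like_pdf(p)]
```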

Performance Tips

  1. GPU Memory: Use 4-bit quantization (USE_4BIT=true) for models
  2. CPU Memory: Force embeddings to CPU (already configured)
  3. Batch Processing: Documents are processed in batches automatically
  4. Caching: Models and embeddings are cached for faster subsequent runs

Development

Running Tests

# Backend tests
cd backend
source venv_backend/bin/activate  # On Windows: venv_backend\Scripts\activate
python -m pytest tests/ -v

# Frontend tests (when implemented)
cd frontend
source venv_frontend/bin/activate  # On Windows: venv_frontend\Scripts\activate
python manage.py test

Code Style

The project follows PEP 8 guidelines. Consider using:

  • black for code formatting
  • flake8 or pylint for linting
  • mypy for type checking

License

[Add your license here]

Contributing

[Add contributing guidelines here]

Support

For issues and questions, please open an issue on the repository.
