Bilingual assistant for Moroccan administrative procedures, focused on steps, documents, costs, and timelines. (FastAPI + React)

drawliin/Chat-Bot-Administration

Chatbot-mso

Full-stack RAG chatbot for answering questions about administrative procedures (e.g. civil documents, ID cards, passports). It uses a FastAPI backend, a Vite + React frontend, Hugging Face (or OpenAI) for LLM calls, and a local vector DB for retrieval. Optional OCR (Tesseract) is available for image uploads.


Features

  • Multilingual Q&A — French and Arabic (and other languages the LLM supports)
  • Retrieval-augmented generation (RAG) — Local vector store + optional reranking
  • Dual LLM backends — Hugging Face Router (default) or OpenAI
  • Image OCR — Upload a document image; text is extracted with Tesseract and can be used in the conversation
  • Docker-based — Dev and production setups via Docker Compose

Project structure

Chatbot-mso/
├── backend/                 # FastAPI app (LLM, RAG, OCR)
│   ├── app/
│   │   ├── core/            # Prompts, config
│   │   └── services/        # LLM, OCR, vector retrieval
│   ├── main.py
│   ├── Dockerfile
│   └── Dockerfile.dev
├── frontend/                # Vite + React + Tailwind
│   ├── src/
│   ├── Dockerfile
│   └── Dockerfile.dev
├── data-pipeline/           # Builds vector DB from source data
├── data/
│   └── vectordb/            # Vector DB (created or downloaded)
├── deployment/
│   └── scripts/             # start.sh, stop.sh (production)
├── docker-compose-dev.yml   # Development (hot reload)
├── docker-compose-prod.yml  # Production
├── run.sh                   # Dev: build vectordb if needed, then docker-compose-dev up
└── .env / .env.example

Prerequisites

  • Docker and Docker Compose (v2)
  • (Optional) Prebuilt vector DB — speeds up first run; see Vector DB below

Quick start (development)

1. Clone and configure the environment

git clone <your-repo-url>
cd Chatbot-mso
cp .env.example .env

Edit .env and set at least:

  • Hugging Face (default): HF_TOKEN=your-hf-token
  • Or OpenAI: OPENAI_API_KEY=your-openai-key and switch backend in backend/main.py (see Choose LLM backend).
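
A minimal .env for the default Hugging Face backend could look like this (the values below are placeholders, not real credentials; see .env.example for the full variable list):

```
HF_TOKEN=hf_your_token_here
# Only needed if you switch to the OpenAI backend:
# OPENAI_API_KEY=your-openai-key
```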

2. Vector DB

You need a populated data/vectordb directory. Either:

  • Download prebuilt (recommended):
    Prebuilt vectordb (MediaFire)
    Unzip and put the contents directly in data/vectordb.
  • Or run without it: run.sh will build the vector DB via the data-pipeline (slower, first time only).

3. Start dev stack

From the repo root:

./run.sh

This will:

  • Build the vector DB with the data-pipeline image if data/vectordb is missing or empty
  • Start backend and frontend with docker-compose-dev.yml (hot reload)

Ports:

Service    URL
Frontend   http://localhost:5173
Backend    http://localhost:8000

Stop:

docker compose -f docker-compose-dev.yml down

Development workflow

Using Docker (recommended)

  • Start: ./run.sh (or docker compose -f docker-compose-dev.yml up -d if vectordb already exists)
  • Logs: docker compose -f docker-compose-dev.yml logs -f
  • Rebuild after dependency changes:
    docker compose -f docker-compose-dev.yml up -d --build

Code in backend/ and frontend/src/ is mounted into the containers, so edits are reflected without rebuilding (hot reload).

Running backend or frontend locally (no Docker)

  • Backend: Python 3.12, install deps from backend/requirements.txt and requirements.base.txt, set VECTORDB_PATH to ./data/vectordb, then run uvicorn from backend/ (e.g. uvicorn main:app --reload).
  • Frontend: From frontend/ run npm install && npm run dev. The app expects the API at http://localhost:8000 (see frontend/src/services/api.js).

Choose LLM backend

Default: Hugging Face Router via backend/app/services/llm_service_huggingface.py.

Switch to OpenAI:

  1. Add to .env:
    OPENAI_API_KEY=your-openai-key
  2. In backend/main.py, change the import to:
    from app.services.llm_service_openai import LLMService, SpecializedLLM
    Comment out or remove the llm_service_huggingface import.

To switch back to Hugging Face, use: from app.services.llm_service_huggingface import LLMService, SpecializedLLM.
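
The repo switches backends by editing the import line in backend/main.py. A different way to express the same choice, shown purely as a sketch (the LLM_BACKEND variable and the class names below are hypothetical, not part of this repo), is to select the implementation from an environment variable:

```python
import os

# Hypothetical stand-ins for the real service classes in
# backend/app/services/llm_service_huggingface.py and llm_service_openai.py.
class HuggingFaceLLMService:
    name = "huggingface"

class OpenAILLMService:
    name = "openai"

def pick_llm_service(env=None):
    """Pick an LLM backend from a (hypothetical) LLM_BACKEND variable.

    Defaults to Hugging Face, matching the repo's default import.
    """
    env = os.environ if env is None else env
    if env.get("LLM_BACKEND", "huggingface").lower() == "openai":
        return OpenAILLMService()
    return HuggingFaceLLMService()
```

This pattern keeps both imports in place and avoids editing code to switch; the repo itself does not implement it, so treat it as a design alternative rather than documented behavior.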


Configuration (.env)

Variable                              Description
HF_TOKEN                              Hugging Face API token (required for the HF backend)
OPENAI_API_KEY                        OpenAI API key (required for the OpenAI backend)
HF_MODEL / OPENAI_MODEL               Model name for the selected backend
HF_TEMPERATURE / OPENAI_TEMPERATURE   Sampling temperature
RETRIEVAL_K                           Number of chunks to retrieve from the vector DB
RETRIEVAL_RERANK_TOP_N                Number of chunks kept after reranking
RERANK_ENABLED                        Enable reranking (1 or 0)
MAX_CONTEXT_CHARS                     Max characters of retrieved context sent to the LLM
MAX_QUESTION_LENGTH                   Max question length

See .env.example for the full list and defaults.
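
To make the retrieval knobs concrete, here is a small sketch of how RETRIEVAL_K and MAX_CONTEXT_CHARS typically interact in a RAG pipeline; it is illustrative only, and the actual logic in backend/app/services/ may differ:

```python
def build_context(chunks, k=5, max_chars=4000):
    """Join the top-k retrieved chunks, then cap total context length.

    k plays the role of RETRIEVAL_K and max_chars the role of
    MAX_CONTEXT_CHARS from the table above (illustrative sketch only).
    """
    selected = chunks[:k]              # chunks are assumed sorted by retrieval score
    context = "\n\n".join(selected)
    return context[:max_chars]         # hard cap before prompting the LLM
```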


Vector DB

  • Prebuilt: Download from the link above, unzip into data/vectordb.
  • Build from scratch: Run ./run.sh with an empty (or missing) data/vectordb; the data-pipeline container will index and populate it. This can take a long time (e.g. hours) depending on data size.
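
Indexing in the data-pipeline boils down to splitting source documents into chunks and embedding each one into the vector DB. The chunking step can be sketched as below; the size and overlap values are arbitrary, and the pipeline's real parameters and code are not shown here:

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into fixed-size chunks with overlap, the usual
    pre-embedding step in a RAG indexing pipeline (illustrative sketch)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    # Each chunk starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters.
    return [text[i:i + size] for i in range(0, len(text), step)]
```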

Production

  • Start: ./deployment/scripts/start.sh (uses docker-compose-prod.yml)
  • Stop: ./deployment/scripts/stop.sh

Production frontend is served on port 8080; backend remains on 8000.


Sample questions

You can try questions like:

  • Comment obtenir un extrait d'acte de naissance ? (How do I obtain a birth certificate extract?)
  • كيف أحصل على نسخة من رسم الولادة؟ (How do I get a copy of the birth record?)
  • ما هي الوثائق المطلوبة لتجديد البطاقة الوطنية؟ (What documents are required to renew the national ID card?)
  • نسخة من السجل العدلي بالنسبة للمغاربة المقيمين بالخارج (A copy of the criminal record for Moroccans living abroad)
  • الحصول على جواز السفر البيومتري بالنسبة للقاصرين أقل من 12 سنة بالمغرب (Obtaining a biometric passport for minors under 12 in Morocco)
  • تجديد البطاقة الوطنية للتعريف (Renewing the national identity card)
