Intelligent Mathematical Problem Solving Pipeline
From whiteboard image → OCR → AI normalization → Wolfram evaluation → spoken result (Pepper robot)
Math Robot AI is a distributed system that:
- Captures a whiteboard image (Pepper robot or API upload)
- Detects mathematical expressions
- Converts them to LaTeX (Pix2Text OCR)
- Cleans and normalizes LaTeX using an LLM (Ollama with Qwen2.5 3B)
- Converts to Wolfram syntax
- Evaluates using Wolfram Kernel (via proxy)
- Returns structured results
- Generates HTML output
- Speaks the result via Pepper robot
.
├── math-robot-api/            # Main FastAPI backend
│   ├── app/
│   │   ├── controllers/       # API endpoints
│   │   ├── services/          # Core business logic
│   │   ├── schemas/           # Pydantic models
│   │   ├── models/            # Internal domain models
│   │   ├── middlewares/       # Logging middleware
│   │   ├── config.py
│   │   └── main.py
│   ├── Dockerfile
│   └── requirements.txt
│
├── math-robot-client/         # Pepper robot client
│   ├── main.py
│   └── config.py
│
├── wolfram-proxy/             # Wolfram evaluation service
│   ├── main.py
│   └── requirements.txt
│
├── infrastructure/
│   ├── docker-compose.yml
│   ├── docker-compose-school.yml
│   └── example.env
│
└── yolo_data/
    └── best.pt                # YOLO model for problem detection
ollama list
ollama pull qwen2.5:3b
This downloads the Qwen2.5 3B model. Run this before first use or if the model is missing.
git clone <repository-url>
cd math-robot-api
cd infrastructure
cp example.env .env
Edit .env if needed.
docker-compose up -d
For school mode, use the school compose file instead:
docker-compose -f docker-compose-school.yml up -d
School mode includes:
- Full pipeline services
- Preconfigured classroom setup
⚠️ The Wolfram Proxy must be started manually in a separate terminal.
cd wolfram-proxy
python3 -m venv venv
Linux / macOS:
source venv/bin/activate
pip install -r requirements.txt
python main.py
If successful, you should see:
Running on http://0.0.0.0:8010
⚠️ Make sure Wolfram Engine is installed and the kernel path matches:
WolframLanguageSession("/usr/local/bin/WolframKernel")
First startup may take several minutes because:
- Pix2Text model initializes
- Ollama model (Qwen2.5 3B) loads
- YOLO weights are loaded
- Wolfram session initializes
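Because startup can take minutes, a client may want to wait for the services instead of failing on the first request. The following is a minimal sketch of such a wait loop; `wait_until_ready` and `wolfram_proxy_healthy` are hypothetical helpers, and it is an assumption that `GET /health` on the proxy answers with HTTP 200 once the Wolfram session is up.

```python
import time
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool],
                     timeout_s: float = 300.0,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Call probe() repeatedly until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused or timed out: the service is still starting.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)


def wolfram_proxy_healthy() -> bool:
    """Assumed readiness check: /health returns 200 when the proxy is usable."""
    with urllib.request.urlopen("http://localhost:8010/health", timeout=5) as resp:
        return resp.status == 200
```

A caller could then run `wait_until_ready(wolfram_proxy_healthy)` before sending the first image.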
| Service | Port | Description |
|---|---|---|
| math-robot-api | 8000 | Main FastAPI backend |
| wolfram-proxy | 8010 | Wolfram evaluation service |
API docs available at:
http://localhost:8000/docs
API uses Basic Authentication.
Default (example):
username: test
password: test
⚠️ Change credentials in production.
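For reference, a Basic Authentication header is the Base64 encoding of `username:password` prefixed with `Basic`. The helper below is an illustrative sketch (`basic_auth_header` is not part of the project) showing how a client could build it with the standard library.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the Authorization header for HTTP Basic Authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Example credentials from this README; replace them in production.
print(basic_auth_header("test", "test"))
# {'Authorization': 'Basic dGVzdDp0ZXN0'}
```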
Pepper client sends:
Authorization: Basic base64("test:test")
The PipelineService orchestrates:
- YOLO model detects problem regions
- Extracts individual problem images
- Pix2Text converts image → LaTeX
- Ollama (Qwen2.5 3B)
  - Fixes syntax
  - Normalizes structure
  - Converts to Wolfram syntax
- Sends to wolfram-proxy
- Evaluates via Wolfram Kernel
- LLM cleans output
  - Removes unnecessary formatting
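The stages above can be pictured as a chain of functions applied to each detected problem. The sketch below is not the actual `PipelineService`; the stage callables (`detect`, `ocr`, `normalize`, `to_wolfram`, `evaluate`, `clean`) are hypothetical stand-ins for YOLO, Pix2Text, Ollama, and the Wolfram proxy, and only the result field names are taken from the API response documented in this README.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ProblemResult:
    problem_id: int
    latex_raw: str = ""
    latex_filtered: str = ""
    result_wolfram: str = ""
    result_filtered: str = ""
    success: bool = False


def run_pipeline(image: bytes,
                 detect: Callable[[bytes], Iterable[bytes]],
                 ocr: Callable[[bytes], str],
                 normalize: Callable[[str], str],
                 to_wolfram: Callable[[str], str],
                 evaluate: Callable[[str], str],
                 clean: Callable[[str], str]) -> list[ProblemResult]:
    """Run every detected problem region through the stages in order."""
    results = []
    for problem_id, crop in enumerate(detect(image), start=1):
        result = ProblemResult(problem_id=problem_id)
        try:
            result.latex_raw = ocr(crop)
            result.latex_filtered = normalize(result.latex_raw)
            result.result_wolfram = evaluate(to_wolfram(result.latex_filtered))
            result.result_filtered = clean(result.result_wolfram)
            result.success = True
        except Exception:
            # A failure in one problem should not abort the remaining ones.
            result.success = False
        results.append(result)
    return results
```

Passing the stages in as callables keeps the sketch testable without loading any model; the real service may be structured differently.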
After pipeline execution:
HtmlService.save_problem(...)
Generates:
- Structured HTML file
- Saved in public directory
- Accessible via:
/public/index.html
This allows:
- Viewing results on the tablet
- Showing the last solved problem
- Printing helpful information for debugging
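To illustrate the idea of writing one result to a static page, here is a rough sketch. It is not the project's `HtmlService`; `render_problem_html` and `save_problem_html` are hypothetical names, the page layout is invented for the example, and the only details taken from this README are the result field names and the `index.html` file in a public directory.

```python
import html
from pathlib import Path


def render_problem_html(result: dict) -> str:
    """Render selected result fields as a small HTML table, escaping all values."""
    rows = "".join(
        f"<tr><th>{html.escape(key)}</th><td>{html.escape(str(result.get(key, '')))}</td></tr>"
        for key in ("problem_id", "latex_filtered", "result_filtered")
    )
    return f"<!doctype html><html><body><table>{rows}</table></body></html>"


def save_problem_html(result: dict, public_dir: Path) -> Path:
    """Write the rendered page to <public_dir>/index.html and return its path."""
    public_dir.mkdir(parents=True, exist_ok=True)
    target = public_dir / "index.html"
    target.write_text(render_problem_html(result), encoding="utf-8")
    return target
```

Escaping matters here because OCR output and Wolfram results can contain characters such as `<` and `&`.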
Located in:
math-robot-client/
- Waits for head touch
- Captures camera image
- Sends image via multipart/form-data
- Receives result
- Speaks solution
- Displays HTML on tablet
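The "receives result, speaks solution" steps come down to turning the API response into a sentence for the robot's text-to-speech. The function below is a sketch under that assumption; `spoken_summary` and its wording are hypothetical, while the field names (`results`, `problem_id`, `result_filtered`, `success`) follow the response format documented in this README.

```python
def spoken_summary(response: dict) -> str:
    """Turn a pipeline response into text suitable for text-to-speech."""
    sentences = []
    for item in response.get("results", []):
        if item.get("success"):
            sentences.append(
                f"Problem {item['problem_id']}: the answer is {item['result_filtered']}."
            )
        else:
            sentences.append(f"Problem {item['problem_id']}: I could not solve this one.")
    if not sentences:
        return "I did not find any problems on the board."
    return " ".join(sentences)
```

The resulting string would then be handed to whatever speech call the Pepper client uses.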
Located in:
wolfram-proxy/
Lightweight Flask service that:
- Maintains a persistent WolframLanguageSession
- Evaluates Wolfram code
- Exposes:
GET /eval?code=...
GET /health
Required for full pipeline functionality.
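Wolfram code usually contains characters such as `[`, `^`, and spaces, so it has to be URL-encoded before it goes into the `code` query parameter. A minimal client sketch follows; `build_eval_url` and `evaluate` are hypothetical helpers, the base URL uses the port listed in this README, and the response is returned as raw text because the proxy's response format is not specified here.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

PROXY_BASE_URL = "http://localhost:8010"  # port from the services table


def build_eval_url(code: str, base_url: str = PROXY_BASE_URL) -> str:
    """Return the /eval URL with the Wolfram code safely URL-encoded."""
    return f"{base_url}/eval?{urlencode({'code': code})}"


def evaluate(code: str, timeout_s: float = 30.0) -> str:
    """Send the code to the proxy and return the raw response body as text."""
    with urlopen(build_eval_url(code), timeout=timeout_s) as resp:
        return resp.read().decode("utf-8")


print(build_eval_url("Integrate[x^2, x]"))
# http://localhost:8010/eval?code=Integrate%5Bx%5E2%2C+x%5D
```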
- target_regions: expected number of expressions (1–20)
- file: whiteboard image (multipart/form-data)
{
"total_problems": 1,
"successful": 1,
"failed": 0,
"results": [
{
"problem_id": 1,
"latex_raw": "...",
"latex_filtered": "...",
"result_wolfram": "...",
"result_filtered": "...",
"success": true
}
],
"processing_time": 3.42
}
- YOLO model file must exist:
yolo_data/best.pt
- Wolfram Kernel path must match:
WolframLanguageSession("/usr/local/bin/WolframKernel")
- For school demo → always use docker-compose-school.yml
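The two file requirements above can be checked before starting the services. The following preflight sketch is not part of the project; `missing_prerequisites` is a hypothetical helper that assumes it is given the repository root and that the kernel lives at the path shown above.

```python
from pathlib import Path

DEFAULT_KERNEL = Path("/usr/local/bin/WolframKernel")  # path used in this README


def missing_prerequisites(repo_root: Path,
                          kernel_path: Path = DEFAULT_KERNEL) -> list[str]:
    """Return a human-readable list of missing files (empty if all are present)."""
    problems = []
    weights = repo_root / "yolo_data" / "best.pt"
    if not weights.is_file():
        problems.append(f"YOLO weights not found: {weights}")
    if not kernel_path.is_file():
        problems.append(f"Wolfram Kernel not found: {kernel_path}")
    return problems
```

Running `missing_prerequisites(Path("."))` from the repository root and printing the result gives a quick readiness check before `docker-compose up`.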