This project implements an anomaly detection model to identify students at risk of failing. It follows MLOps best practices, including experiment tracking, model versioning, and automated pipelines.
- Anomaly Detection: Isolation Forest model to detect unusual student behavior.
- Risk Assessment: Hybrid scoring combining ML anomaly scores with rule-based heuristics.
- Experiment Tracking: Weights & Biases (W&B) integration for metric logging.
- Model Registry: Automatic versioning of trained models using W&B Artifacts.
- CI/CD: GitHub Actions pipeline for automated testing and checks.
- API: Flask-based REST API for real-time and batch predictions.
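As a rough illustration of the hybrid scoring idea, the sketch below blends an anomaly score with a few rule-based checks. The field names, thresholds, and the 50/50 weighting are hypothetical and are not taken from this project's code. The anomaly score is assumed to be already scaled to [0, 1], with higher values meaning more unusual behavior.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    # Hypothetical features; the real project may use different ones.
    attendance_rate: float    # fraction of sessions attended, 0.0 to 1.0
    average_grade: float      # 0 to 100
    missed_assignments: int


def rule_score(student: StudentRecord) -> float:
    """Sum penalties from simple heuristics, capped at 1.0."""
    score = 0.0
    if student.attendance_rate < 0.6:      # assumed threshold
        score += 0.4
    if student.average_grade < 50:         # assumed threshold
        score += 0.4
    if student.missed_assignments >= 3:    # assumed threshold
        score += 0.2
    return min(score, 1.0)


def hybrid_risk(anomaly_score: float, student: StudentRecord,
                ml_weight: float = 0.5) -> float:
    """Weighted blend of the ML anomaly score and the rule-based score."""
    clipped = min(max(anomaly_score, 0.0), 1.0)  # keep the ML score in [0, 1]
    return ml_weight * clipped + (1.0 - ml_weight) * rule_score(student)


if __name__ == "__main__":
    struggling = StudentRecord(attendance_rate=0.5, average_grade=40,
                               missed_assignments=4)
    print(hybrid_risk(0.8, struggling))
```

Keeping the two scores separate until the final blend makes it easy to report which part, the model or the rules, drove a given risk value.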
- Clone the repository:
  `git clone <repo-url>`
  `cd anomalydetectionmodel`
- Install dependencies:
  `pip install -r requirements.txt`
- Set up Weights & Biases:
  - Create an account at wandb.ai
  - Log in locally:
    `wandb login`
To train the model and log experiments to W&B:
`python train_model.py`
This will:
- Load data from `data/`
- Train the Isolation Forest model
- Log the F1 score and the contamination parameter to W&B
- Save the model to `models/` and upload it to W&B Artifacts
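The training steps could look roughly like the sketch below. It is not the contents of `train_model.py`: the synthetic data, the W&B project name, the artifact name, and the model file name are all placeholders chosen for illustration, and it assumes labeled at-risk examples are available for computing F1.

```python
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import f1_score


def train_and_evaluate(X, y_true, contamination=0.05, seed=42):
    """Fit an Isolation Forest and score it against known at-risk labels."""
    model = IsolationForest(contamination=contamination, random_state=seed)
    model.fit(X)
    # predict() returns -1 for anomalies and 1 for inliers; map to 1 = at risk.
    y_pred = (model.predict(X) == -1).astype(int)
    metrics = {"f1": float(f1_score(y_true, y_pred)),
               "contamination": contamination}
    return model, metrics


def save_and_log(model, metrics, model_path="models/isolation_forest.joblib"):
    """Save the model locally, then log metrics and the file to W&B."""
    import wandb  # imported here so training works without wandb installed

    path = Path(model_path)  # hypothetical file name under models/
    path.parent.mkdir(parents=True, exist_ok=True)
    joblib.dump(model, path)

    run = wandb.init(project="student-risk-anomaly")  # hypothetical project
    run.log(metrics)
    artifact = wandb.Artifact("isolation-forest", type="model")  # hypothetical
    artifact.add_file(str(path))
    run.log_artifact(artifact)
    run.finish()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for the files under data/: 95 typical rows, 5 unusual.
    X = np.vstack([rng.normal(0, 1, (95, 3)), rng.normal(8, 1, (5, 3))])
    y = np.array([0] * 95 + [1] * 5)
    model, metrics = train_and_evaluate(X, y)
    print(metrics)
```

Calling `save_and_log(model, metrics)` afterwards requires a prior `wandb login`. Each call to `log_artifact` with the same artifact name creates a new version in W&B, which is what provides the automatic versioning.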
Start the Flask API:
`python app.py`
Endpoints:
- `POST /predict`: Predict risk for a single student
- `POST /predict_batch`: Predict risk for a batch of students
- `GET /diagnose`: Run diagnostic tests
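A client call to the single-student endpoint might be sketched as follows, using only the standard library. The payload fields are hypothetical, since the request schema is not documented here, and the base URL assumes the Flask development server's default port (5000), which `app.py` may override. The response is assumed to be JSON.

```python
import json
import urllib.request


def build_predict_request(base_url: str, student: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(student).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, student: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the (assumed) JSON response."""
    request = build_predict_request(base_url, student)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical payload; replace with the fields the API actually expects.
    example = {"student_id": 1, "attendance_rate": 0.5, "average_grade": 40}
    request = build_predict_request("http://localhost:5000", example)
    print(request.get_method(), request.full_url)
```

With the API running, `predict("http://localhost:5000", example)` would send the request. A batch call would follow the same pattern against `/predict_batch`, presumably with a list of such records.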
The project uses GitHub Actions for CI/CD:
- Linting: `flake8` checks for code style issues.
- Testing: `pytest` runs unit tests to ensure model integrity.
- Trigger: Runs on every push to `main` and on pull requests.
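To give a sense of what a model integrity test can check, here is a minimal pytest-style sketch. These tests are illustrative and are not the repository's actual test suite; the synthetic data and the chosen checks are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def make_data(seed: int = 0) -> np.ndarray:
    """Synthetic data: 95 typical rows plus 5 clearly unusual ones."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0, 1, size=(95, 3))
    outliers = rng.normal(8, 1, size=(5, 3))
    return np.vstack([normal, outliers])


def test_predictions_use_expected_labels():
    # Isolation Forest should only ever emit -1 (anomaly) or 1 (inlier).
    X = make_data()
    model = IsolationForest(contamination=0.05, random_state=0).fit(X)
    assert set(np.unique(model.predict(X))) <= {-1, 1}


def test_training_is_reproducible():
    # With a fixed random_state, two fits on the same data should agree.
    X = make_data()
    first = IsolationForest(random_state=0).fit(X).score_samples(X)
    second = IsolationForest(random_state=0).fit(X).score_samples(X)
    assert np.allclose(first, second)
```

Saved in a file whose name starts with `test_`, these functions are collected automatically when `pytest` runs.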