Project 20 for the Distributed Network Programming course.
A high-performance, autoscaling microservice deployed on Kubernetes that accepts image uploads, performs classification using a pre-trained Deep Learning model, and exposes Prometheus metrics for monitoring.
- API Framework (FastAPI): Chosen for its high performance, native async support, and automatic OpenAPI documentation. Heavy ML inference is offloaded to a separate threadpool to prevent event loop blocking.
- Machine Learning (PyTorch & ResNet18): ResNet18 offers a good balance between inference speed and accuracy, which suits a responsive microservice.
- Package Management (uv): Used for extremely fast dependency resolution and strict lockfile generation.
- Automation (go-task): Used instead of complex Bash scripts. Features built-in K8s context detection (automatically loads local images into minikube or kind without needing a local registry).
- Autoscaling (HPA): Horizontal Pod Autoscaler monitors CPU utilization and dynamically scales the classification pods under load.
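The threadpool offloading mentioned for FastAPI can be illustrated with the standard library alone. This is a minimal sketch, not the project's actual handler: `fake_inference` is a hypothetical stand-in for the blocking ResNet18 forward pass, and the pool size of 4 is an arbitrary assumption.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference(image_bytes: bytes) -> str:
    """Hypothetical stand-in for a blocking model forward pass."""
    time.sleep(0.05)  # simulate work that would otherwise stall the event loop
    return f"class-{len(image_bytes) % 1000}"


# Dedicated pool so inference never runs on the event loop thread.
# The size (4) is an assumption for illustration only.
_executor = ThreadPoolExecutor(max_workers=4)


async def predict(image_bytes: bytes) -> str:
    loop = asyncio.get_running_loop()
    # Hand the blocking call to a worker thread and await its result.
    return await loop.run_in_executor(_executor, fake_inference, image_bytes)


async def main() -> list:
    # The three requests overlap instead of running one after another.
    return await asyncio.gather(*(predict(b"x" * n) for n in (1, 2, 3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a FastAPI application the same effect is usually obtained by declaring the endpoint with plain `def` (FastAPI then runs it in its own threadpool) or by awaiting `run_in_threadpool` from `starlette.concurrency`; which of these this project uses is not stated here.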
To ensure a smooth deployment, you must have the following installed on your host machine:
- Python 3.10+ (Required to parse deployment scripts and run Locust).
- A local Kubernetes Cluster (Docker Desktop, Minikube, or Kind).
⚠️ MINIKUBE USERS - CRITICAL STEP: The Autoscaler (HPA) requires metrics to function. You must enable the metrics server before deploying:
minikube addons enable metrics-server
3. Install go-task (The Task Runner)
This project uses task to automate everything. You must install it first:
# Windows (via built-in winget)
winget install Task.Task
# macOS (Homebrew)
brew install go-task/tap/go-task
# Linux (Installs globally to /usr/local/bin)
sudo sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin
(Note: The task CLI will automatically verify that you have uv, docker, kubectl, and helm installed when you run it.)
Deploy the entire architecture (App Build, K8s Deployment, Helm Monitoring Stack) with a single command:
task start
What this does automatically:
- Verifies system requirements.
- Calculates source code hash to generate a unique Docker image tag.
- Builds the image and intelligently loads it into your specific K8s environment.
- Installs the kube-prometheus-stack via Helm into the monitoring namespace.
- Applies Kubernetes manifests, waits for the rollout, and prints the Grafana admin password.
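The tagging and image-loading steps above can be sketched roughly as follows. This is an illustration under assumptions, not the logic in the project's Taskfile: the hash covers every file under a directory, the tag length of 12 is arbitrary, and the context names (`minikube`, `kind-<cluster>`) are the defaults those tools create. The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List, Optional


def source_tag(root: str, length: int = 12) -> str:
    """Derive a short image tag from the paths and contents of all files under root."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Include the relative path so renames also change the tag.
            digest.update(path.relative_to(root).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()[:length]


def load_command(context: str, image: str) -> Optional[List[str]]:
    """Pick the CLI call that copies a locally built image into the cluster."""
    if context == "minikube":
        return ["minikube", "image", "load", image]
    if context.startswith("kind-"):
        cluster = context[len("kind-"):]
        return ["kind", "load", "docker-image", image, "--name", cluster]
    # Assumption: other contexts (e.g. docker-desktop) can already see
    # images from the local Docker daemon, so nothing needs loading.
    return None


if __name__ == "__main__":
    tag = source_tag(".")
    print(load_command("minikube", f"classifier:{tag}"))
```

Because the tag changes whenever any source file changes, a rebuild always produces a new image reference, which forces Kubernetes to roll out the new version rather than reuse a cached image. The current context name can be read with `kubectl config current-context`.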
Once deployed, the FastAPI Swagger UI is available at: 👉 http://localhost/docs
⚠️ MINIKUBE USERS: K8s LoadBalancers are not exposed on localhost automatically in Minikube. You must run the following in a separate terminal and keep it open:
minikube tunnel
To validate throughput, latency, and HPA behavior, use the built-in test suite. This will set up the necessary port-forwards to Grafana and start the Locust load testing tool.
task test
(You can override the default load parameters: task test USERS=50 RATE=5)
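The load test sends random image data to the service. As a rough illustration of what such a payload could look like, the sketch below builds a valid PNG of random RGB noise using only the standard library. The 224x224 size (the usual ResNet input size) and the choice of PNG are assumptions; the project's Locust file may generate its images differently.

```python
import os
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def random_png(width: int = 224, height: int = 224) -> bytes:
    """Return the bytes of an 8-bit RGB PNG filled with random pixels."""
    # IHDR: width, height, bit depth 8, color type 2 (RGB),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline starts with filter byte 0, followed by width * 3 bytes.
    raw = b"".join(b"\x00" + os.urandom(width * 3) for _ in range(height))
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    payload = random_png()
    print(len(payload), "bytes")
```

Random noise compresses poorly, so each payload is close to the raw pixel size (about 150 KB at 224x224), which makes the upload path part of what the test exercises.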
- Start the Load (Locust): Open http://localhost:8089. The test will automatically start hitting the /predict endpoint with random image data. Observe RPS and latency here.
- Watch the Autoscaler (HPA): Open a new terminal and watch Kubernetes spawn new pods as the CPU load increases. Each watch command blocks until interrupted, so run them in separate terminals:
  kubectl get hpa -w
  kubectl get pods -w
- View Prometheus Metrics (Grafana): Open http://localhost:3000. Log in with username admin (use the password printed at the end of task start).
  - Navigate to: Dashboards -> Kubernetes / Compute Resources / Namespace (Workloads).
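When reading the HPA output, it helps to know how the replica count is derived. The sketch below follows the scaling rule described in the Kubernetes documentation (desired = ceil(current × observed / target), skipped when the ratio is within a tolerance of 1.0). The target of 50% CPU and the 1 to 10 replica bounds are assumptions for illustration; the values in this project's HPA manifest are not shown here.

```python
import math


def desired_replicas(
    current: int,
    observed_util: float,
    target_util: float = 50.0,  # assumed target CPU utilization (%)
    min_replicas: int = 1,      # assumed lower bound
    max_replicas: int = 10,     # assumed upper bound
    tolerance: float = 0.1,     # Kubernetes default tolerance
) -> int:
    """Approximate the replica count the HPA would ask for."""
    ratio = observed_util / target_util
    # Small deviations from the target do not trigger scaling.
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Two pods at 90% average CPU against a 50% target.
    print(desired_replicas(2, 90.0))
```

The real controller also applies stabilization windows and ignores pods that are not yet ready, so the observed behavior during a test can lag behind this simple calculation.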
You can run and test the application locally without full cluster deployment:
task run # Starts the FastAPI server locally on port 4123
task lint # Runs Ruff to format and lint code
Manage your cluster state and free up resources with these commands:
# Remove Application only (keeps monitoring stack and Grafana data)
task down
# NUCLEAR Clean: Delete everything (App, Monitoring Stack, Namespace)
task teardown
# Clean local caches (Python __pycache__, Ruff, UV, Docker dangling images)
task clean