This workshop now uses:
- Zod for validated runtime configuration and payloads
- Winston for server-side logging
- Axios for HTTP calls (backend + frontend API client)
- fs-extra + csv-parse streaming to process dataset files efficiently
- SuperJSON for typed serialization of preprocess metadata
- Rolldown for Node compilation
- Vite + React 18 + TanStack Query for the UI
- Tailwind CSS v4 + DaisyUI v5 + PostCSS for styling
- Biome v2 for formatter-only workflow
- Optional: `@tensorflow/tfjs-node` backend for faster training/inference in Node
- Compose includes under `deploy/compose/*`
- Multi-stage Docker targets under `deploy/docker/*`
- MinIO (S3-compatible) + init service for artifact storage bootstrap

Project layout:

- `compose.yml`: root compose using includes
- `deploy/compose/app.yml`: web + mlflow services
- `deploy/compose/ops.yml`: dataset + trainer + node-client + image-dataset + image-trainer
- `deploy/compose/minio/*`: minio service, init script, policy
- `deploy/docker/Dockerfile`: multi-stage targets (server, trainer, image-trainer, image-dataset, predict)
- `src/*`: backend, training, model IO, inference, config
- `web/*`: React app
This is a step-by-step recipe.
Think of it like building a toy castle: first floor, then walls, then roof 🏰
- Docker + Docker Compose
- Node.js 20+
- `just` command runner
Check quickly:
```
docker --version
docker compose version
node --version
just --version
```

If one command fails, install that tool first.
```
cp .env.example .env
```

Why: `.env` contains ports and credentials used by MLflow + MinIO + app services.
```
just init
```

Why: this creates `shared/data`, `shared/model`, `shared/image-data`, and `shared/image-model`.
```
just build
```

Why: builds all service images once so later commands are faster.
```
just workshop
```

What happens inside:
- download Iris CSV
- train numeric model (4 features → 3 species)
- start web server
- MLflow: http://localhost:5000 (or your `MLFLOW_PORT`)
- Web UI: http://localhost:3000 (or your `WEB_PORT`)
- MinIO Console: http://localhost:9001 (or your `MINIO_CONSOLE_PORT`)
```
just workshop-image
```

What this does:
- downloads public flower photos dataset
- trains image model (5 classes)
- starts web server with image endpoint ready
- Numeric tab: enter sepal/petal numbers, click Predict
- Image tab: upload flower photo, click Predict Image
If the image model is not ready, the UI badge tells you.
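The numeric prediction can also be scripted outside the UI. This is a hedged sketch: the README names only the image endpoint, so the path `/api/predict` and the JSON shape below are assumptions; only "4 features in, 3 species out" comes from the text. It uses Node 20's built-in `fetch` (the app's own client uses Axios).

```typescript
// Hypothetical client sketch — endpoint path and payload shape are assumed.
export interface IrisFeatures {
  sepalLength: number;
  sepalWidth: number;
  petalLength: number;
  petalWidth: number;
}

// The numeric model takes the 4 features in a fixed order (4 features → 3 species).
export function toFeatureVector(f: IrisFeatures): number[] {
  return [f.sepalLength, f.sepalWidth, f.petalLength, f.petalWidth];
}

export async function predictIris(f: IrisFeatures, baseUrl = "http://localhost:3000") {
  const res = await fetch(`${baseUrl}/api/predict`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ features: toFeatureVector(f) }),
  });
  if (!res.ok) throw new Error(`predict failed: ${res.status}`);
  return res.json(); // predicted species + class scores, per the server's contract
}
```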
```
just up            # start app stack
just dataset       # fetch iris CSV only
just train         # train numeric model only
just image-dataset # fetch flower photos only
just image-train   # train image model only
just logs trainer  # watch trainer logs
just down          # stop everything
```

- Rebuild: `just build`
- Restart web: `just web-up`
- Clear image artifacts + dataset: `just clean-image`
- Clear numeric model artifacts: `just clean-model`
And yes, if your laptop fan sounds like a tiny helicopter 🚁, that means it is doing science. No panic. It is not trying to fly away (probably).
- `trainer` saves model files to `shared/model/` and uploads copies to MinIO under `runs/<runId>/model/`.
- MLflow artifacts are configured to use `s3://$S3_BUCKET` via the MinIO endpoint.
- The image upload prediction endpoint is `POST /api/predict-image` (multipart form field: `image`).
- Image model artifacts are saved in `shared/image-model/` (`model.json`, `weights.bin`, `image-meta.json`).
- Use `train_loss` + `val_loss` (or `train_accuracy` + `val_accuracy`) for learning curves.
- For denser curves, use `train_loss_batch` + `train_accuracy_batch` (logged every 10 batches).
- Use `final_validation_loss` + `final_validation_accuracy` as one-point summary metrics after training.
- If a chart looks like a straight line, check the run params first: `epochs=1` means only one epoch point.
- On very slow machines, training may look "stuck" between epoch logs; the native `tfjs-node` backend is auto-enabled when available.
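The "logged every 10 batches" cadence comes from a training callback. Here is a framework-free sketch of that logic, shaped like the callback objects tf.js's `fit()` accepts; the `record` sink and the exact metric names are placeholders for whatever the real trainer reports to MLflow.

```typescript
// Sketch of the per-batch logging cadence described above. The recorder is
// hypothetical; the real trainer sends metrics to MLflow.
type MetricRow = { step: number; name: string; value: number };

export function makeBatchLogger(record: (row: MetricRow) => void, every = 10) {
  return {
    async onBatchEnd(batch: number, logs?: { loss?: number; acc?: number }) {
      if (batch % every !== 0 || !logs) return; // keep only every Nth batch
      if (logs.loss !== undefined)
        record({ step: batch, name: "train_loss_batch", value: logs.loss });
      if (logs.acc !== undefined)
        record({ step: batch, name: "train_accuracy_batch", value: logs.acc });
    },
  };
}
```

Passed via the `callbacks` option of `model.fit(...)`, an object like this is what produces the denser `*_batch` curves without logging every single batch.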
Default image training params are tuned for workshop speed:

- `IMAGE_SIZE=64`
- `IMAGE_MAX_SAMPLES_PER_CLASS=80`
- `IMAGE_EPOCHS=8`
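These three knobs are read from the environment with the defaults above as fallbacks. This is a hedged sketch of that parsing (the project's actual loader, presumably Zod-based, may differ):

```typescript
// Sketch: read the workshop-speed defaults from the environment.
// Defaults match the values listed above; the helper itself is illustrative.
export interface ImageTrainParams {
  imageSize: number;
  maxSamplesPerClass: number;
  epochs: number;
}

export function imageParams(
  env: Record<string, string | undefined> = process.env
): ImageTrainParams {
  const num = (key: string, fallback: number) => {
    const n = Number(env[key]);
    // Reject NaN, zero, and negatives so garbage input falls back safely.
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    imageSize: num("IMAGE_SIZE", 64),
    maxSamplesPerClass: num("IMAGE_MAX_SAMPLES_PER_CLASS", 80),
    epochs: num("IMAGE_EPOCHS", 8),
  };
}
```

So a longer run would be something like `IMAGE_EPOCHS=20 just image-train` (assuming the `just` recipe forwards environment variables to the container).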
- TensorFlow `tf_flowers` dataset (3,670 training images, 5 classes): `dandelion`, `daisy`, `tulips`, `sunflowers`, `roses`
- Download URL used by this project: https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz
- Reference: