feat: fine-tuning wizard — 5-step end-to-end flow by SahilKumar75 · Pull Request #33 · SahilKumar75/TuneOS

SahilKumar75 · 2026-05-31T07:48:34Z

Summary

This PR implements the core TuneOS fine-tuning experience as a dedicated /finetune page — a guided 5-step wizard that takes a non-technical user from picking a model all the way to a testable, downloadable, shareable fine-tuned adapter.

What's included

Trainer backend

trainer/evaluate.py: Real perplexity calculation (replaces the None stub). It runs on a 20% held-out sample of the training data, so no extra data is required from the user.
trainer/finetune.py — Returns (output_path, model, tokenizer) so the worker can pass directly to eval without reloading.
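As a rough illustration of the evaluation described above, the sketch below holds out 20% of the rows and turns per-batch losses into a perplexity value. It is not the code in trainer/evaluate.py: `split_held_out` and `perplexity` are hypothetical names, and the input is assumed to be pairs of (mean negative log-likelihood per token, token count) already produced by a model.

```python
import math
import random


def split_held_out(rows, eval_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train_rows, eval_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one row for evaluation, even on tiny datasets.
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def perplexity(batch_losses):
    """Compute perplexity from (mean_nll_per_token, token_count) pairs.

    Each batch loss is weighted by its token count so that short and long
    batches contribute in proportion to the tokens they contain.
    """
    total_nll = 0.0
    total_tokens = 0
    for mean_nll, n_tokens in batch_losses:
        total_nll += mean_nll * n_tokens
        total_tokens += n_tokens
    if total_tokens == 0:
        raise ValueError("no tokens to evaluate")
    return math.exp(total_nll / total_tokens)
```

Weighting by token count matters when batches differ in length; a plain average of batch losses would over-weight short batches.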

Worker

workers/train_task.py — Runs eval automatically after training, publishes result to Redis job:{id}:eval. Eval failure is caught and does not fail the job.
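The "eval failure does not fail the job" behaviour can be sketched as below. This is an assumption-laden illustration, not the code in workers/train_task.py: `InMemoryStore` stands in for a Redis client, `run_eval_step` is a hypothetical helper, and the `"failed"` status value is invented for the example (the PR only shows `"done"`).

```python
import json


class InMemoryStore:
    """Stand-in for a Redis client; only set/get are modelled."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def run_eval_step(store, job_id, evaluate):
    """Run evaluate() and publish the outcome under job:{id}:eval.

    Any exception from evaluate() is recorded instead of re-raised, so a
    broken evaluation cannot mark an otherwise successful job as failed.
    """
    key = f"job:{job_id}:eval"
    try:
        result = {"status": "done", "perplexity": evaluate()}
    except Exception as exc:  # evaluation is best-effort by design
        # "failed" is a hypothetical status name used only in this sketch.
        result = {"status": "failed", "error": str(exc)}
    store.set(key, json.dumps(result))
    return result
```

Storing the result as JSON under a per-job key lets an API handler read it later without loading the model again.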

Four new REST endpoints (app/api.py)

| Endpoint | What it does |
| --- | --- |
| `GET /api/jobs/{id}/download` | Streams adapter weights as a `.zip` |
| `POST /api/jobs/{id}/push_hub` | Pushes adapter to HF Hub (private repo) |
| `GET /api/jobs/{id}/eval` | Reads perplexity from Redis |
| `POST /api/jobs/{id}/infer` | Local inference with lazy in-process model cache |
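For the download endpoint, one plausible way to package an adapter directory is sketched below. It builds the archive in memory with the standard library rather than streaming it chunk by chunk, and `zip_adapter_dir` is a hypothetical helper, not a function from app/api.py.

```python
import io
import zipfile
from pathlib import Path


def zip_adapter_dir(adapter_dir):
    """Return the bytes of a .zip holding every file under adapter_dir."""
    root = Path(adapter_dir)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the adapter root so the archive
                # unpacks cleanly regardless of where the job lived on disk.
                archive.write(path, arcname=path.relative_to(root).as_posix())
    return buffer.getvalue()
```

In a FastAPI handler the resulting bytes could be returned with an `application/zip` media type; for very large adapters, writing to a temporary file and streaming from it would avoid holding the whole archive in memory.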

Wizard UI (app/pages/finetune.py + app/state/finetune_state.py)

| Step | What the user sees |
| --- | --- |
| 1: Model | 4 model cards + technique selector. QLoRA/LoRA active. Full fine-tune/DPO show "Coming soon" stub. |
| 2: Dataset | Upload dropzone + reuse existing datasets + 5-row preview with column validation |
| 3: Configure | LoRA rank/alpha sliders + training params grid. Beginner-friendly labels. Advanced settings in collapsed accordion. |
| 4: Training | Live loss chart + log stream + stop button. Auto-advances to results on completion. |
| 5: Results | 2×2 grid: download adapter, push to HF Hub, perplexity eval, inline test chat |
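The column validation and 5-row preview in step 2 might look roughly like the sketch below. It assumes a `.jsonl` upload whose rows must carry `instruction` and `output` keys, as the test plan and walkthrough state; `validate_jsonl` and its error messages are hypothetical and not taken from app/state/finetune_state.py.

```python
import json

REQUIRED_COLUMNS = ("instruction", "output")


def validate_jsonl(text, preview_rows=5):
    """Return (preview, errors) for the contents of a .jsonl upload."""
    preview, errors = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {line_no}: not valid JSON")
            continue
        if not isinstance(row, dict):
            errors.append(f"line {line_no}: expected a JSON object")
            continue
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            errors.append(f"line {line_no}: missing {', '.join(missing)}")
        elif len(preview) < preview_rows:
            preview.append(row)
    return preview, errors
```

Returning the errors as a list rather than raising on the first one lets the UI show every problem in a single callout.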

Navigation

/finetune route registered in app/app.py
"Fine-tune" nav item wired in both expanded and collapsed sidebar states

Design decisions

New FinetuneState — isolated from existing ModelState to avoid breaking the /configure → /training flow
Perplexity only, no BLEU — instruction-following datasets have no reference outputs; a BLEU score would be misleading
Inference runs in FastAPI process — request/response pattern fits better than Celery for interactive chat; lazy cache evicts previous model on new job
Coming-soon stubs — Full fine-tune and DPO show a locked button + tooltip; no training logic added for them
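The "lazy cache evicts previous model on new job" decision can be pictured with the sketch below: nothing is loaded until the first request, and at most one job's model is held at a time. `SingleModelCache` and its `loader` callback are hypothetical; the real cache in app/api.py may differ, for example in how it frees GPU memory.

```python
class SingleModelCache:
    """Hold at most one loaded model, keyed by job id."""

    def __init__(self, loader):
        self._loader = loader  # callable: job_id -> loaded model
        self._job_id = None
        self._model = None
        self.load_count = 0  # exposed only to make the behaviour observable

    def get(self, job_id):
        if job_id != self._job_id or self._model is None:
            # Drop the previous model before loading the next one so two
            # models are never resident at once; if loading fails, the
            # cache is left empty rather than pointing at a stale job.
            self._job_id, self._model = None, None
            self._model = self._loader(job_id)
            self._job_id = job_id
            self.load_count += 1
        return self._model
```

Repeated chat turns against the same job reuse the loaded model; switching jobs pays one reload, which keeps memory use bounded to a single adapter.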

Closes

Closes #8: Implement LoRA fine-tuning configuration workspace
Closes #9: Add QLoRA and advanced PEFT presets
Closes #18: evaluate_model() returns placeholder None values

Test plan

Open /finetune, verify 5 step dots render and progress bar advances
Step 1: click each model card, verify selection highlight; click "Coming soon" buttons, verify they do nothing
Step 2: upload a .jsonl with instruction/output columns, verify preview table; upload a bad file, verify error callout
Step 3: adjust sliders, verify value readout updates live
Step 4: start a training job (requires Redis + Celery worker), verify loss chart updates and log stream scrolls
Step 5: verify download link, HF Hub push with a valid token, perplexity card populates, test chat returns a response
Sidebar: click "Fine-tune" from home, confirm redirect to /finetune
curl the GET /api/jobs/{id}/download endpoint, unzip the result, and confirm adapter_config.json is present
curl the GET /api/jobs/{id}/eval endpoint and confirm the response is {"status":"done","perplexity":...}

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features

Multi-step Fine-tuning Wizard – Guided workflow for model selection, dataset upload with validation, training configuration, and progress monitoring.
Post-Training Capabilities – Download adapter weights, push trained models to Hugging Face Hub, and view evaluation metrics (perplexity).
Model Testing Interface – Test fine-tuned adapters via chat-based inference directly in the application.
Enhanced Navigation – Updated sidebar now links to the new fine-tuning page.

Implements the core TuneOS fine-tuning experience as a dedicated /finetune wizard: model selection, dataset upload+validation, hyperparameter config, live training progress, and a results page with download, HF Hub push, perplexity eval, and inline model chat.

Closes #8: LoRA fine-tuning configuration workspace
Closes #9: QLoRA and advanced PEFT presets
Closes #18: evaluate_model() returns placeholder None values

Trainer:
- trainer/evaluate.py: implement perplexity on 20% held-out sample
- trainer/finetune.py: return (output_path, model, tokenizer) for eval

Worker:
- workers/train_task.py: run eval post-training, publish to Redis job:{id}:eval so results are available without re-loading the model

API (app/api.py):
- GET /api/jobs/{id}/download: stream adapter weights as zip
- POST /api/jobs/{id}/push_hub: push adapter to HF Hub
- GET /api/jobs/{id}/eval: read perplexity from Redis
- POST /api/jobs/{id}/infer: local inference with lazy model cache

State (app/state/finetune_state.py):
- New FinetuneState owning all wizard fields, events, computed vars
- Isolated from existing ModelState to avoid breaking /configure flow

UI (app/pages/finetune.py):
- Step 1: model cards + technique selector (QLoRA/LoRA active, Full fine-tune/DPO as "Coming soon" stubs)
- Step 2: upload dropzone + reuse existing datasets + preview table with column validation
- Step 3: LoRA sliders + training params grid with beginner tooltips
- Step 4: live loss chart (reuses loss_chart component) + log stream + stop button + auto-advance on completion
- Step 5: 2×2 results grid (download / push / eval / test chat)

Navigation:
- app/app.py: /finetune route registered
- sidebar.py: "Fine-tune" nav item wired in expanded + collapsed states

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-31T07:48:45Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

This PR delivers a complete LoRA fine-tuning workspace. It implements real model evaluation, adds post-training REST endpoints for adapter download and Hugging Face Hub integration, introduces a five-step UI wizard for job configuration and monitoring, and provides state management for the entire workflow.

Changes

Fine-tuning Workflow Implementation

| Layer / File(s) | Summary |
| --- | --- |
| Model Evaluation and Training Integration: `trainer/evaluate.py`, `trainer/finetune.py`, `workers/train_task.py` | `evaluate_model()` now computes perplexity from test loss and token counts instead of returning placeholder `None` values. `finetune()` returns `(output_path, model, tokenizer)` tuple. Training task captures the model/tokenizer and evaluates the fine-tuned adapter, storing perplexity results under `job:{job_id}:eval` in Redis. |
| Post-training REST API: `app/api.py` | Adds four endpoints under `/api/jobs/{job_id}`: `GET /download` streams adapter weights as ZIP, `POST /push_hub` uploads adapter to Hugging Face Hub with user-provided token, `GET /eval` retrieves stored evaluation metrics, and `POST /infer` loads the adapter and runs inference with optional caching. New request schemas `PushHubRequest` and `InferRequest` validate inputs. |
| Fine-tuning State Management: `app/state/finetune_state.py` | `FinetuneState` manages step progression, model/technique selection, dataset upload with validation (required `instruction` and `output` columns), LoRA hyperparameters (rank, alpha, epochs, learning rate, batch size), and post-training actions. Provides dataset loading, file upload/validation, training orchestration (enqueueing to Celery), and background polling for Hub push, evaluation completion, and inference responses. |
| Multi-step Fine-tuning Wizard UI: `app/pages/finetune.py` | Implements a five-step wizard: (1) model and training technique selection with disabled "coming soon" options, (2) dataset upload dropzone and optional reuse selector with preview table and validation feedback, (3) LoRA configuration sliders and parameter inputs, (4) live training progress with loss chart and log stream, and (5) results view with download adapter, Hub push status/feedback, evaluation metrics display, and test-chat inference. Progress bar and step routing driven by state. |
| Page Registration and Navigation: `app/app.py`, `app/components/sidebar.py` | Registers `/finetune` page route with "Fine-tune — TuneOS" title. Sidebar "Fine-tune" link (both expanded and collapsed modes) now navigates to the new workspace. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through five bright steps,
First models picked, then data kept,
LoRA knobs to turn and twist,
Training watched—no task is missed,
Results bloom: download, push, and chat! 🎓✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.07%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: fine-tuning wizard — 5-step end-to-end flow' clearly describes the main change: a new 5-step wizard interface for fine-tuning. |
| Description check | ✅ Passed | The PR description covers all key sections: Summary, What's included (trainer, worker, endpoints, UI), Design decisions, linked issues, and test plan. It matches the repository template structure well. |
| Linked Issues check | ✅ Passed | All code changes meet the requirements from linked issues `#8`, `#9`, and `#18`: LoRA configuration controls with sliders [`#8`], QLoRA technique selection with safe defaults [`#9`], and real perplexity evaluation replacing None placeholder [`#18`]. |
| Out of Scope Changes check | ✅ Passed | All changes align with PR objectives: trainer evaluation, REST endpoints for adapter management, and the 5-step wizard UI. No unrelated modifications detected. |


This was referenced May 31, 2026

Implement LoRA fine-tuning configuration workspace #8

Closed

Add QLoRA and advanced PEFT presets #9

Closed

evaluate_model() returns placeholder None values #18

Closed

feat: wire MPS (Apple Silicon) device into fine-tuning wizard #34

Open

SahilKumar75 merged commit 3c97a64 into main May 31, 2026
1 of 5 checks passed

SahilKumar75 deleted the feat/finetune-wizard branch May 31, 2026 07:57

coderabbitai Bot mentioned this pull request Jun 1, 2026

feat: 7-step wizard, experiment tracking, deploy tab, API package refactor #44

Merged

7 tasks

SahilKumar75 merged 1 commit into main from feat/finetune-wizard
