feat: fine-tuning wizard — 5-step end-to-end flow#33

Merged
SahilKumar75 merged 1 commit into main from feat/finetune-wizard
May 31, 2026

@SahilKumar75 (Owner) commented May 31, 2026

Summary

This PR implements the core TuneOS fine-tuning experience as a dedicated /finetune page — a guided 5-step wizard that takes a non-technical user from picking a model all the way to a testable, downloadable, shareable fine-tuned adapter.

What's included

Trainer backend

  • trainer/evaluate.py — Real perplexity calculation (replaces None stub). Runs on a 20% held-out sample of the training data, no extra data required from the user.
  • trainer/finetune.py — Returns (output_path, model, tokenizer) so the worker can pass directly to eval without reloading.
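
A minimal sketch of how a perplexity figure can be derived from per-batch losses and token counts on a 20% held-out slice. The function names `perplexity_from_batches` and `split_holdout` are hypothetical, and taking the tail of the dataset as the holdout is an assumption; the actual trainer/evaluate.py may sample differently.

```python
import math


def perplexity_from_batches(batches):
    """Token-weighted perplexity from (mean_loss, n_tokens) pairs.

    Each pair holds one batch's mean cross-entropy loss (in nats) and the
    number of tokens that loss was averaged over.
    """
    total_loss = 0.0
    total_tokens = 0
    for mean_loss, n_tokens in batches:
        total_loss += mean_loss * n_tokens  # undo the per-batch averaging
        total_tokens += n_tokens
    if total_tokens == 0:
        return None  # nothing to evaluate
    return math.exp(total_loss / total_tokens)


def split_holdout(rows, holdout_fraction=0.2):
    """Return (train_rows, holdout_rows); the holdout is the tail of the list.

    Assumption: a simple tail split, used here only for illustration.
    """
    n_holdout = max(1, int(len(rows) * holdout_fraction)) if rows else 0
    cut = len(rows) - n_holdout
    return rows[:cut], rows[cut:]
```

Weighting each batch loss by its token count keeps short batches from skewing the average before the exponential is taken.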

Worker

  • workers/train_task.py — Runs eval automatically after training, publishes result to Redis job:{id}:eval. Eval failure is caught and does not fail the job.
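
The "eval failure does not fail the job" behaviour could look roughly like the sketch below. It is illustrative only: `run_eval_and_publish` and `DictStore` are hypothetical names, the store stands in for a Redis client with a `set(key, value)` method, and the `"error"` payload shape is an assumption (only `status` and `perplexity` appear in this PR).

```python
import json


def run_eval_and_publish(job_id, evaluate, store):
    """Run `evaluate()` and write the outcome under job:{job_id}:eval.

    Errors raised by `evaluate` are recorded instead of re-raised, so a
    failed eval never fails the surrounding training job.
    """
    key = f"job:{job_id}:eval"
    try:
        result = {"status": "done", "perplexity": evaluate()}
    except Exception as exc:  # eval is best-effort
        # Assumed payload shape for the failure case.
        result = {"status": "error", "detail": str(exc)}
    store.set(key, json.dumps(result))
    return result


class DictStore:
    """In-memory stand-in for a Redis client, for illustration only."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
```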

Four new REST endpoints (app/api.py)

| Endpoint | What it does |
|---|---|
| GET /api/jobs/{id}/download | Streams adapter weights as a .zip |
| POST /api/jobs/{id}/push_hub | Pushes adapter to HF Hub (private repo) |
| GET /api/jobs/{id}/eval | Reads perplexity from Redis |
| POST /api/jobs/{id}/infer | Local inference with lazy in-process model cache |
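
As a rough illustration of the download endpoint's packaging step, the sketch below zips an adapter directory with the standard library. `zip_adapter_dir` is a hypothetical helper, and it builds the archive in memory for simplicity, whereas the endpoint in this PR is described as streaming.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_adapter_dir(adapter_dir):
    """Return the bytes of a .zip holding every file under `adapter_dir`."""
    root = Path(adapter_dir)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to the adapter root.
                archive.write(path, arcname=path.relative_to(root).as_posix())
    return buffer.getvalue()


# Usage: zip a throwaway directory and list the archive contents.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "adapter_config.json").write_text("{}")
    archive_bytes = zip_adapter_dir(tmp)
names = zipfile.ZipFile(io.BytesIO(archive_bytes)).namelist()
print(names)
```

Storing relative paths means the archive unpacks with adapter_config.json at its top level, which is what the test plan checks for.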

Wizard UI (app/pages/finetune.py + app/state/finetune_state.py)

| Step | What the user sees |
|---|---|
| 1: Model | 4 model cards + technique selector. QLoRA/LoRA active. Full fine-tune/DPO show "Coming soon" stub. |
| 2: Dataset | Upload dropzone + reuse existing datasets + 5-row preview with column validation |
| 3: Configure | LoRA rank/alpha sliders + training params grid. Beginner-friendly labels. Advanced settings in collapsed accordion. |
| 4: Training | Live loss chart + log stream + stop button. Auto-advances to results on completion. |
| 5: Results | 2×2 grid: download adapter, push to HF Hub, perplexity eval, inline test chat |
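
The step 2 column validation and 5-row preview might be sketched as follows. This is an assumption-laden illustration: `validate_jsonl` and the error message wording are hypothetical, and only the required `instruction` and `output` columns come from this PR.

```python
import json

REQUIRED_COLUMNS = ("instruction", "output")


def validate_jsonl(text, preview_rows=5):
    """Return (preview, errors) for a JSONL document given as a string."""
    preview, errors = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            errors.append(f"line {line_no}: not valid JSON")
            continue
        if not isinstance(row, dict):
            errors.append(f"line {line_no}: expected a JSON object")
            continue
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            errors.append(f"line {line_no}: missing {', '.join(missing)}")
        elif len(preview) < preview_rows:
            preview.append(row)  # only valid rows reach the preview
    return preview, errors
```

Collecting every error instead of stopping at the first lets an error callout list all the problem lines in one pass.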

Navigation

  • /finetune route registered in app/app.py
  • "Fine-tune" nav item wired in both expanded and collapsed sidebar states

Design decisions

  • New FinetuneState — isolated from existing ModelState to avoid breaking the /configure/training flow
  • Perplexity only, no BLEU — instruction-following datasets have no reference outputs; a BLEU score would be misleading
  • Inference runs in FastAPI process — request/response pattern fits better than Celery for interactive chat; lazy cache evicts previous model on new job
  • Coming-soon stubs — Full fine-tune and DPO show a locked button + tooltip; no training logic added for them
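
The "lazy cache evicts previous model on new job" decision can be pictured as a single-slot cache. The sketch below is hypothetical (`SingleSlotModelCache` and its `loader` callable are illustrative names, not taken from app/api.py) and leaves out locking and device memory cleanup.

```python
class SingleSlotModelCache:
    """Keep at most one loaded model in memory, keyed by job id."""

    def __init__(self, loader):
        self._loader = loader  # callable: job_id -> loaded model
        self._job_id = None
        self._model = None

    def get(self, job_id):
        if job_id != self._job_id:
            # Drop the previous model first so two models are never held
            # at once, then load the requested one on demand.
            self._model = None
            self._job_id = None
            self._model = self._loader(job_id)
            self._job_id = job_id
        return self._model


# Usage: repeated requests for one job load once; a new job evicts the old.
loads = []
cache = SingleSlotModelCache(lambda job_id: loads.append(job_id) or f"model-{job_id}")
cache.get("a")
cache.get("a")
cache.get("b")
print(loads)
```

A single slot trades reload time for a bounded memory footprint, which suits an interactive test chat that targets one job at a time.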

Closes

Test plan

  • Open /finetune, verify 5 step dots render and progress bar advances
  • Step 1: click each model card, verify selection highlight; click "Coming soon" buttons, verify they do nothing
  • Step 2: upload a .jsonl with instruction/output columns, verify preview table; upload a bad file, verify error callout
  • Step 3: adjust sliders, verify value readout updates live
  • Step 4: start a training job (requires Redis + Celery worker), verify loss chart updates and log stream scrolls
  • Step 5: verify download link, HF Hub push with a valid token, perplexity card populates, test chat returns a response
  • Sidebar: click "Fine-tune" from home, confirm redirect to /finetune
  • curl GET /api/jobs/{id}/download — unzip, confirm adapter_config.json present
  • curl GET /api/jobs/{id}/eval — confirm {"status":"done","perplexity":...}

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features

  • Multi-step Fine-tuning Wizard – Guided workflow for model selection, dataset upload with validation, training configuration, and progress monitoring.
  • Post-Training Capabilities – Download adapter weights, push trained models to Hugging Face Hub, and view evaluation metrics (perplexity).
  • Model Testing Interface – Test fine-tuned adapters via chat-based inference directly in the application.
  • Enhanced Navigation – Updated sidebar now links to the new fine-tuning page.

Implements the core TuneOS fine-tuning experience as a dedicated
/finetune wizard: model selection, dataset upload+validation,
hyperparameter config, live training progress, and a results page
with download, HF Hub push, perplexity eval, and inline model chat.

Closes #8   — LoRA fine-tuning configuration workspace
Closes #9   — QLoRA and advanced PEFT presets
Closes #18  — evaluate_model() returns placeholder None values

Trainer:
- trainer/evaluate.py: implement perplexity on 20% held-out sample
- trainer/finetune.py: return (output_path, model, tokenizer) for eval

Worker:
- workers/train_task.py: run eval post-training, publish to Redis
  job:{id}:eval so results are available without re-loading the model

API (app/api.py):
- GET  /api/jobs/{id}/download  — stream adapter weights as zip
- POST /api/jobs/{id}/push_hub  — push adapter to HF Hub
- GET  /api/jobs/{id}/eval      — read perplexity from Redis
- POST /api/jobs/{id}/infer     — local inference with lazy model cache

State (app/state/finetune_state.py):
- New FinetuneState owning all wizard fields, events, computed vars
- Isolated from existing ModelState to avoid breaking /configure flow

UI (app/pages/finetune.py):
- Step 1: model cards + technique selector (QLoRA/LoRA active,
  Full fine-tune/DPO as "Coming soon" stubs)
- Step 2: upload dropzone + reuse existing datasets + preview table
  with column validation
- Step 3: LoRA sliders + training params grid with beginner tooltips
- Step 4: live loss chart (reuses loss_chart component) + log stream
  + stop button + auto-advance on completion
- Step 5: 2×2 results grid (download / push / eval / test chat)

Navigation:
- app/app.py: /finetune route registered
- sidebar.py: "Fine-tune" nav item wired in expanded + collapsed states

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
coderabbitai Bot commented May 31, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR delivers a complete LoRA fine-tuning workspace. It implements real model evaluation, adds post-training REST endpoints for adapter download and Hugging Face Hub integration, introduces a five-step UI wizard for job configuration and monitoring, and provides state management for the entire workflow.

Changes

Fine-tuning Workflow Implementation

| Layer | File(s) | Summary |
|---|---|---|
| Model Evaluation and Training Integration | trainer/evaluate.py, trainer/finetune.py, workers/train_task.py | evaluate_model() now computes perplexity from test loss and token counts instead of returning placeholder None values. finetune() returns (output_path, model, tokenizer) tuple. Training task captures the model/tokenizer and evaluates the fine-tuned adapter, storing perplexity results under job:{job_id}:eval in Redis. |
| Post-training REST API | app/api.py | Adds four endpoints under /api/jobs/{job_id}: GET /download streams adapter weights as ZIP, POST /push_hub uploads adapter to Hugging Face Hub with user-provided token, GET /eval retrieves stored evaluation metrics, and POST /infer loads the adapter and runs inference with optional caching. New request schemas PushHubRequest and InferRequest validate inputs. |
| Fine-tuning State Management | app/state/finetune_state.py | FinetuneState manages step progression, model/technique selection, dataset upload with validation (required instruction and output columns), LoRA hyperparameters (rank, alpha, epochs, learning rate, batch size), and post-training actions. Provides dataset loading, file upload/validation, training orchestration (enqueueing to Celery), and background polling for Hub push, evaluation completion, and inference responses. |
| Multi-step Fine-tuning Wizard UI | app/pages/finetune.py | Implements a five-step wizard: (1) model and training technique selection with disabled "coming soon" options, (2) dataset upload dropzone and optional reuse selector with preview table and validation feedback, (3) LoRA configuration sliders and parameter inputs, (4) live training progress with loss chart and log stream, and (5) results view with download adapter, Hub push status/feedback, evaluation metrics display, and test-chat inference. Progress bar and step routing driven by state. |
| Page Registration and Navigation | app/app.py, app/components/sidebar.py | Registers /finetune page route with "Fine-tune — TuneOS" title. Sidebar "Fine-tune" link (both expanded and collapsed modes) now navigates to the new workspace. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through five bright steps,
First models picked, then data kept,
LoRA knobs to turn and twist,
Training watched—no task is missed,
Results bloom: download, push, and chat! 🎓✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.07% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: fine-tuning wizard — 5-step end-to-end flow' clearly describes the main change: a new 5-step wizard interface for fine-tuning. |
| Description check | ✅ Passed | The PR description covers all key sections: Summary, What's included (trainer, worker, endpoints, UI), Design decisions, linked issues, and test plan. It matches the repository template structure well. |
| Linked Issues check | ✅ Passed | All code changes meet the requirements from linked issues #8, #9, and #18: LoRA configuration controls with sliders [#8], QLoRA technique selection with safe defaults [#9], and real perplexity evaluation replacing None placeholder [#18]. |
| Out of Scope Changes check | ✅ Passed | All changes align with PR objectives: trainer evaluation, REST endpoints for adapter management, and the 5-step wizard UI. No unrelated modifications detected. |

Development

Successfully merging this pull request may close these issues.

  • evaluate_model() returns placeholder None values
  • Add QLoRA and advanced PEFT presets
  • Implement LoRA fine-tuning configuration workspace
