feat: 7-step wizard, experiment tracking, deploy tab, API package refactor by SahilKumar75 · Pull Request #44 · SahilKumar75/TuneOS

SahilKumar75 · 2026-06-01T16:51:29Z

Summary

7-step fine-tuning wizard — model selection → intent → data → configure → live training → results/eval → deploy
Experiment tracking — SQLite-backed ExperimentState persists every run with full loss history, perplexity, and metadata
Deploy tab — adapter download, HF Hub push, full model merge + GGUF export, GitHub push (all as Celery background tasks)
API package refactor — split the 783-line app/api.py monolith into app/api/ with 7 domain-scoped modules
Crash fix — rx.Base was removed in current Reflex; replaced with pydantic.BaseModel
DB path fix — experiment SQLite DB uses absolute path so it works regardless of working directory
SitemapPlugin — suppress startup warning by disabling the plugin in rxconfig.py
Finetune step components — app/components/finetune/ step1–4 extracted as reusable components

Closes

Closes Build backend services for metadata, datasets, and training orchestration #13 (backend API)
Closes Implement durable project and state management #12 (experiment tracking)
Closes Build chat-assisted model setup workflow #11 (wizard workflow)

Test plan

poetry run python -c "from app.app import app; print('OK')" — import clean
poetry run reflex run — starts without traceback or SitemapPlugin warning
Navigate to /finetune — 7-step wizard opens as workspace tab
Upload a dataset — preview appears, validation error shown for bad files
Start a training job — loss chart updates live, epoch log fills in
After training — results/eval tab shows perplexity, test chat works
Deploy tab — adapter download link present, HF Hub push form shows

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- 7-step fine-tuning wizard with visual progress indicators
- Experiment tracking and browsing with persistent history
- Dataset discovery, preview, and synthetic data generation
- Multi-target model deployment (HF Hub, GGUF export, GitHub, adapters)
- Real-time training dashboard with eval loss and learning rate metrics
- Chat testing interface for fine-tuned models
- Support for custom and local model uploads

…efactor ## New features - 7-step fine-tuning wizard (model → intent → data → configure → train → results → deploy) - Experiment tracking with SQLite persistence (app/state/experiment_state.py) - Deploy tab: adapter download, HF Hub push, merged model, GGUF export, GitHub push - Merge + GGUF export workers (trainer/merge.py, workers/merge_task.py) - Synthetic dataset generation and HuggingFace Hub dataset preview in wizard step 3 - AI commentary on training epochs via /api/jobs/{id}/commentary ## Refactoring - Split 783-line app/api.py monolith into app/api/ package (schemas, deps, system, models, datasets, jobs, experiments) - Add app/components/finetune/ step components for wizard steps 1–4 - Fix rx.Base crash: ExperimentRun now uses pydantic.BaseModel (rx.Base removed in current Reflex) - Fix experiment DB path: absolute path from __file__ instead of relative ./storage - Fix rxconfig.py SitemapPlugin startup warning ## Docs - README: add "What's New" section, Experiment Tracking and Model Deployment capability rows Closes #13, #12, #11 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-06-01T16:51:41Z

Warning

Review limit reached

@SahilKumar75, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 26 minutes and 36 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1225661-c64d-47a9-bdd9-672c6cae8a33

📥 Commits

Reviewing files that changed from the base of the PR and between 9866cca and 4828a57.

⛔ Files ignored due to path filters (1)

poetry.lock is excluded by !**/*.lock

📒 Files selected for processing (25)

.github/workflows/ci.yml
app/api/datasets_routes.py
app/api/deps.py
app/api/jobs_routes.py
app/api/models_routes.py
app/components/finetune/progress_bar.py
app/components/finetune/shared.py
app/components/finetune/step1_model.py
app/components/finetune/step2_intent.py
app/components/finetune/step3_data.py
app/components/finetune/step4_configure.py
app/pages/datasets.py
app/pages/finetune.py
app/state/experiment_state.py
app/state/finetune_state.py
hf_spaces/entrypoint.sh
hf_spaces/nginx.conf
tests/test_api.py
tests/test_evaluate.py
tests/test_workers.py
trainer/dataset.py
trainer/finetune.py
trainer/merge.py
workers/merge_task.py
workers/train_task.py

📝 Walkthrough

Walkthrough

TuneOS transitions from a monolithic 5-step finetune page to a modular 7-step wizard architecture with REST APIs, SQLite experiment tracking, and enhanced deployment. The backend is refactored from a single app/api.py into composable routers (system, models, datasets, jobs, experiments), and the training orchestration moves to REST + Redis pubsub. The frontend adds dedicated UI components for each step, new dataset discovery backed by API calls, and a reworked state model that coordinates REST job creation, async training progress, experiment persistence, and multi-target deployment.

Changes

Complete TuneOS Modernization

Layer / File(s)	Summary
API Schemas & Shared Dependencies `app/api/schemas.py`, `app/api/deps.py`, `app/api/__init__.py`	Defines all Pydantic request/response models (JobConfig, JobStatus, DatasetGenRequest, etc.) and shares Redis/GPU detection/model metadata via `deps.py`. The new `__init__.py` mounts routers and exports `app_api`.
System & Model Management Endpoints `app/api/system.py`, `app/api/models_routes.py`	Health checks, GPU detection, model list, and model validation endpoints backed by shared helpers.
Dataset Discovery & Synthetic Generation `app/api/datasets_routes.py`	Endpoints for Hugging Face Hub search, dataset preview (first 5 rows), and synthetic data generation using self-instruct, templates, and diversity scoring.
Experiment Tracking Persistence `app/state/experiment_state.py`, `app/api/experiments_routes.py`	SQLite-backed run storage with ExperimentRun Pydantic model, load/delete/select operations, and REST endpoints for listing and deleting experiments.
Training Job Lifecycle & Inference `app/api/jobs_routes.py`	Full job management: creation/status/cancellation, artifact downloads, Hugging Face pushes, adapter merging, GGUF export, GitHub pushes, training commentary, evaluation retrieval, and adapter-based inference with per-job caching.
Backend Trainer Updates `trainer/config.py`, `trainer/dataset.py`, `trainer/loader.py`, `trainer/finetune.py`, `trainer/merge.py`, `trainer/callbacks.py`	Extended ModelConfig with HF token and local model path; flexible dataset loading (HF Hub + local + column mapping); model loading with token support; new merge and GGUF export functions; enhanced callbacks with GPU metrics and progress reporting.
Async Worker Tasks `workers/train_task.py`, `workers/merge_task.py`	Celery tasks for finetune with Redis progress pubsub, adapter merge, GGUF quantization, and GitHub pushes with Git LFS, plus optional `spaces.GPU` decoration.
Frontend UI Component Library `app/components/finetune/progress_bar.py`, `app/components/finetune/shared.py`, `app/components/loss_chart.py`	Reusable Reflex components: step progress bar with connectors, card/label/heading primitives, navigation buttons, status badges, preview tables, and multi-series loss chart.
7-Step Finetune Wizard Components `app/components/finetune/step1_model.py`, `step2_intent.py`, `step3_data.py`, `step4_configure.py`	UI for Steps 1–4: model selection (hub/custom/local), intent capture, data ingestion (upload/hub/generate), and hyperparameter configuration (simple/advanced modes).
Complete Finetune Page `app/pages/finetune.py`	Full 7-step workflow with Steps 5–7: training dashboard (metrics/loss chart/commentary), results & evaluation (with chat testing and experiment comparison), and deployment (adapter/merged/HF/GGUF/GitHub targets).
Dataset Discovery Page `app/pages/datasets.py`	Refactored from static grid to HF Hub–style discovery with search input, result cards, preview panel, and API-backed DatasetState.
Frontend State Orchestration `app/state/finetune_state.py`	Complete rewrite for 7-step flow: REST job creation, Redis pubsub progress listening, loss/epoch tracking, automatic experiment persistence, evaluation polling, and deployment orchestration.
Pages & State Cleanup `app/pages/results.py`, `app/pages/training.py`, `app/state/job_state.py`, `app/state/model_state.py`, `rxconfig.py`	Redirect legacy pages to `/finetune`, deprecate JobState as a stub, remove training orchestration from ModelState, and disable sitemap plugin.
Documentation `README.md`	Add "What's New" section covering 7-step wizard, SQLite experiment tracking, and deployment capabilities.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

SahilKumar75/TuneOS#33: Introduces the first version of modular FastAPI routers (system, models, datasets, jobs) that this PR completely refactors, modularizes further, and extends with experiment tracking, deployment, and a reworked 7-step frontend.

Poem

🐰 Seven steps now grace the meadow fine,
Where models merge and GGUF align,
Redis whispers progress through the night,
SQLite keeps experiments bright!
From REST to chat, deployment takes flight—
TuneOS hops to a future so right! ✨

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/open-source-refactor

pyproject.toml had diverged from the lock file, causing all CI jobs to fail at the dependency install step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Rename loop variable i → _i (B007) - Add strict=False to zip() call (B905, auto-fix) - Sort/organise import blocks (I001, auto-fix) - Remove unused import (F401, auto-fix) - Apply ruff format across 19 files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- test_api: dataset_path is now optional (hub dataset jobs omit it); rename test to reflect correct expected behaviour (201 not 422) - test_workers: finetune() returns (output_path, model, tokenizer); fix mock return_value to be a 3-tuple so unpacking doesn't raise Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Docker build was failing because the root Dockerfile referenced hf_spaces/nginx.conf and hf_spaces/entrypoint.sh which didn't exist. - Add hf_spaces/nginx.conf: nginx proxies port 7860 → Reflex 3000 / API 8000 - Add hf_spaces/entrypoint.sh: full-stack boot (Redis, Celery, Reflex, nginx) test_evaluate.py was failing because only 'torch' was mocked but trainer/evaluate.py imports from torch.utils.data. Python's import machinery requires all submodule names to be present in sys.modules when the parent is a MagicMock (not a real package). - Add torch.utils, torch.utils.data, torch.nn, torch.cuda to mock chain All 41 tests pass (excluding test_pyqt.py which requires a display). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 7

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (24)

app/api/deps.py-28-33 (1)

28-33: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't collapse backend failures into "unknown".

Line 32 turns Redis/import failures into the same payload as a genuinely missing job. That makes UI error handling and operator debugging much harder, especially for the clear-error-surfacing objective.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/deps.py` around lines 28 - 33, The current _get_job_status_from_redis
function swallows all exceptions and returns the same {"status": "unknown",
"job_id": job_id} for import/Redis failures and for genuinely missing jobs;
change it so backend/import/Redis exceptions are surfaced separately by catching
only the "job not found" case (or checking for None) and returning
{"status":"unknown","job_id":job_id} only when the job truly doesn't exist,
while other exceptions (ImportError, redis.ConnectionError, etc.) should be
caught and returned as an error payload such as
{"status":"error","job_id":job_id,"error": str(e)} or re-raised; locate the
logic in _get_job_status_from_redis and references to get_job_status to
implement this distinction and include the exception message for operator
debugging.

app/api/datasets_routes.py-48-57 (1)

48-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle datasets that don’t define a train split
Hardcoding split="train[:5]" can raise for Hub datasets that only expose other splits (e.g., validation/test). Before calling load_dataset, list available split names with datasets.get_dataset_split_names(...) for the dataset/config and load a split that exists (prefer train if present, otherwise fall back to the first available), then apply [:5] to that chosen split.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/datasets_routes.py` around lines 48 - 57, The _load function
currently hardcodes split="train[:5]" which fails for datasets without a train
split; update _load to first call datasets.get_dataset_split_names(dataset_id,
config_name=...) (or get_dataset_config_names if needed) to retrieve available
splits, pick "train" if present otherwise the first available split, then call
load_dataset with f"{chosen_split}[:5]" (keep trust_remote_code=True) to load
only 5 rows; keep the rest of the logic that builds rows and ensure the outer
await asyncio.get_event_loop().run_in_executor(None, _load) still calls the
updated _load.

app/api/deps.py-11-13 (1)

11-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make shared storage paths absolute.

OUTPUT_DIR and DATASET_DIR are resolved relative to the current working directory. The API process and background workers can start from different directories, so the same run can write artifacts to one location and later look them up in another.

Suggested fix

+from pathlib import Path
+
+_BASE_DIR = Path(__file__).resolve().parents[2]
+
 REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379/0")
-OUTPUT_DIR = os.getenv("OUTPUT_DIR", "./outputs")
-DATASET_DIR = os.getenv("DATASET_DIR", "./storage/datasets")
+OUTPUT_DIR = str(Path(os.getenv("OUTPUT_DIR", str(_BASE_DIR / "outputs"))).resolve())
+DATASET_DIR = str(Path(os.getenv("DATASET_DIR", str(_BASE_DIR / "storage" / "datasets"))).resolve())

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/deps.py` around lines 11 - 13, OUTPUT_DIR and DATASET_DIR are
currently relative and can differ between processes; change their resolution to
absolute paths by converting the environment values to absolute (e.g., use
os.path.abspath or Path(...).resolve()) when assigning OUTPUT_DIR and
DATASET_DIR so both the API and workers reference the same filesystem location;
update the assignments that set OUTPUT_DIR and DATASET_DIR to wrap the os.getenv
result with the absolute-resolution call and optionally ensure directories exist
(mkdir(parents=True, exist_ok=True)) after resolution.

app/api/datasets_routes.py-71-76 (1)

71-76: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface self-instruct failures instead of silently templating.

When _self_instruct_generate fails here, the route quietly switches to _template_generate and still returns success. Users who explicitly chose HF-backed generation will get a different dataset than requested with no warning, and token/model misconfigurations stay hidden.

Suggested fix

         if hf_token and req.method in ("self_instruct", "few_shot"):
             try:
                 samples = _self_instruct_generate(req.user_intent, req.n_samples,
                                                    req.seed_examples, hf_token)
-            except Exception:
-                samples = []
+            except Exception as exc:
+                return {
+                    "samples": [],
+                    "dataset_path": "",
+                    "stats": {
+                        "total_generated": 0,
+                        "final_count": 0,
+                        "diversity_score": 0.0,
+                    },
+                    "warning": f"{req.method} generation failed: {exc}",
+                }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/datasets_routes.py` around lines 71 - 76, The route currently
swallows errors from _self_instruct_generate and silently falls back to
_template_generate; change this so failures are surfaced: when hf_token is
present and req.method is "self_instruct" or "few_shot", catch exceptions from
_self_instruct_generate, log the exception (including error details) and return
a proper error response (or re-raise) instead of setting samples=[] and
continuing to call _template_generate; update the datasets route handler around
the hf_token branch to propagate the error to the client so token/model
misconfigurations are visible and no silent fallback occurs.

trainer/config.py-11-13 (1)

11-13: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Validate model_source early.

This is a free-form string today, so a typo can silently fall through to the wrong loading path downstream. Please fail fast on unsupported values instead of relying on the inline comment for enforcement.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@trainer/config.py` around lines 11 - 13, Add early validation for the
model_source config so typos fail fast: in trainer/config.py replace or augment
the free-form model_source field (model_source) by enforcing allowed values
("hub", "local", "custom_string") — e.g., convert model_source to an Enum or add
a __post_init__/validator on the config class that checks model_source and
raises ValueError on unsupported values; ensure hf_token and local_model_path
remain unchanged but downstream consumers now can assume model_source is one of
the allowed options.

trainer/loader.py-13-17 (1)

13-17: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject nonexistent local model paths.

When model_source == "local" and the path is wrong, Line 28 flips local_files_only to False, so from_pretrained starts Hub resolution instead of failing locally. That turns a simple path bug into a misleading load error.

Suggested guard

 def _resolve_model_path(cfg: ModelConfig) -> str:
     """Return the model identifier to pass to from_pretrained."""
     if cfg.model_source == "local" and cfg.local_model_path:
         return cfg.local_model_path
     return cfg.model_name

 def load_model_and_tokenizer(cfg: ModelConfig):
@@
     model_path = _resolve_model_path(cfg)
     token = cfg.hf_token or os.getenv("HF_TOKEN") or None
-    local_only = cfg.model_source == "local" and os.path.exists(model_path)
+    if cfg.model_source == "local" and not os.path.exists(model_path):
+        raise FileNotFoundError(f"Local model path does not exist: {model_path}")
+    local_only = cfg.model_source == "local"

Also applies to: 28-28

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@trainer/loader.py` around lines 13 - 17, The _resolve_model_path function
must reject nonexistent local paths instead of returning model_name; when
ModelConfig.model_source == "local" and cfg.local_model_path is provided, verify
the path exists (and is a directory/file as expected) and raise a clear
ValueError if it does not; keep returning cfg.local_model_path only when the
path is valid so downstream logic (e.g., setting local_files_only) won’t flip to
remote Hub resolution unexpectedly.

trainer/merge.py-90-100 (1)

90-100: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not silently downgrade a quantized export to f16.

If the quantizer binary is missing, this returns model-f16.gguf and the caller still records the request as a successful quant_type export. That breaks the deploy contract and can produce a much larger artifact than the UI asked for.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@trainer/merge.py` around lines 90 - 100, The code currently falls back to
returning f16_path when no quantizer binary is found, which silently misreports
a quantized export; change the behavior in the block that checks quantize_bin
(symbols: quantize_bin, shutil.which("llama-quantize")/shutil.which("quantize"),
f16_path, gguf_path, quant_type, subprocess.run) so that if quantize_bin is None
you raise an explicit exception (e.g., RuntimeError or a custom ExportError)
explaining the quantizer is missing and include quant_type in the message,
rather than returning f16_path — this ensures the caller sees a failed
quantization and can handle/report it instead of recording a successful
quantized export.

trainer/dataset.py-45-49 (1)

45-49: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when the mapped dataset columns are missing.

If instruction_col or output_col is wrong, Lines 46-49 skip renaming and format_prompt falls back to empty strings. That silently trains on blank prompts/responses instead of surfacing a bad mapping.

Suggested guard

     # Normalise column names so format_prompt always sees "instruction" / "output"
+    if instruction_col not in raw.column_names:
+        raise ValueError(f"Instruction column '{instruction_col}' not found in dataset")
+    if output_col not in raw.column_names:
+        raise ValueError(f"Output column '{output_col}' not found in dataset")
+
     if instruction_col != "instruction" and instruction_col in raw.column_names:
         raw = raw.rename_column(instruction_col, "instruction")
     if output_col != "output" and output_col in raw.column_names:
         raw = raw.rename_column(output_col, "output")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@trainer/dataset.py` around lines 45 - 49, The rename block silently ignores
missing mapped columns causing format_prompt to see empty instruction/output;
instead validate presence of instruction_col and output_col on raw.column_names
and raise a clear error if either is missing before calling raw.rename_column
(referencing variables instruction_col, output_col and method
raw.rename_column). Ensure you only attempt raw.rename_column when the column
exists, but if a provided mapping is not found throw an exception with a message
identifying the missing column(s) so callers must fix the mapping.

trainer/callbacks.py-49-57 (1)

49-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make Redis publishing best-effort.

A Redis outage here will raise out of the callback and can fail the whole fine-tune even though the model training itself is still healthy. Progress streaming should not sit on the critical path for job execution.

Suggested hardening

-        self.redis.publish(self.channel, json.dumps(payload))
+        try:
+            self.redis.publish(self.channel, json.dumps(payload))
+        except Exception:
+            pass
@@
-        self.redis.publish(self.channel, json.dumps(payload))
+        try:
+            self.redis.publish(self.channel, json.dumps(payload))
+        except Exception:
+            pass

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@trainer/callbacks.py` around lines 49 - 57, Wrap Redis publishing in
on_train_end (and similarly where self.redis.publish is used) in a safe
try/except so a Redis outage won’t propagate and fail the training; catch broad
exceptions around self.redis.publish(self.channel, json.dumps(payload)), log the
error via the existing logger (or processLogger) with context (e.g., "failed to
publish training status to Redis" and include exception info) and continue
without re-raising; ensure the same pattern is applied to other publish calls
(e.g., in the method that sets self._start_time and any progress publishes) so
publishing is best-effort only.

app/pages/datasets.py-279-291 (1)

279-291: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the category sidebar actually filter datasets.

Clicking a category only changes the active styling. selected_category is never used when rendering STARTER_DATASETS or when building the search request, so the sidebar is a no-op.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/pages/datasets.py` around lines 279 - 291, The category sidebar only
toggles styling because selected_category is never used and on_click currently
invokes DatasetState.set_category immediately; update _category_item to pass a
callable (e.g., on_click=lambda *_: DatasetState.set_category(label)) so
clicking sets state at click time, then use DatasetState.selected_category when
rendering STARTER_DATASETS (filter the list by matching category) and include
selected_category in the search request parameters where you build requests
(e.g., the function that constructs/searches datasets such as
build_search_request or fetch_datasets) so the UI actually filters results.

app/api/jobs_routes.py-78-89 (1)

78-89: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Do not serialize HF/GitHub tokens into Celery task kwargs.

Passing tokens in the task payload stores them in the broker/backend and often exposes them to task inspection and logs. That unnecessarily broadens secret access beyond the API process.

Also applies to: 179-184, 238-243

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/jobs_routes.py` around lines 78 - 89, The task payload currently
serializes sensitive HF/GitHub tokens into run_finetune.apply_async kwargs (and
likewise at the other occurrences), which risks exposing secrets in the
broker/backend; remove any token fields from the kwargs passed to
run_finetune.apply_async and instead have the worker retrieve credentials
securely at runtime (e.g., fetch from a secrets store, environment variable, or
a secure DB keyed by job_id) inside the run_finetune task handler, using job_id
(and existing model_cfg/lora_cfg/train_cfg identifiers) to look up any per-job
secrets; update the other apply_async calls you referenced (the ones at the
other occurrences) to follow the same pattern so tokens are never serialized in
Celery messages.

app/state/experiment_state.py-53-70 (1)

53-70: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist loss_history through reloads.

save_experiment_run() writes loss_history, but the restored model drops it entirely, so reloaded experiments lose the epoch/loss series this feature is supposed to preserve.

Suggested fix

 class ExperimentRun(BaseModel):
     id: str = ""
     name: str = ""
     model_id: str = ""
     model_source: str = "hub"
     technique: str = "qlora"
     epochs: int = 3
     learning_rate: str = "2e-4"
     lora_r: int = 16
     batch_size: int = 4
     dataset_name: str = ""
     user_intent: str = ""
     final_loss: float = 0.0
     perplexity: float = 0.0
     started_at: str = ""
     finished_at: str = ""
     status: str = "unknown"
     output_path: str = ""
+    loss_history: list[dict[str, Any]] = []
@@
                 ExperimentRun(
                     id=r["id"],
                     name=r["name"] or "",
                     model_id=r["model_id"] or "",
                     model_source=r["model_source"] or "hub",
                     technique=r["technique"] or "qlora",
                     epochs=r["epochs"] or 3,
                     learning_rate=r["learning_rate"] or "2e-4",
                     lora_r=r["lora_r"] or 16,
                     batch_size=r["batch_size"] or 4,
                     dataset_name=r["dataset_name"] or "",
                     user_intent=r["user_intent"] or "",
                     final_loss=r["final_loss"] or 0.0,
                     perplexity=r["perplexity"] or 0.0,
                     started_at=r["started_at"] or "",
                     finished_at=r["finished_at"] or "",
                     status=r["status"] or "unknown",
                     output_path=r["output_path"] or "",
+                    loss_history=json.loads(r["loss_history"] or "[]"),
                 )

Also applies to: 94-113

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/experiment_state.py` around lines 53 - 70, The ExperimentRun model
is missing a persisted loss_history so reloads drop epoch/loss series; add a
loss_history field (e.g., List[float] or List[dict] depending on how epochs are
represented) to the ExperimentRun Pydantic model and to the other run model
defined later, then update save_experiment_run and the corresponding
load/restore routine to include loss_history in serialization and
deserialization so saved runs restore their epoch/loss series intact.

app/api/jobs_routes.py-127-136 (1)

127-136: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid in-memory ZIP assembly for model artifacts.

BytesIO forces the entire archive to live in RAM before the response starts. That is especially risky for /download-merged, where the artifact can be multi-GB and take down the API worker.

Also applies to: 199-208

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/jobs_routes.py` around lines 127 - 136, The current code builds the
ZIP in-memory using io.BytesIO (buf) and zipfile.ZipFile which risks OOM for
multi-GB artifacts; change to produce the ZIP on-disk or stream it incrementally
instead: create a temporary file (e.g., tempfile.NamedTemporaryFile) or use a
streaming zip iterator and open ZipFile writing to that file/stream while
yielding chunks to the client, then return a StreamingResponse or FileResponse
that streams the temporary ZIP with the same headers; update the adapter
download path that currently references adapter_dir, job_id and zipfile.ZipFile
to use the on-disk/temp-file or chunked generator approach and ensure the temp
file is cleaned up after streaming.

app/state/experiment_state.py-139-172 (1)

139-172: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not silently ignore failed experiment writes.

If SQLite or JSON serialization fails here, the caller still proceeds as if the run was saved. That turns DB issues into silent data loss for a core persistence path.

Suggested fix

 def save_experiment_run(run_data: dict[str, Any]):
     """Called from FinetuneState._save_experiment_record() — writes to SQLite."""
-    try:
-        _init_db()
-        with _get_conn() as conn:
-            conn.execute(
-                """
-                INSERT OR REPLACE INTO runs
-                (id, name, model_id, model_source, technique, epochs, learning_rate,
-                 lora_r, batch_size, dataset_name, user_intent, final_loss, perplexity,
-                 started_at, finished_at, status, output_path, loss_history)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
-                """,
-                (
-                    run_data.get("id", ""),
-                    run_data.get("name", ""),
-                    run_data.get("model_id", ""),
-                    run_data.get("model_source", "hub"),
-                    run_data.get("technique", "qlora"),
-                    run_data.get("epochs", 3),
-                    run_data.get("learning_rate", "2e-4"),
-                    run_data.get("lora_r", 16),
-                    run_data.get("batch_size", 4),
-                    run_data.get("dataset_name", ""),
-                    run_data.get("user_intent", ""),
-                    run_data.get("final_loss", 0.0),
-                    run_data.get("perplexity", 0.0),
-                    run_data.get("started_at", ""),
-                    run_data.get("finished_at", ""),
-                    run_data.get("status", "unknown"),
-                    run_data.get("output_path", ""),
-                    json.dumps(run_data.get("loss_history", [])),
-                ),
-            )
-    except Exception:
-        pass
+    _init_db()
+    with _get_conn() as conn:
+        conn.execute(
+            """
+            INSERT OR REPLACE INTO runs
+            (id, name, model_id, model_source, technique, epochs, learning_rate,
+             lora_r, batch_size, dataset_name, user_intent, final_loss, perplexity,
+             started_at, finished_at, status, output_path, loss_history)
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                run_data.get("id", ""),
+                run_data.get("name", ""),
+                run_data.get("model_id", ""),
+                run_data.get("model_source", "hub"),
+                run_data.get("technique", "qlora"),
+                run_data.get("epochs", 3),
+                run_data.get("learning_rate", "2e-4"),
+                run_data.get("lora_r", 16),
+                run_data.get("batch_size", 4),
+                run_data.get("dataset_name", ""),
+                run_data.get("user_intent", ""),
+                run_data.get("final_loss", 0.0),
+                run_data.get("perplexity", 0.0),
+                run_data.get("started_at", ""),
+                run_data.get("finished_at", ""),
+                run_data.get("status", "unknown"),
+                run_data.get("output_path", ""),
+                json.dumps(run_data.get("loss_history", [])),
+            ),
+        )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/experiment_state.py` around lines 139 - 172, The current try/except
around the DB insert (using _init_db(), _get_conn(), and run_data into the runs
table) swallows all exceptions causing silent data loss; change it to catch
Exception as e, log the error (including e and identifying
run_data.get("id")/name) using the module logger, and then propagate the failure
to the caller (re-raise the exception or return a failure status) so callers do
not proceed as if the run was saved; keep the insert logic and JSON
serialization for loss_history but do not suppress errors silently.

app/api/experiments_routes.py-13-15 (1)

13-15: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the same DB path source as ExperimentState.

This fallback points to ./storage/experiments.db, while app/state/experiment_state.py writes to a project-root absolute path. If the API starts from a different working directory, /experiments will query and delete from a different database.

Suggested fix

-def _get_db_path() -> str:
-    return os.getenv("EXPERIMENT_DB", "./storage/experiments.db")
+def _get_db_path() -> str:
+    from app.state.experiment_state import DB_PATH
+    return DB_PATH

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/api/experiments_routes.py` around lines 13 - 15, The _get_db_path
function currently falls back to a relative "./storage/experiments.db" which
differs from ExperimentState's absolute project-root path; update _get_db_path
to obtain the DB path from the same source as ExperimentState (e.g., import
ExperimentState and call its public accessor or constant such as
ExperimentState.get_db_path() or ExperimentState.DB_PATH) so both API routes and
ExperimentState target the identical absolute path instead of a relative
fallback.

app/pages/finetune.py-1137-1173 (1)

1137-1173: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This table is showing all completed runs, not this setup's runs.

ExperimentState.completed_runs only filters on status == "done" in app/state/experiment_state.py:84-85. The "Past runs with this setup" table therefore mixes unrelated models, datasets, and techniques, which makes the comparison misleading. Filter against the current run configuration before rendering it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/pages/finetune.py` around lines 1137 - 1173, The table currently iterates
over ExperimentState.completed_runs (which only filters by status) and thus
shows unrelated runs; change the rendering to first compute a filtered list of
completed runs that match the current run's setup (compare identifying fields on
ExperimentState.current_run such as model_id, dataset_id, and any
setup_hash/technique/hyperparams) and pass that filtered list to rx.foreach
instead of ExperimentState.completed_runs; e.g., produce a filtered_runs =
ExperimentState.completed_runs.filter(lambda r: r.model_id ==
ExperimentState.current_run.model_id and r.dataset_id ==
ExperimentState.current_run.dataset_id [and r.setup_hash ==
ExperimentState.current_run.setup_hash or compare technique/hyperparams as
needed]) and use filtered_runs in the rx.foreach call that builds the table
rows.

app/pages/finetune.py-1107-1115 (1)

1107-1115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix on_key_down to not block typing on non-Enter keys

rx.prevent_default is executed for every non-"Enter" key, which cancels the browser’s default keydown behavior and can prevent characters from being inserted into the input. Reflex’s on_key_down handler receives the pressed key as a string, so keep the "Enter" check but apply rx.prevent_default only in the Enter branch (or remove it entirely) and let other keys fall through.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/pages/finetune.py` around lines 1107 - 1115, The on_key_down handler on
the rx.input is calling rx.prevent_default for every non-"Enter" key which
blocks normal typing; update the lambda so rx.prevent_default is only invoked
when the key is "Enter" (or remove it entirely) and let other keys fall through
to normal behavior; locate the rx.input using FinetuneState.chat_input,
FinetuneState.set_chat_input and FinetuneState.send_test_chat and adjust the
on_key_down conditional so the Enter branch runs send_test_chat (and optionally
prevent_default) while the non-Enter branch does nothing.

app/state/finetune_state.py-717-725 (1)

717-725: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Save the experiment after auto-eval finishes, not before it starts.

This persists the run before _auto_eval() populates self.eval_perplexity, so successful runs are stored with a placeholder perplexity of 0.0.

Also applies to: 766-787

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 717 - 725, The experiment record is
being saved before `_auto_eval()` runs so `self.eval_perplexity` remains at its
placeholder; update the flow in the block that checks `self.training_status ==
"done"` (and the similar block around the second occurrence) to run `await
self._auto_eval()` first, wait for it to complete and populate
`self.eval_perplexity`, then call `await self._save_experiment_record()`; keep
the `async with self:` that sets `self.current_step = 6` as needed but ensure
the final save happens after `_auto_eval()` finishes so persisted records
contain the real perplexity.

app/state/finetune_state.py-333-344 (1)

333-344: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize Hub preview rows to the shared table schema before storing them.

_preview_table only reads instruction and output, but this code keeps the API's raw preview rows. Datasets with different column names will render fine in /datasets and then break here.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 333 - 344, The hub preview rows are
stored raw but must be normalized to the shared table schema so _preview_table
(which expects "instruction" and "output") won't break; after parsing data in
the block that sets self.hub_dataset_columns and before assigning
self.hub_dataset_preview, map each row (using the returned columns list from
data.get("columns", [])) into a normalized dict with keys "instruction" and
"output" by locating the column indices for the detected instruction/output
names (or falling back to
self.hub_dataset_instruction_col/self.hub_dataset_output_col) and building rows
like {"instruction": row[idx_inst], "output": row[idx_out]} (or empty string if
missing); then assign that normalized list to self.hub_dataset_preview and
continue setting self.hub_dataset_instruction_col and
self.hub_dataset_output_col as now.

app/state/finetune_state.py-418-435 (1)

418-435: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist the detected upload column mapping into the training payload.

The preview remaps arbitrary upload columns to instruction/output, but the job request still sends the Hub column fields. That means a local CSV/JSON upload with columns like prompt and answer will preview fine here and then fail once trainer/finetune.py tries to read instruction_col / output_col.

Also applies to: 602-606
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 418 - 435, The preview remapping
detects which upload columns map to instruction/output but doesn't persist that
mapping into the training job payload; update the state to store the detected
names (e.g., set properties like self.instruction_col and self.output_col or
equivalent fields used when building the job request) in the same block that
computes inst_col/out_col so the trainer uses those values; also apply the same
persistence change in the later block around the other preview code (lines
referenced in the review) so both CSV/JSON upload paths include the mapped
column names when constructing the payload sent to trainer/finetune.py.

app/state/finetune_state.py-293-295 (1)

293-295: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

ZIP uploads need to be extracted before treating the directory as a model path.

For .zip files this stores the archive, then points local_model_path at the parent directory without unpacking it. The training path will not contain model files yet.

Proposed fix

+        import zipfile
+
-        self.local_model_path = dest_dir if file.filename.endswith(".zip") else dest_path
+        if file.filename.endswith(".zip"):
+            with zipfile.ZipFile(dest_path) as archive:
+                archive.extractall(dest_dir)
+            self.local_model_path = dest_dir
+        else:
+            self.local_model_path = dest_path

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 293 - 295, The current assignment
in FinetuneState (setting self.local_model_path = dest_dir if
file.filename.endswith(".zip") else dest_path) points to the ZIP archive's
parent directory without extracting it; update the upload handling so that when
file.filename endswith(".zip") you extract the ZIP into dest_dir (e.g., using
Python's zipfile) and then set self.local_model_path to the extraction directory
(not the zip or its parent), keeping self.model_source = "local" and
self.is_validating_model = False; modify the logic around dest_dir/dest_path and
the ZIP branch in the method that sets local_model_path to ensure training
receives the extracted model files.

app/state/finetune_state.py-578-635 (1)

578-635: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard start_training against duplicate submissions.

There is no re-entry check here, so a fast double-click can POST multiple jobs before the UI reflects is_starting.

Proposed fix

 async def start_training(self):
-        if not self.can_start_training:
+        if self.is_starting or self.training_status == "running" or not self.can_start_training:
             return

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 578 - 635, The start_training
method can be re-entered before the UI/state updates, so add an early atomic
re-entry guard: inside start_training, acquire the existing instance lock (async
with self) immediately, check if self.is_starting or self.training_status ==
"running" (or !self.can_start_training) and return if true, then set
self.is_starting = True and other initial fields (experiment_id/name,
training_start_time, etc.) while still holding the lock to prevent a race;
release the lock and proceed to build payload and POST. Also ensure all
exception paths reset self.is_starting = False (and set start_error) inside the
lock so a failed attempt doesn't permanently block future starts. Use the
existing symbols start_training, is_starting, can_start_training,
training_status, and the async with self lock to implement this.

app/state/finetune_state.py-660-699 (1)

660-699: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle terminal pubsub messages before appending loss samples.

The Redis done payload does not include loss or epoch, but this loop still appends a zero-valued point before breaking. That will reset the visible chart tail and poison the saved final_loss.

Proposed fix

-            current_loss = data.get("loss", 0)
-            current_epoch = data.get("epoch", 0)
+            status = data.get("status")
+            if status in ("done", "failed"):
+                async with self:
+                    self.training_status = status
+                break
+
+            if "loss" not in data:
+                continue
+
+            current_loss = data["loss"]
+            current_epoch = data.get("epoch", 0)
@@
-            if data.get("status") in ("done", "failed"):
-                async with self:
-                    self.training_status = data.get("status", "done")
-                break

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/state/finetune_state.py` around lines 660 - 699, The loop currently
computes current_loss/current_epoch and immediately appends to self.loss_history
(see variables current_loss, current_epoch and the append into
self.loss_history) even when the pubsub payload is a terminal message with
status "done" or "failed", which causes a bogus zero-valued sample; move the
terminal-status check (data.get("status") in ("done","failed")) to occur before
computing/appending loss samples and before updating epoch/epoch_start_loss so
that when a terminal message arrives you only set self.training_status (via
async with self: self.training_status = ...) and break without mutating
loss_history, epoch_start_loss, prev_epoch, or other training metrics. Ensure
the rest of the epoch-boundary logic still runs for non-terminal messages.

app/components/finetune/step4_configure.py-35-72 (1)

35-72: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Hide the simple form when Advanced mode is enabled.

The toggle currently only adds the advanced card; it never removes the simple one, so advanced mode renders both control sets at once.

Proposed fix

-        # Simple mode
-        _card(
-            rx.vstack(
-                rx.grid(
-                    rx.vstack(
-                        _label("Epochs"),
-                        rx.input(value=FinetuneState.epochs.to_string(),
-                                 on_change=FinetuneState.set_epochs,
-                                 type="number", width="100%"),
-                        rx.text("One full pass through your dataset",
-                                font_size="0.72rem", color=c("text_muted")),
-                        spacing="1",
-                    ),
-                    rx.vstack(
-                        _label("Learning rate"),
-                        rx.select.root(
-                            rx.select.trigger(width="100%"),
-                            rx.select.content(
-                                *[rx.select.item(f"{lr} — {desc}", value=lr)
-                                  for lr, desc in _LR_PRESETS],
-                            ),
-                            value=FinetuneState.learning_rate,
-                            on_change=FinetuneState.set_learning_rate,
-                        ),
-                        spacing="1",
-                    ),
-                    rx.vstack(
-                        _label("Technique"),
-                        rx.text(FinetuneState.technique_label, font_size="0.88rem",
-                                font_weight="500", color=c("accent")),
-                        rx.text("Change in Step 1", font_size="0.72rem", color=c("text_muted")),
-                        spacing="1",
-                    ),
-                    columns="3", spacing="4", width="100%",
-                ),
-                spacing="0",
-            )
-        ),
+        # Simple mode
+        rx.cond(
+            FinetuneState.ui_mode != "advanced",
+            _card(
+                rx.vstack(
+                    rx.grid(
+                        rx.vstack(
+                            _label("Epochs"),
+                            rx.input(value=FinetuneState.epochs.to_string(),
+                                     on_change=FinetuneState.set_epochs,
+                                     type="number", width="100%"),
+                            rx.text("One full pass through your dataset",
+                                    font_size="0.72rem", color=c("text_muted")),
+                            spacing="1",
+                        ),
+                        rx.vstack(
+                            _label("Learning rate"),
+                            rx.select.root(
+                                rx.select.trigger(width="100%"),
+                                rx.select.content(
+                                    *[rx.select.item(f"{lr} — {desc}", value=lr)
+                                      for lr, desc in _LR_PRESETS],
+                                ),
+                                value=FinetuneState.learning_rate,
+                                on_change=FinetuneState.set_learning_rate,
+                            ),
+                            spacing="1",
+                        ),
+                        rx.vstack(
+                            _label("Technique"),
+                            rx.text(FinetuneState.technique_label, font_size="0.88rem",
+                                    font_weight="500", color=c("accent")),
+                            rx.text("Change in Step 1", font_size="0.72rem", color=c("text_muted")),
+                            spacing="1",
+                        ),
+                        columns="3", spacing="4", width="100%",
+                    ),
+                    spacing="0",
+                )
+            ),
+            rx.fragment(),
+        ),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/components/finetune/step4_configure.py` around lines 35 - 72, The
simple-mode _card should be rendered only when advanced mode is disabled; wrap
the simple form block (the _card(...) that contains the Epochs/LR/Technique
grid) in a conditional that checks the FinetuneState advanced flag (e.g., render
only if not FinetuneState.advanced or FinetuneState.is_advanced === false).
Update the JSX/reactive expression so the simple card is omitted when the
advanced toggle is on, using the same FinetuneState property used by the
advanced card/toggle to ensure it hides immediately when advanced mode is
enabled.

🟡 Minor comments (1)

app/pages/datasets.py-67-83 (1)
67-83: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don't render the curated grid after an empty or failed search.

A zero-result search and any request error both leave search_results empty, so the UI falls straight back to the starter cards. That makes it look like those curated datasets matched the query instead of showing “no results” or “search failed.”

Also applies to: 344-365
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/pages/datasets.py` around lines 67 - 83, When a search completes
(success, zero-results, or exception), mark that a search was performed and
distinguish failures from no-results so the curated grid isn't shown; inside the
async search block update self.search_performed = True in the success and except
branches, set self.search_results to the returned list on 200, set
self.search_results = [] when the API returns zero results, and on exception set
self.search_error = True (and leave search_results as None or empty) while
always clearing self.is_searching; update the UI to check self.search_performed
/ self.search_error (instead of only truthiness of self.search_results) before
rendering the curated grid.

🧹 Nitpick comments (1)

app/pages/finetune.py (1)

28-149: ⚡ Quick win

Reuse the shared progress bar component instead of re-implementing it here.

app/components/finetune/progress_bar.py already defines the same _STEP_LABELS, _step_dot(), and _progress_bar() logic. Keeping a second copy in this page guarantees drift the next time the wizard steps or styles change.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/pages/finetune.py` around lines 28 - 149, This file duplicates the
progress bar implementation (_STEP_LABELS, _step_dot, _progress_bar) which
already lives in app/components/finetune/progress_bar.py; remove the local
copies and import the shared components instead. Replace the local symbols
_STEP_LABELS, _step_dot, and _progress_bar with imports from the shared module
(e.g., from app.components.finetune.progress_bar import _STEP_LABELS, _step_dot,
_progress_bar) and ensure any references to FinetuneState remain compatible; if
the shared module exposes different names, adapt the import names or use aliases
to match the existing usages in this file. Ensure no other local references rely
on removed helpers before committing.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b6b92205-a57a-4efe-8ffc-fbc488e83d5c

📥 Commits

Reviewing files that changed from the base of the PR and between 5d7d623 and 9866cca.

📒 Files selected for processing (34)

README.md
app/api.py
app/api/__init__.py
app/api/datasets_routes.py
app/api/deps.py
app/api/experiments_routes.py
app/api/jobs_routes.py
app/api/models_routes.py
app/api/schemas.py
app/api/system.py
app/components/finetune/progress_bar.py
app/components/finetune/shared.py
app/components/finetune/step1_model.py
app/components/finetune/step2_intent.py
app/components/finetune/step3_data.py
app/components/finetune/step4_configure.py
app/components/loss_chart.py
app/pages/datasets.py
app/pages/finetune.py
app/pages/results.py
app/pages/training.py
app/state/experiment_state.py
app/state/finetune_state.py
app/state/job_state.py
app/state/model_state.py
rxconfig.py
trainer/callbacks.py
trainer/config.py
trainer/dataset.py
trainer/finetune.py
trainer/loader.py
trainer/merge.py
workers/merge_task.py
workers/train_task.py

💤 Files with no reviewable changes (2)

app/api.py
app/state/model_state.py

coderabbitai · 2026-06-01T17:09:12Z

+                            rx.slider(min=4, max=128, step=4,
+                                      default_value=[FinetuneState.lora_r],
+                                      on_value_commit=FinetuneState.set_lora_r),


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Make the LoRA sliders controlled (use value instead of default_value).
In app/pages/finetune.py, the sliders at the cited lines use default_value=... (e.g., the one wiring on_value_commit=FinetuneState.set_lora_r), which leaves the thumb uncontrolled and only initializes from state once; it won’t reliably reflect updated FinetuneState after restore/back-navigation. Switch to value=[...] bound to the corresponding FinetuneState fields.

Also applies to the similar slider block around the other cited lines (730-732).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/pages/finetune.py` around lines 714 - 716, The LoRA sliders are using default_value which only initializes the thumb and won’t reflect later state changes; change the slider args to use value bound to the FinetuneState fields (e.g., replace default_value=[FinetuneState.lora_r] with value=[FinetuneState.lora_r]) for the slider wired to on_value_commit=FinetuneState.set_lora_r, and do the same for the other similar LoRA slider block that commits to a FinetuneState setter (bind value to the corresponding FinetuneState.<field> property so the slider stays controlled and updates on restore/navigation).

Deferred to a follow-up issue: the slider fix requires verifying the Reflex slider API for the installed version (rx.slider value vs default_value semantics differ across Reflex releases). The change is low-risk but needs a UI smoke test before merging. Filed as a follow-up.

@SahilKumar75 That's a sensible approach — Reflex's slider API has indeed shifted between minor versions, and a quick smoke test before committing to value=[...] is the right call. Glad it's tracked. Feel free to link the follow-up issue here when it's ready if you'd like me to keep an eye on it.

- datasets_routes: trust_remote_code=False for user-supplied dataset IDs - trainer/dataset: trust_remote_code=False + fail-fast on missing columns - jobs_routes: _resolve_job_dir() blocks path-traversal in all job_id joins - jobs_routes: per-job Lock + double-checked load for _INFER_CACHE - finetune_state: os.path.basename() on uploaded filenames (model + dataset) - finetune_state: extract ZIP uploads before setting local_model_path - merge_task: validate repo_url is https://github.com/, supply token via GIT_ASKPASS credential helper (never in URL/args), scrub token from errors - hf_spaces: add missing nginx.conf and entrypoint.sh (Docker build fix) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

GitHub-hosted runners only have ~14 GB free. Installing the full ML stack (torch, transformers, bitsandbytes, trl, peft, accelerate) fills it up mid-build. - Add jlumbroso/free-disk-space to reclaim ~25 GB (Android SDK, .NET, Haskell, large packages, cached Docker images, swap) - Change load: true → load: false — CI only needs to verify the build succeeds, not load the image into the local daemon Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…actor (#44) Squash-merges feat/open-source-refactor. All CI checks green.

SahilKumar75 and others added 4 commits June 1, 2026 22:30

fix: regenerate poetry.lock to fix stale lock file

9984a7e

pyproject.toml had diverged from the lock file, causing all CI jobs to fail at the dependency install step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Jun 1, 2026

View reviewed changes

SahilKumar75 and others added 2 commits June 1, 2026 22:42

SahilKumar75 merged commit ed3d3d8 into main Jun 1, 2026
5 checks passed

SahilKumar75 deleted the feat/open-source-refactor branch June 1, 2026 18:37

SahilKumar75 added a commit that referenced this pull request Jun 1, 2026

feat: 7-step wizard, experiment tracking, deploy tab, API package ref…

d4ce073

…actor (#44) Squash-merges feat/open-source-refactor. All CI checks green.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: 7-step wizard, experiment tracking, deploy tab, API package refactor#44

feat: 7-step wizard, experiment tracking, deploy tab, API package refactor#44
SahilKumar75 merged 7 commits into
mainfrom
feat/open-source-refactor

SahilKumar75 commented Jun 1, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 1, 2026 •

edited

Loading

Uh oh!

SahilKumar75 Jun 1, 2026

Uh oh!

coderabbitai Bot Jun 1, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

SahilKumar75 commented Jun 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Closes

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SahilKumar75 Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SahilKumar75 commented Jun 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 1, 2026 •

edited

Loading

coderabbitai Bot Jun 1, 2026 •

edited

Loading