
Fix finetuning methods #56

Merged
RETR0-OS merged 10 commits into main from fix-finetuning-methods
Feb 15, 2026

Conversation

@RETR0-OS
Member

Fixes the DPO, PPO, LoRA, and QLoRA strategies.

The SFT strategy was using a raw transformers.Trainer with manual
tokenization and data collation, duplicating what SFTTrainer does
automatically. Replace it with TRL's SFTTrainer, which handles
tokenization, label creation, and data collation natively from
a text field. Also use SFTConfig instead of TrainingArguments,
and pass peft_config so that SFTTrainer manages the LoRA wrapping.
Use the processing_class parameter instead of the deprecated tokenizer argument.
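
For illustration, here is a minimal sketch of the new wiring, using placeholder names (model, tokenizer, train_dataset, config, get_peft_config) rather than the exact strategy code:

from trl import SFTTrainer, SFTConfig

# SFTConfig replaces TrainingArguments; SFTTrainer tokenizes and collates
# the named text field itself, so no manual Trainer plumbing is needed.
training_args = SFTConfig(
    output_dir=config.get("output_dir", "./results"),
    dataset_text_field="text",
    num_train_epochs=config.get("num_train_epochs", 3),
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # replaces the deprecated tokenizer= kwarg
    peft_config=get_peft_config(),  # SFTTrainer applies the LoRA wrapping
)
trainer.train()
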
The QLoRA strategy duplicated SFT's broken manual tokenization
approach. Replace it with TRL's SFTTrainer + SFTConfig just like SFT,
but with QLoRA-optimized defaults: a higher LoRA rank (64), bf16,
gradient checkpointing, and a paged optimizer. SFTTrainer handles
tokenization and data collation automatically.
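
Sketched with only the settings named above taken from this change (the rest are illustrative placeholders):

from peft import LoraConfig
from trl import SFTConfig

# Higher rank than the SFT default; the 4-bit base weights in QLoRA
# leave memory headroom for a larger adapter.
peft_config = LoraConfig(r=64, lora_alpha=16, task_type="CAUSAL_LM")

training_args = SFTConfig(
    output_dir="./results",
    bf16=True,                    # bf16 mixed precision
    gradient_checkpointing=True,  # trades compute for activation memory
    optim="paged_adamw_32bit",    # paged optimizer smooths memory spikes
)
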
In the DPO strategy, replace the deprecated tokenizer parameter with
processing_class and the deprecated evaluation_strategy with
eval_strategy. Add a peft_config parameter so DPOTrainer handles the
LoRA wrapping instead of manual get_peft_model calls, and extract a
get_peft_config method for consistency with the other strategies.
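
Roughly, with get_peft_config as the extracted helper and the other names as placeholders:

from trl import DPOTrainer, DPOConfig

training_args = DPOConfig(
    output_dir="./results",
    eval_strategy="steps",  # renamed from the deprecated evaluation_strategy
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # replaces the deprecated tokenizer= kwarg
    peft_config=get_peft_config(),  # no manual get_peft_model() call
)
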
The RLHF strategy used the old PPOTrainer API (PPOConfig with
model_name, PPOTrainer with a config= parameter), which no longer
exists in TRL 0.16+. PPO also requires a separate reward model,
which users of a no-code tool won't have.

Replace it with DPO (Direct Preference Optimization), which achieves
comparable alignment results without a reward model, uses the
same dataset format (prompt/chosen/rejected), and fits the modern TRL
API with the processing_class and peft_config parameters.
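
For reference, illustrative rows in that format (the actual contents are user-supplied):

from datasets import Dataset

# Each example pairs one prompt with a preferred and a dispreferred reply.
preference_data = Dataset.from_list([
    {
        "prompt": "Summarize: The cat sat on the mat.",
        "chosen": "A cat sat on a mat.",
        "rejected": "Cats are popular pets.",
    },
])
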
When using Unsloth, provider.prepare_for_training() applies LoRA
via FastLanguageModel.get_peft_model(). The strategies also pass
peft_config to SFTTrainer/DPOTrainer, which would wrap the model a
second time. Add a _peft_already_applied flag to the config when
Unsloth prepares the model, and skip peft_config in create_trainer
when that flag is set.
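
A sketch of the guard, again with placeholder names for the surrounding strategy code:

def create_trainer(self, model, tokenizer, train_dataset, config):
    peft_config = None
    if not config.get("_peft_already_applied"):
        # Unsloth's FastLanguageModel.get_peft_model() has not run,
        # so let the TRL trainer apply LoRA itself.
        peft_config = self.get_peft_config()
    return SFTTrainer(
        model=model,
        train_dataset=train_dataset,
        processing_class=tokenizer,
        peft_config=peft_config,  # None skips the second LoRA wrap
    )
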
ModelForge runs entirely locally, so asking users to "upload" a
dataset makes no sense; the file already exists on their system.
Replace the /load_settings POST endpoint (which accepted a multipart
file upload) with /validate_dataset_path, which accepts a local
file path and validates that it exists and is JSON/JSONL. Remove the
now-unused imports of UploadFile, File, Form, uuid, and FileManager.
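
Condensed, the replacement handler looks roughly like this (the full version appears in the review excerpts below):

import os

from fastapi import APIRouter, HTTPException

router = APIRouter()

@router.post("/validate_dataset_path")
async def validate_dataset_path(data: dict):
    dataset_path = os.path.abspath(data.get("dataset_path", "").strip())
    if not os.path.isfile(dataset_path):
        raise HTTPException(status_code=400, detail=f"File not found: {dataset_path}")
    if not dataset_path.endswith((".json", ".jsonl")):
        raise HTTPException(status_code=400, detail="Only .json and .jsonl files are supported.")
    return {"success": True, "file_path": dataset_path}
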
Since ModelForge runs locally, replace the "Upload Dataset" file
picker with a text input where users enter the path to their local
dataset file. The frontend now calls validateDatasetPath instead
of uploadDataset, sending the path as JSON to the new backend
endpoint. This avoids unnecessary file copying.
Copilot AI review requested due to automatic review settings February 11, 2026 05:31
@RETR0-OS changed the title [WIPFix finetuning methods → [WIP] Fix finetuning methods Feb 11, 2026
@RETR0-OS marked this pull request as draft February 11, 2026 05:31
Contributor

Copilot AI left a comment

Pull request overview

This PR updates ModelForge’s fine-tuning pipeline to use TRL’s newer trainer/config APIs (SFT/DPO) and adjusts the UI + API flow to validate local dataset paths instead of uploading datasets.

Changes:

  • Refactors SFT/QLoRA/DPO/RLHF strategies to rely on TRL trainers with SFTConfig/DPOConfig and pass LoRA via peft_config (avoiding double-wrapping with Unsloth).
  • Replaces dataset upload with a local-path validation endpoint and updates the frontend to submit dataset paths.
  • Packages the built frontend assets and updates backend CORS configuration.

Reviewed changes

Copilot reviewed 18 out of 23 changed files in this pull request and generated 5 comments.

Summary per file:

  • pyproject.toml: Includes packaged frontend build assets in the Python distribution.
  • ModelForge/strategies/sft_strategy.py: Migrates SFT to TRL SFTTrainer/SFTConfig and PEFT config handoff.
  • ModelForge/strategies/rlhf_strategy.py: Reworks the RLHF flow to use DPO-style preference training via DPOTrainer.
  • ModelForge/strategies/qlora_strategy.py: Migrates QLoRA to TRL SFTTrainer/SFTConfig and PEFT config handoff.
  • ModelForge/strategies/dpo_strategy.py: Updates DPO to pass LoRA via peft_config and uses modern TRL trainer args.
  • ModelForge/services/training_service.py: Marks _peft_already_applied when Unsloth has already wrapped the model.
  • ModelForge/routers/finetuning_router.py: Adds /validate_dataset_path and removes the dataset-upload flow.
  • ModelForge/app_old.py: Removes the legacy FastAPI app entrypoint.
  • ModelForge/app.py: Makes CORS origins configurable via the CORS_ORIGINS env var.
  • ModelForge/Frontend/build/static/js/main.9df15fbf.js.LICENSE.txt: Updates packaged frontend build artifact.
  • ModelForge/Frontend/build/index.html: Updates packaged frontend build artifact entrypoint hashes.
  • ModelForge/Frontend/build/asset-manifest.json: Updates packaged frontend build artifact hashes/manifest.
  • Frontend/src/services/api.js: Replaces the dataset upload API call with a dataset path validation call.
  • Frontend/src/pages/TechnicalDetailsPage.jsx: Removes an unused router import.
  • Frontend/src/pages/Loading.jsx: Removes an unused demo helper; minor request-handling cleanup.
  • Frontend/src/pages/ListModels.jsx: Removes unused router imports.
  • Frontend/src/pages/FinetuningSettingsPage.jsx: Switches the dataset selection UI from file upload to path input + validation call.
  • Frontend/src/components/Navbar.jsx: Removes an unused footer stub.
  • Frontend/src/App.js: Removes unused imports/page reference.
  • Frontend/package-lock.json: Dependency lockfile updates (including removal of MUI).
Files not reviewed (1)
  • Frontend/package-lock.json: Language not supported

Comment on lines +386 to +390
dataset_path = data.get("dataset_path", "").strip()
logger.info(f"Validating dataset path: {dataset_path}")

if not dataset_path:
    raise HTTPException(status_code=400, detail="Dataset path cannot be empty.")

Copilot AI Feb 11, 2026

dataset_path = data.get(...).strip() will throw an AttributeError (500) if dataset_path is not a string (e.g., null/number). Consider using a small Pydantic body model (or explicit type check/coercion) so invalid payloads return a 400 with a clear error instead of crashing.
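
A sketch of that suggestion, with DatasetPathRequest as a hypothetical model name and router assumed from the existing module:

from fastapi import HTTPException
from pydantic import BaseModel

class DatasetPathRequest(BaseModel):
    dataset_path: str

@router.post("/validate_dataset_path")
async def validate_dataset_path(body: DatasetPathRequest):
    # FastAPI now rejects null/number payloads with a 422 validation
    # error before the handler runs, so .strip() can no longer raise.
    dataset_path = body.dataset_path.strip()
    if not dataset_path:
        raise HTTPException(status_code=400, detail="Dataset path cannot be empty.")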

Comment on lines 392 to +421
      try:
-         # Validate file type
-         if not json_file.filename.endswith(('.json', '.jsonl')):
+         # Normalize the path
+         dataset_path = os.path.abspath(dataset_path)
+
+         # Check file exists
+         if not os.path.isfile(dataset_path):
              raise HTTPException(
                  status_code=400,
-                 detail="Invalid file type. Only JSON and JSONL files are allowed."
+                 detail=f"File not found: {dataset_path}"
              )

-         # Read file content
-         file_content = await json_file.read()
-
-         # Generate unique filename
-         file_id = str(uuid.uuid4())
-         filename = f"{file_id}_{json_file.filename}"
+         # Validate file type
+         if not dataset_path.endswith(('.json', '.jsonl')):
+             raise HTTPException(
+                 status_code=400,
+                 detail="Invalid file type. Only .json and .jsonl files are supported."
+             )

-         # Save file
-         default_dirs = file_manager.return_default_dirs()
-         file_path = os.path.join(default_dirs["datasets"], filename)
-         file_manager.save_file(file_path, file_content)
+         # Get file size
+         file_size = os.path.getsize(dataset_path)
+         filename = os.path.basename(dataset_path)

-         logger.info(f"Dataset uploaded successfully: {file_path}")
+         logger.info(f"Dataset path validated: {dataset_path} ({file_size} bytes)")

          return {
              "success": True,
-             "file_path": file_path,
+             "file_path": dataset_path,
              "filename": filename,
-             "message": "Dataset uploaded successfully",
+             "file_size": file_size,
+             "message": "Dataset path is valid",

Copilot AI Feb 11, 2026

This endpoint accepts an arbitrary local path and returns its absolute path and existence/size. If the API is ever exposed beyond localhost (misconfig, container port mapping, etc.), this becomes an information disclosure primitive. Consider restricting validation to an allowed base directory (e.g., a configured datasets folder) and avoid echoing absolute paths back to the client.
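
One way to sketch that restriction; DATASETS_DIR is an assumed setting, not something this PR adds:

import os

DATASETS_DIR = os.path.abspath("./datasets")  # assumed allowed base directory

def is_within_datasets_dir(path: str) -> bool:
    resolved = os.path.realpath(path)  # resolves symlinks and ".." segments
    try:
        return os.path.commonpath([resolved, DATASETS_DIR]) == DATASETS_DIR
    except ValueError:  # e.g. paths on different Windows drives
        return False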

type="text"
id="dataset_path"
name="dataset_path"
placeholder="C:\path\to\your\dataset.json"

Copilot AI Feb 11, 2026

The placeholder string uses Windows-style backslashes, but in JS string literals sequences like \t become escape sequences (tab, etc.), so the placeholder will render incorrectly. Escape the backslashes (e.g., C:\\path\\to\\...) or use a forward-slash example.

Suggested change
placeholder="C:\path\to\your\dataset.json"
placeholder="C:\\path\\to\\your\\dataset.json"

gradient_accumulation_steps=config.get("gradient_accumulation_steps", 4),
# QLoRA uses paged optimizers for memory efficiency
optim=config.get("optim", "paged_adamw_32bit"),
save_steps=config.get("save_steps", 0),

Copilot AI Feb 11, 2026

This sets save_steps to 0 by default, but save_strategy is configured as "steps" below; that combination can raise a ValueError and prevent training when save_steps is omitted. Consider defaulting save_steps to a positive value or deriving save behavior from the incoming config consistently.

Suggested change
save_steps=config.get("save_steps", 0),
save_steps=config.get("save_steps", config.get("eval_steps", 100)),

Comment on lines 195 to 199
eval_strategy="steps" if eval_dataset else "no",
eval_steps=config.get("eval_steps", 100),
save_strategy="steps",
load_best_model_at_end=True if eval_dataset else False,
metric_for_best_model="eval_loss" if eval_dataset else None,

Copilot AI Feb 11, 2026

save_strategy is hard-coded to "steps", but save_steps in this config is set to 0 (see earlier in the same SFTConfig(...) block). That combination typically fails argument validation (step interval must be > 0) and can prevent training from starting when the UI doesn't provide save_steps. Consider defaulting save_steps to a positive value when using step-based saving, or honoring a non-step save strategy when save_steps is unset.
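
For example, one consistent default along the lines already suggested (config and the surrounding SFTConfig block are placeholders):

from trl import SFTConfig

# Fall back to the eval interval so step-based saving always gets a
# positive interval and argument validation passes.
save_steps = config.get("save_steps") or config.get("eval_steps", 100)

training_args = SFTConfig(
    output_dir="./results",
    save_strategy="steps",
    save_steps=save_steps,
)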

@RETR0-OS changed the title [WIP] Fix finetuning methods → Fix finetuning methods Feb 15, 2026
@RETR0-OS marked this pull request as ready for review February 15, 2026 09:35
@RETR0-OS merged commit d9e6cda into main Feb 15, 2026