
Fix finetuning methods #56

Merged
RETR0-OS merged 10 commits into main from fix-finetuning-methods
Feb 15, 2026

Conversation

@RETR0-OS
Member

Fixes the DPO, PPO, LoRA, and QLoRA strategies.

The SFT strategy was using a raw transformers.Trainer with manual
tokenization and data collation, duplicating what SFTTrainer does
automatically. Replace it with TRL's SFTTrainer, which handles
tokenization, label creation, and data collation natively from
a text field. Also use SFTConfig instead of TrainingArguments,
and pass peft_config so that SFTTrainer manages the LoRA wrapping.
Use the processing_class parameter instead of the deprecated tokenizer argument.
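
For illustration, here is a minimal sketch of the new wiring, using placeholder names (model, tokenizer, train_dataset, config, get_peft_config) rather than the exact strategy code:

from trl import SFTTrainer, SFTConfig

# SFTConfig replaces TrainingArguments; SFTTrainer tokenizes and collates
# the named text field itself, so no manual Trainer plumbing is needed.
training_args = SFTConfig(
    output_dir=config.get("output_dir", "./results"),
    dataset_text_field="text",
    num_train_epochs=config.get("num_train_epochs", 3),
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # replaces the deprecated tokenizer= kwarg
    peft_config=get_peft_config(),  # SFTTrainer applies the LoRA wrapping
)
trainer.train()
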
The QLoRA strategy duplicated SFT's broken manual tokenization
approach. Replace it with TRL's SFTTrainer + SFTConfig just like SFT,
but with QLoRA-optimized defaults: a higher LoRA rank (64), bf16,
gradient checkpointing, and a paged optimizer. SFTTrainer handles
tokenization and data collation automatically.
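
Sketched with only the settings named above taken from this change (the rest are illustrative placeholders):

from peft import LoraConfig
from trl import SFTConfig

# Higher rank than the SFT default; the 4-bit base weights in QLoRA
# leave memory headroom for a larger adapter.
peft_config = LoraConfig(r=64, lora_alpha=16, task_type="CAUSAL_LM")

training_args = SFTConfig(
    output_dir="./results",
    bf16=True,                    # bf16 mixed precision
    gradient_checkpointing=True,  # trades compute for activation memory
    optim="paged_adamw_32bit",    # paged optimizer smooths memory spikes
)
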
In the DPO strategy, replace the deprecated tokenizer parameter with
processing_class and the deprecated evaluation_strategy with
eval_strategy. Add a peft_config parameter so DPOTrainer handles the
LoRA wrapping instead of manual get_peft_model calls, and extract a
get_peft_config method for consistency with the other strategies.
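
Roughly, with get_peft_config as the extracted helper and the other names as placeholders:

from trl import DPOTrainer, DPOConfig

training_args = DPOConfig(
    output_dir="./results",
    eval_strategy="steps",  # renamed from the deprecated evaluation_strategy
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # replaces the deprecated tokenizer= kwarg
    peft_config=get_peft_config(),  # no manual get_peft_model() call
)
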
The RLHF strategy used the old PPOTrainer API (PPOConfig with
model_name, PPOTrainer with a config= parameter), which no longer
exists in TRL 0.16+. PPO also requires a separate reward model,
which users of a no-code tool won't have.

Replace it with DPO (Direct Preference Optimization), which achieves
comparable alignment results without a reward model, uses the
same dataset format (prompt/chosen/rejected), and fits the modern TRL
API with the processing_class and peft_config parameters.
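
For reference, illustrative rows in that format (the actual contents are user-supplied):

from datasets import Dataset

# Each example pairs one prompt with a preferred and a dispreferred reply.
preference_data = Dataset.from_list([
    {
        "prompt": "Summarize: The cat sat on the mat.",
        "chosen": "A cat sat on a mat.",
        "rejected": "Cats are popular pets.",
    },
])
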
When using Unsloth, provider.prepare_for_training() applies LoRA
via FastLanguageModel.get_peft_model(). The strategies also pass
peft_config to SFTTrainer/DPOTrainer, which would wrap the model a
second time. Add a _peft_already_applied flag to the config when
Unsloth prepares the model, and skip peft_config in create_trainer
when that flag is set.
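
A sketch of the guard, again with placeholder names for the surrounding strategy code:

def create_trainer(self, model, tokenizer, train_dataset, config):
    peft_config = None
    if not config.get("_peft_already_applied"):
        # Unsloth's FastLanguageModel.get_peft_model() has not run,
        # so let the TRL trainer apply LoRA itself.
        peft_config = self.get_peft_config()
    return SFTTrainer(
        model=model,
        train_dataset=train_dataset,
        processing_class=tokenizer,
        peft_config=peft_config,  # None skips the second LoRA wrap
    )
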
ModelForge runs entirely locally, so asking users to "upload" a
dataset makes no sense; the file already exists on their system.
Replace the /load_settings POST endpoint (which accepted a multipart
file upload) with /validate_dataset_path, which accepts a local
file path and validates that it exists and is JSON/JSONL. Remove the
now-unused imports of UploadFile, File, Form, uuid, and FileManager.
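
Condensed, the replacement handler looks roughly like this (the full version appears in the review excerpts below):

import os

from fastapi import APIRouter, HTTPException

router = APIRouter()

@router.post("/validate_dataset_path")
async def validate_dataset_path(data: dict):
    dataset_path = os.path.abspath(data.get("dataset_path", "").strip())
    if not os.path.isfile(dataset_path):
        raise HTTPException(status_code=400, detail=f"File not found: {dataset_path}")
    if not dataset_path.endswith((".json", ".jsonl")):
        raise HTTPException(status_code=400, detail="Only .json and .jsonl files are supported.")
    return {"success": True, "file_path": dataset_path}
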
Since ModelForge runs locally, replace the "Upload Dataset" file
picker with a text input where users enter the path to their local
dataset file. The frontend now calls validateDatasetPath instead
of uploadDataset, sending the path as JSON to the new backend
endpoint. This avoids unnecessary file copying.
Copilot AI review requested due to automatic review settings February 11, 2026 05:31
@RETR0-OS changed the title [WIPFix finetuning methods → [WIP] Fix finetuning methods Feb 11, 2026
@RETR0-OS marked this pull request as draft February 11, 2026 05:31
Contributor

Copilot AI left a comment

Pull request overview

This PR updates ModelForge’s fine-tuning pipeline to use TRL’s newer trainer/config APIs (SFT/DPO) and adjusts the UI + API flow to validate local dataset paths instead of uploading datasets.

Changes:

  • Refactors SFT/QLoRA/DPO/RLHF strategies to rely on TRL trainers with SFTConfig/DPOConfig and pass LoRA via peft_config (avoiding double-wrapping with Unsloth).
  • Replaces dataset upload with a local-path validation endpoint and updates the frontend to submit dataset paths.
  • Packages the built frontend assets and updates backend CORS configuration.

Reviewed changes

Copilot reviewed 18 out of 23 changed files in this pull request and generated 5 comments.

Summary per file:

  • pyproject.toml: Includes packaged frontend build assets in the Python distribution.
  • ModelForge/strategies/sft_strategy.py: Migrates SFT to TRL SFTTrainer/SFTConfig and PEFT config handoff.
  • ModelForge/strategies/rlhf_strategy.py: Reworks the RLHF flow to use DPO-style preference training via DPOTrainer.
  • ModelForge/strategies/qlora_strategy.py: Migrates QLoRA to TRL SFTTrainer/SFTConfig and PEFT config handoff.
  • ModelForge/strategies/dpo_strategy.py: Updates DPO to pass LoRA via peft_config and uses modern TRL trainer args.
  • ModelForge/services/training_service.py: Marks _peft_already_applied when Unsloth has already wrapped the model.
  • ModelForge/routers/finetuning_router.py: Adds /validate_dataset_path and removes the dataset-upload flow.
  • ModelForge/app_old.py: Removes the legacy FastAPI app entrypoint.
  • ModelForge/app.py: Makes CORS origins configurable via the CORS_ORIGINS env var.
  • ModelForge/Frontend/build/static/js/main.9df15fbf.js.LICENSE.txt: Updates packaged frontend build artifact.
  • ModelForge/Frontend/build/index.html: Updates packaged frontend build artifact entrypoint hashes.
  • ModelForge/Frontend/build/asset-manifest.json: Updates packaged frontend build artifact hashes/manifest.
  • Frontend/src/services/api.js: Replaces the dataset upload API call with a dataset path validation call.
  • Frontend/src/pages/TechnicalDetailsPage.jsx: Removes an unused router import.
  • Frontend/src/pages/Loading.jsx: Removes an unused demo helper; minor request-handling cleanup.
  • Frontend/src/pages/ListModels.jsx: Removes unused router imports.
  • Frontend/src/pages/FinetuningSettingsPage.jsx: Switches the dataset selection UI from file upload to path input + validation call.
  • Frontend/src/components/Navbar.jsx: Removes an unused footer stub.
  • Frontend/src/App.js: Removes unused imports/page reference.
  • Frontend/package-lock.json: Dependency lockfile updates (including removal of MUI).
Files not reviewed (1)
  • Frontend/package-lock.json: Language not supported

Comment on lines +386 to +390
dataset_path = data.get("dataset_path", "").strip()
logger.info(f"Validating dataset path: {dataset_path}")

if not dataset_path:
    raise HTTPException(status_code=400, detail="Dataset path cannot be empty.")

Copilot AI Feb 11, 2026

dataset_path = data.get(...).strip() will throw an AttributeError (500) if dataset_path is not a string (e.g., null/number). Consider using a small Pydantic body model (or explicit type check/coercion) so invalid payloads return a 400 with a clear error instead of crashing.
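
A sketch of that suggestion, with DatasetPathRequest as a hypothetical model name and router assumed from the existing module:

from fastapi import HTTPException
from pydantic import BaseModel

class DatasetPathRequest(BaseModel):
    dataset_path: str

@router.post("/validate_dataset_path")
async def validate_dataset_path(body: DatasetPathRequest):
    # FastAPI now rejects null/number payloads with a 422 validation
    # error before the handler runs, so .strip() can no longer raise.
    dataset_path = body.dataset_path.strip()
    if not dataset_path:
        raise HTTPException(status_code=400, detail="Dataset path cannot be empty.")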

Comment on lines 392 to +421
      try:
-         # Validate file type
-         if not json_file.filename.endswith(('.json', '.jsonl')):
+         # Normalize the path
+         dataset_path = os.path.abspath(dataset_path)
+
+         # Check file exists
+         if not os.path.isfile(dataset_path):
              raise HTTPException(
                  status_code=400,
-                 detail="Invalid file type. Only JSON and JSONL files are allowed."
+                 detail=f"File not found: {dataset_path}"
              )

-         # Read file content
-         file_content = await json_file.read()
-
-         # Generate unique filename
-         file_id = str(uuid.uuid4())
-         filename = f"{file_id}_{json_file.filename}"
+         # Validate file type
+         if not dataset_path.endswith(('.json', '.jsonl')):
+             raise HTTPException(
+                 status_code=400,
+                 detail="Invalid file type. Only .json and .jsonl files are supported."
+             )

-         # Save file
-         default_dirs = file_manager.return_default_dirs()
-         file_path = os.path.join(default_dirs["datasets"], filename)
-         file_manager.save_file(file_path, file_content)
+         # Get file size
+         file_size = os.path.getsize(dataset_path)
+         filename = os.path.basename(dataset_path)

-         logger.info(f"Dataset uploaded successfully: {file_path}")
+         logger.info(f"Dataset path validated: {dataset_path} ({file_size} bytes)")

          return {
              "success": True,
-             "file_path": file_path,
+             "file_path": dataset_path,
              "filename": filename,
-             "message": "Dataset uploaded successfully",
+             "file_size": file_size,
+             "message": "Dataset path is valid",

Copilot AI Feb 11, 2026

This endpoint accepts an arbitrary local path and returns its absolute path and existence/size. If the API is ever exposed beyond localhost (misconfig, container port mapping, etc.), this becomes an information disclosure primitive. Consider restricting validation to an allowed base directory (e.g., a configured datasets folder) and avoid echoing absolute paths back to the client.
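
One way to sketch that restriction; DATASETS_DIR is an assumed setting, not something this PR adds:

import os

DATASETS_DIR = os.path.abspath("./datasets")  # assumed allowed base directory

def is_within_datasets_dir(path: str) -> bool:
    resolved = os.path.realpath(path)  # resolves symlinks and ".." segments
    try:
        return os.path.commonpath([resolved, DATASETS_DIR]) == DATASETS_DIR
    except ValueError:  # e.g. paths on different Windows drives
        return False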

type="text"
id="dataset_path"
name="dataset_path"
placeholder="C:\path\to\your\dataset.json"

Copilot AI Feb 11, 2026

The placeholder string uses Windows-style backslashes, but in JS string literals sequences like \t become escape sequences (tab, etc.), so the placeholder will render incorrectly. Escape the backslashes (e.g., C:\\path\\to\\...) or use a forward-slash example.

Suggested change
placeholder="C:\path\to\your\dataset.json"
placeholder="C:\\path\\to\\your\\dataset.json"

gradient_accumulation_steps=config.get("gradient_accumulation_steps", 4),
# QLoRA uses paged optimizers for memory efficiency
optim=config.get("optim", "paged_adamw_32bit"),
save_steps=config.get("save_steps", 0),

Copilot AI Feb 11, 2026

This sets save_steps to 0 by default, but save_strategy is configured as "steps" below; that combination can raise a ValueError and prevent training when save_steps is omitted. Consider defaulting save_steps to a positive value or deriving save behavior from the incoming config consistently.

Suggested change
save_steps=config.get("save_steps", 0),
save_steps=config.get("save_steps", config.get("eval_steps", 100)),

Comment on lines 195 to 199
eval_strategy="steps" if eval_dataset else "no",
eval_steps=config.get("eval_steps", 100),
save_strategy="steps",
load_best_model_at_end=True if eval_dataset else False,
metric_for_best_model="eval_loss" if eval_dataset else None,

Copilot AI Feb 11, 2026

save_strategy is hard-coded to "steps", but save_steps in this config is set to 0 (see earlier in the same SFTConfig(...) block). That combination typically fails argument validation (step interval must be > 0) and can prevent training from starting when the UI doesn't provide save_steps. Consider defaulting save_steps to a positive value when using step-based saving, or honoring a non-step save strategy when save_steps is unset.
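
For example, one consistent default along the lines already suggested (config and the surrounding SFTConfig block are placeholders):

from trl import SFTConfig

# Fall back to the eval interval so step-based saving always gets a
# positive interval and argument validation passes.
save_steps = config.get("save_steps") or config.get("eval_steps", 100)

training_args = SFTConfig(
    output_dir="./results",
    save_strategy="steps",
    save_steps=save_steps,
)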

@RETR0-OS changed the title [WIP] Fix finetuning methods → Fix finetuning methods Feb 15, 2026
@RETR0-OS marked this pull request as ready for review February 15, 2026 09:35
@RETR0-OS merged commit d9e6cda into main Feb 15, 2026