This repository is an experiment-first capstone project for aligning a small open model through staged training and evaluation.
It combines two related tracks:
- Rewrite quality alignment
  - Start from base model behavior.
  - Train a LoRA adapter with supervised fine-tuning (SFT) on rewrite pairs.
  - Improve behavior further with preference optimization (DPO).
- Tool-use benchmarking
  - Build a synthetic, single-turn tool-calling dataset.
  - Evaluate model outputs with strict JSON/tool/argument matching metrics.
  - Adapt raw generation logs into evaluator-compatible format.
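The adaptation step in the second track amounts to pulling a structured tool call out of free-form model output. A minimal sketch of that idea (the function name is hypothetical; the actual adapter script may work differently):

```python
import json
import re

def extract_tool_call(raw_text: str):
    """Pull the first JSON object out of free-form model output.

    Returns the parsed dict, or None if no valid JSON object is found.
    (Hypothetical helper; the real adapter may be more robust.)
    """
    # Grab the widest {...} span and attempt to parse it.
    match = re.search(r"\{.*\}", raw_text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

raw = 'Sure! Here is the call: {"tool": "weather", "args": {"city": "Paris"}}'
print(extract_tool_call(raw))  # {'tool': 'weather', 'args': {'city': 'Paris'}}
```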
In short, this repo is meant to answer:
- How much can small-model behavior improve with simple SFT plus DPO?
- How do we measure tool-call correctness reliably and reproducibly?
- `1.test_inference.py`, `2.test_inference.py` - Early sanity checks for base-model text generation and chat-template behavior.
- `3.check_dataset.py` - Verifies rewrite train/eval JSONL files load correctly.
- `4.sft_lora.py` - Trains a LoRA adapter with TRL `SFTTrainer` on rewrite prompt-response pairs. Saves training checkpoints and the final adapter.
- `5.compare_before_after.py` - Compares base model outputs vs. SFT adapter outputs on eval prompts.
- `6.sample_and_score.py` - Samples multiple candidate rewrites and ranks them via a simple heuristic reward.
- `7.check_prefs.py` - Verifies preference datasets (`prefs.jsonl`, `prefs_large.jsonl`).
- `8.compare_pref_behavior.py` - Compares model generations against chosen/rejected preference examples.
- `9.dpo_lora.py` - Runs LoRA-based DPO training using preference pairs. Includes prompt-prefix consistency checks before training.
- `10.dpo_full_smoke_test.py` - Short full-parameter DPO smoke test to validate setup and catch OOM/config issues early.
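The prompt-prefix consistency check before DPO training can be sketched as a small validation pass over the preference pairs. This is a hypothetical illustration of the idea (the function name and exact rules are assumptions; `9.dpo_lora.py` may implement the check differently):

```python
def check_prompt_prefix(pairs):
    """Validate DPO preference pairs before training.

    Each pair should carry a non-empty prompt. If chosen/rejected were
    stored as full texts, both should start with the prompt so the
    trainer compares completions from a shared prefix.
    (Hypothetical check, not the repo's actual implementation.)
    """
    problems = []
    for i, pair in enumerate(pairs):
        prompt = pair.get("prompt", "")
        if not prompt:
            problems.append((i, "empty prompt"))
            continue
        for key in ("chosen", "rejected"):
            text = pair.get(key, "")
            if text.startswith(prompt):
                continue  # full-text style: prefix is consistent
            if prompt in text:
                # Prompt appears mid-text: likely a malformed pair.
                problems.append((i, f"{key} embeds prompt mid-text"))
    return problems

good = [{"prompt": "Rewrite: thx", "chosen": "Rewrite: thx -> Thank you.",
         "rejected": "Rewrite: thx -> thx!!"}]
bad = [{"prompt": "", "chosen": "x", "rejected": "y"}]
print(check_prompt_prefix(good))  # []
print(check_prompt_prefix(bad))   # [(0, 'empty prompt')]
```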
- `generate_tool_use_dataset.py` - Generates balanced tool-use examples for 5 tools: calculator, weather, time, search, reminder. Writes split files and docs under `data/tool_use_dataset_v1/`.
- `evaluate_tool_use.py` - Evaluates predictions against gold tool calls. Reports strict metrics such as:
  - valid JSON rate
  - tool exact match accuracy
  - argument exact match accuracy
  - strict success rate
  - per-tool breakdown and error buckets
- `adapt_predictions.py` - Converts diverse raw generation formats into evaluator-ready JSONL.
- `baseline_tool_use.ipynb`, `baseline_tool_use_v2.ipynb` - Interactive experimentation for tool-use baseline generation/evaluation.
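The strict-matching metrics above compose naturally: an invalid-JSON prediction automatically fails every stricter metric. A minimal sketch in that spirit (field names `tool`/`args` and the function name are assumptions; see `evaluate_tool_use.py` for the real metric definitions and error buckets):

```python
import json

def score_predictions(preds, golds):
    """Strict tool-call metrics (sketch, not the repo's actual evaluator).

    preds: raw prediction strings; golds: dicts with "tool" and "args".
    """
    n = len(golds)
    valid_json = tool_match = arg_match = strict = 0
    for raw, gold in zip(preds, golds):
        try:
            call = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON fails every stricter metric too
        valid_json += 1
        if call.get("tool") == gold["tool"]:
            tool_match += 1
        if call.get("args") == gold["args"]:
            arg_match += 1
        if call.get("tool") == gold["tool"] and call.get("args") == gold["args"]:
            strict += 1
    return {
        "valid_json_rate": valid_json / n,
        "tool_exact_match": tool_match / n,
        "arg_exact_match": arg_match / n,
        "strict_success": strict / n,
    }

golds = [
    {"tool": "calculator", "args": {"expression": "2+2"}},
    {"tool": "weather", "args": {"city": "Oslo"}},
]
preds = [
    '{"tool": "calculator", "args": {"expression": "2+2"}}',
    'not json',
]
print(score_predictions(preds, golds))
# {'valid_json_rate': 0.5, 'tool_exact_match': 0.5, 'arg_exact_match': 0.5, 'strict_success': 0.5}
```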
- `data/train.jsonl`, `data/eval.jsonl` - Supervised rewrite dataset used by SFT.
- `data/prefs.jsonl`, `data/prefs_large.jsonl` - Preference pairs (`prompt`, `chosen`, `rejected`) used by DPO and behavior checks.
- `data/tool_use_dataset_v1/` - Tool-use benchmark package:
  - `raw/` generated corpus
  - `processed/` split files
  - `docs/` label + split guidance
  - `tools_schema.json`
- `outputs/sft-run/`, `outputs/sft-final/` - SFT checkpoints and final LoRA adapter artifacts.
- `outputs/dpo-lora-run/`, `outputs/dpo-lora-final/` - DPO LoRA checkpoints and final adapter artifacts.
- `outputs/dpo-full-smoke/` - Smoke-test outputs for the full-parameter DPO run.
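The preference files are JSONL with one `prompt`/`chosen`/`rejected` record per line. A minimal loader/validator in the spirit of `7.check_prefs.py` (a sketch; the real script's checks may be stricter):

```python
import io
import json

REQUIRED_KEYS = {"prompt", "chosen", "rejected"}

def load_prefs(fp):
    """Load and validate a preference JSONL stream.

    (Hypothetical helper illustrating the expected record shape.)
    """
    pairs = []
    for lineno, line in enumerate(fp, start=1):
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)  # raises on malformed JSON
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")
        if record["chosen"] == record["rejected"]:
            raise ValueError(f"line {lineno}: chosen == rejected")
        pairs.append(record)
    return pairs

sample = io.StringIO(
    '{"prompt": "Rewrite: thx", "chosen": "Thank you.", "rejected": "thx!!"}\n'
)
print(len(load_prefs(sample)))  # 1
```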
- Run base sanity checks (`1`, `2`, `3`).
- Train the SFT adapter (`4`).
- Compare base vs. SFT (`5`) and inspect simple reward ranking (`6`).
- Validate preference data (`7`) and inspect preference behavior (`8`).
- Run DPO LoRA training (`9`).
- Optionally run the full DPO smoke test (`10`).
- For tool-use experiments: generate the dataset, adapt predictions, then evaluate.
The scripts assume a GPU-enabled Python environment with the packages commonly used here: `torch`, `transformers`, `datasets`, `peft`, `trl`.

Several scripts use `device_map="auto"` and fp16 settings, so CUDA availability is expected for practical runtime.
- This repo is script-centric and intentionally iterative; the numbered files reflect the progression of the capstone work.
- `outputs/` can become large quickly because checkpoints and tokenizer/model artifacts are stored there.
- The `.gitignore` is a general Python template and may need extension if you want to exclude large training artifacts from version control.