
ShanechiLab/qwen_finetune


Qwen Finetune

Hydra-managed pipeline for preparing LIRIS videos, fine-tuning Qwen/Qwen2.5-Omni-7B with LoRA, and evaluating the resulting checkpoint on a held-out test split.

The code in this folder is a restructured version of the original Finetune/finetune_faster.py and Finetune/finetune_inference.py scripts.

What It Does

  1. Prepares videos before every run.

    • Computes a deterministic transcode hash.
    • Uses /data1/alperozd/emo_label/videos_<hash>.
    • Skips transcoding when transcode.done already exists.
  2. Creates deterministic train/val/test metadata.

    • Split metadata is saved as dataset_split_metadata_<hash>.csv.
    • The hash inputs are saved next to it for inspection.
  3. Fine-tunes Qwen Omni with LoRA.

    • Applies LoRA to selected thinker language, visual merger, and audio projection modules.
    • Uses gradient checkpointing by default.
    • Uses manual distributed gradient averaging under torchrun.
  4. Uses a memory-saving supervised loss.

    • The loss is computed only on the assistant's answer token (for example, Q1), not on the rest of the sequence.
    • The code avoids full [sequence, vocab] logits.
    • It runs the Qwen thinker to hidden states, then projects only active answer positions through lm_head.
  5. Runs test inference by candidate loss.

    • Scores Q1, Q2, Q3, and Q4.
    • Picks the label with the lowest loss.
    • Reports accuracy and a confusion matrix.
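
The candidate-loss inference in step 5 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: the helper names (cross_entropy, candidate_loss, pick_label, evaluate) are hypothetical, each candidate is treated as a single token in a toy four-token vocabulary, and the logits are made-up numbers standing in for the model's output at the answer position.

```python
import math

CANDIDATES = ["Q1", "Q2", "Q3", "Q4"]


def cross_entropy(logits, target_index):
    """Negative log-likelihood of target_index under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target_index]


def candidate_loss(position_logits, token_ids):
    """Mean cross-entropy over the answer positions of one candidate."""
    per_token = [cross_entropy(l, t) for l, t in zip(position_logits, token_ids)]
    return sum(per_token) / len(per_token)


def pick_label(losses):
    """Return the candidate label with the lowest loss."""
    return min(losses, key=losses.get)


def evaluate(predictions, targets):
    """Return (accuracy, confusion matrix indexed as matrix[target][prediction])."""
    matrix = {t: {p: 0 for p in CANDIDATES} for t in CANDIDATES}
    for pred, target in zip(predictions, targets):
        matrix[target][pred] += 1
    correct = sum(matrix[label][label] for label in CANDIDATES)
    return correct / len(targets), matrix


# Toy example: token id i stands for candidate CANDIDATES[i] (an assumption).
answer_logits = [[0.1, 2.0, -1.0, 0.3]]
losses = {
    label: candidate_loss(answer_logits, [token_id])
    for token_id, label in enumerate(CANDIDATES)
}
print(pick_label(losses))  # Q2
```

Because only the answer positions contribute to the loss, scoring four candidates needs logits at those positions only, which is consistent with the memory-saving loss described in step 4.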

Folder Layout

configs/
  config.yaml                 # top-level defaults and experiment name
  data/liris.yaml             # CSV paths, raw/prepared video paths, split ratios
  mode/train.yaml             # training hyperparameters and checkpoint paths
  mode/inference.yaml         # inference checkpoint/split/output settings
  model/qwen_omni_lora.yaml   # model, dtype, attention, LoRA settings
  runtime/*.yaml              # GPU/runtime/transcode settings

qwen_finetune/
  runner.py                   # prepares videos/splits, dispatches train/inference
  preprocessing/              # ffmpeg transcoding and transcode.done logic
  data/                       # sample loading, dataloaders, split metadata
  model/                      # Qwen prompt, input processing, LoRA, custom loss
  engine/                     # train loop, validation, inference, distributed utils
  utils/                      # hashing, CUDA flags, runtime logging, seed helpers
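
The split metadata handled under data/ is described as deterministic. One common way to get that property is a seeded shuffle over sorted IDs, sketched below. The function name split_ids, the default ratios, and the seed are assumptions for illustration; the real values come from configs/data/liris.yaml and the repository code.

```python
import random


def split_ids(video_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split IDs into train/val/test, independent of input order."""
    ids = sorted(video_ids)            # canonical order before shuffling
    random.Random(seed).shuffle(ids)   # private RNG, global state untouched
    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test split
    }


splits = split_ids([f"video_{i:03d}" for i in range(10)])
print({name: len(members) for name, members in splits.items()})
```

Sorting first means the same set of IDs always yields the same split for a given seed, even if the CSV rows are reordered.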

Important Configs

The default experiment is:

experiment:
  name: qwen_finetune
  root: /data1/alperozd/emo_label
  dir: ${experiment.root}/${experiment.name}

LoRA checkpoints are saved to:

/data1/alperozd/emo_label/qwen_finetune/qwen_lora_checkpoint_best
/data1/alperozd/emo_label/qwen_finetune/qwen_lora_checkpoint_last

Inference reads the best checkpoint by default:

inference:
  checkpoint_dir: ${experiment.dir}/qwen_lora_checkpoint_best

The hash policies are explicit in:

qwen_finetune/utils/hashing.py

Edit TRANSCODE_HASH_FIELDS or SPLIT_HASH_FIELDS there if the definition of dataset identity changes.
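
As a rough sketch of how such a field-based hash policy can work: only the listed fields feed the hash, so changing an unrelated setting does not invalidate prepared videos. The field names, the SHA-256 choice, the hash length, and the helper names below are assumptions for illustration; the actual definitions are in qwen_finetune/utils/hashing.py.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical field lists; the real ones live in qwen_finetune/utils/hashing.py.
TRANSCODE_HASH_FIELDS = ("fps", "width", "codec")
SPLIT_HASH_FIELDS = ("seed", "train_ratio", "val_ratio", "test_ratio")


def config_hash(config, fields, length=10):
    """Hash only the selected fields, serialized in a canonical order."""
    selected = {name: config[name] for name in fields}
    payload = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:length]


def prepared_video_dir(root, config):
    """Build the videos_<hash> folder path for a transcode configuration."""
    return Path(root) / f"videos_{config_hash(config, TRANSCODE_HASH_FIELDS)}"


def needs_transcode(video_dir):
    """Transcoding is skipped once the transcode.done marker exists."""
    return not (Path(video_dir) / "transcode.done").exists()


config = {"fps": 2, "width": 640, "codec": "h264", "workers": 4}
# "workers" is not a hash field, so changing it keeps the same folder.
assert prepared_video_dir("/tmp", config) == prepared_video_dir(
    "/tmp", dict(config, workers=1)
)
```

Adding a field to one of the tuples changes every hash derived from it, so existing prepared folders and split files would no longer match.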

Setup

Run commands from this folder:

cd /home/alperozd/emo_label/qwen_finetune

Install the extra config dependency if needed:

pip install -r requirements.txt

Use the project environment's Python interpreter:

/home/alperozd/miniconda3/envs/emo_label/bin/python

Common Runs

Dry-run transcode on two videos:

/home/alperozd/miniconda3/envs/emo_label/bin/python main.py \
  data.prepared_video_root=/tmp/qwen_finetune_smoke \
  transcode.limit=2 \
  transcode.dry_run=true \
  transcode.workers=1

Train with defaults:

/home/alperozd/miniconda3/envs/emo_label/bin/python main.py mode=train

Train on seven GPUs:

/home/alperozd/miniconda3/envs/emo_label/bin/python -m torch.distributed.run \
  --nproc_per_node=7 \
  main.py mode=train

Run test inference:

/home/alperozd/miniconda3/envs/emo_label/bin/python main.py mode=inference

Run test inference on seven GPUs:

/home/alperozd/miniconda3/envs/emo_label/bin/python -m torch.distributed.run \
  --nproc_per_node=7 \
  main.py mode=inference
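
When launched through torch.distributed.run, each process receives RANK, WORLD_SIZE, and LOCAL_RANK as environment variables. The sketch below shows one way test samples could be divided across seven processes. The helper names and the round-robin sharding scheme are assumptions; the repository may shard differently.

```python
import os


def distributed_context(env=None):
    """Read (rank, world_size, local_rank); defaults cover single-process runs."""
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    local_rank = int(env.get("LOCAL_RANK", 0))
    return rank, world_size, local_rank


def shard_for_rank(samples, rank, world_size):
    """Round-robin shard: every sample goes to exactly one rank."""
    return samples[rank::world_size]


samples = list(range(20))
for rank in range(7):
    print(rank, shard_for_rank(samples, rank, 7))
```

With 20 samples and seven ranks the shards are uneven (ranks 0 to 5 get three samples, rank 6 gets two), so per-rank results need to be gathered before accuracy is computed.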

Use an already-transcoded video folder:

/home/alperozd/miniconda3/envs/emo_label/bin/python main.py \
  mode=train \
  transcode.enabled=false \
  data.video_folder=/home/alperozd/DATA/LIRIS/data_normalized_2fps_640

Outputs

Hydra creates run folders under:

outputs/YYYY-MM-DD/HH-MM-SS/

The code tees regular printed status messages into main.log. Live tqdm progress bars stay mostly in the terminal, which keeps carriage-return updates out of the log.
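
A minimal sketch of this kind of tee, assuming only stdout is duplicated: tqdm writes to stderr by default, so its progress bars would bypass the log file in this setup. The names Tee and run_with_log are hypothetical and not taken from the repository.

```python
import sys
from contextlib import redirect_stdout


class Tee:
    """File-like object that forwards writes to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


def run_with_log(log_path, func):
    """Call func() while print output goes to both the terminal and log_path."""
    with open(log_path, "a", encoding="utf-8") as log_file:
        with redirect_stdout(Tee(sys.stdout, log_file)):
            return func()
```

Calling run_with_log("main.log", train) with some train function would leave printed messages in main.log while stderr output stays in the terminal only.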

Generated artifacts such as outputs/, splits/, checkpoints, and .safetensors files are ignored by .gitignore.
