In this work, we apply invertible affine transformations before post-training quantization to reduce outliers and quantization error in diffusion models. The approach is inspired by FlatQuant, which applies such affine transformations to language models; here we optimize the idea specifically for transformer-based diffusion models.
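As a rough illustration of the idea (not the code in this repository), the sketch below shows how an invertible transform T can be folded into a linear layer so that the transformed activations and weights are what get quantized, while the full-precision output stays mathematically unchanged. All names here (`fake_quant`, `transformed_linear`) are hypothetical.

```python
import torch

def fake_quant(x: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    # Symmetric per-tensor round-to-nearest fake quantization.
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale

def transformed_linear(x, weight, T, n_bits=8):
    # Mathematically equivalent to x @ weight.T, since
    #   (x T) (W T^{-T})^T = x T T^{-1} W^T = x W^T,
    # but quantization is applied to the transformed (flatter) tensors.
    T_inv = torch.linalg.inv(T)
    x_q = fake_quant(x @ T, n_bits)             # transformed activations
    w_q = fake_quant(weight @ T_inv.T, n_bits)  # transformed weights
    return x_q @ w_q.T

# Toy check with a random, well-conditioned invertible T.
x = torch.randn(4, 64)
w = torch.randn(128, 64)
T = torch.eye(64) + 0.1 * torch.randn(64, 64)
print((transformed_linear(x, w, T) - x @ w.T).abs().max())
```

In the actual pipeline, T is not random but optimized on calibration data so that the transformed tensors contain fewer outliers (presumably what the calibration-related options in the demo command below control).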
To run the demo (inference) for this project, use the following command:
```bash
python ./main.py \
--model ./modelzoo/pixart-sigma/PixArt-Sigma-XL-2-1024-MS \
--w_bits 8 --a_bits 8 \
--k_bits 8 --k_asym --k_groupsize 128 \
--v_bits 8 --v_asym --v_groupsize 128 \
--cali_dataset coco \
--nsamples 4 --cali_timesteps 10 \
--cali_bsz 4 --flat_lr 5e-3 \
--lwc --lac --cali_trans --add_diag \
--output_dir ./outputs --resume --reload_matrix \
--prompt "[YOUR PROMPT HERE]"w_bits, a_bits, k_bits, v_bits: Quantization levels for weights, activations, keys, and values.
--prompt: Sets the text prompt (default: "A beautiful world").
Recommended Settings: W8A8 or W6A6 (adjust K, V as needed).
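For example, a W6A6 run keeps the command above and only changes the bit-width flags (keeping K/V at 8 bits is one possible choice):

```bash
python ./main.py \
--model ./modelzoo/pixart-sigma/PixArt-Sigma-XL-2-1024-MS \
--w_bits 6 --a_bits 6 \
--k_bits 8 --k_asym --k_groupsize 128 \
--v_bits 8 --v_asym --v_groupsize 128 \
--cali_dataset coco \
--nsamples 4 --cali_timesteps 10 \
--cali_bsz 4 --flat_lr 5e-3 \
--lwc --lac --cali_trans --add_diag \
--output_dir ./outputs --resume --reload_matrix \
--prompt "[YOUR PROMPT HERE]"
```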
The generated image will be saved as ./demo_image.png.
