Random-Order AR Image Generation

Baseline Autoregressive Image Generation on CIFAR-10 with Raster Order

This repository contains the implementation of Random-Order Autoregressive (AR) Image Generation on CIFAR-10.

The project establishes a baseline using standard raster-scan order generation and serves as a foundation for experimenting with alternative token generation orders (e.g., random permutations) to improve model calibration and robustness.

Project Overview

The pipeline follows a two-stage approach:

VQ-VAE Tokenizer: Compresses $32 \times 32$ images into an $8 \times 8$ discrete latent grid.
Autoregressive Transformer: Models the distribution of discrete tokens to generate new images.
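
The quantization step at the heart of a VQ-VAE tokenizer can be illustrated in a few lines. What follows is a minimal pure-Python sketch, not the code in tokenizer_vq/: it assumes the encoder has already produced one feature vector per latent cell and maps each vector to the index of its nearest codebook entry. The function names, the toy 2-dimensional vectors, and the 3-entry codebook are illustrative assumptions; the real codebook size and embedding dimension are defined in the repository.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


# Toy example (hypothetical values): 3 codebook entries, 4 latent cells.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
features = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7), (0.4, 0.4)]
tokens = quantize(features, codebook)

# For CIFAR-10 as described above: a 32x32 image becomes an 8x8 grid,
# i.e. a 4x reduction per side and 64 tokens per image.
tokens_per_image = 8 * 8
```

Each image is therefore represented by 64 integer indices, which is the sequence the Transformer in stage two models.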

While the baseline uses a fixed raster scan order (row-by-row), this codebase is designed to support research into randomized generation orders.
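
To make the difference between the two orderings concrete, here is a small sketch that enumerates the 64 positions of an $8 \times 8$ grid in raster order and in a seeded random permutation. The helper names `raster_order` and `random_order` are hypothetical and are not taken from this codebase; how the repository actually represents generation orders may differ.

```python
import random
from typing import List, Tuple

Position = Tuple[int, int]


def raster_order(height: int, width: int) -> List[Position]:
    """Row-by-row order: (0, 0), (0, 1), ..., then the next row."""
    return [(r, c) for r in range(height) for c in range(width)]


def random_order(height: int, width: int, seed: int) -> List[Position]:
    """A seeded random permutation of the same grid positions."""
    order = raster_order(height, width)
    random.Random(seed).shuffle(order)
    return order


raster = raster_order(8, 8)
shuffled = random_order(8, 8, seed=0)
```

Both lists visit every grid position exactly once; only the sequence in which tokens would be predicted changes.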

Pretrained Weights

Pretrained weights for the VQ-VAE tokenizer and the baseline RandAR model (raster order) are available at:

Google Drive: https://drive.google.com/drive/folders/1B528vJu1Icn1PtIwJVfmd39WPNqIEEtg?usp=sharing

With these weights, the reported results can be reproduced without retraining the models.

Reproduction Guide

To reproduce the baseline experiment, please follow the steps below. The project uses uv for fast and reliable dependency management.

1. Prerequisites

Python: Version 3.12 or higher.
GPU: An NVIDIA GPU is recommended for training (tested on NVIDIA RTX GPUs).
uv: Ensure uv is installed on your system.

# Install uv (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install uv (Windows PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

2. Environment Setup

All dependencies are managed via pyproject.toml.

Step 2.1: Create environment and install dependencies

Run the following command in the project root. It creates a virtual environment (.venv) and installs PyTorch, Transformers, and the other required libraries.

uv sync

Experimental Workflow

The experiment consists of five sequential steps. Run each script/notebook in the order listed below.

Step 1: Data Preparation

Download and prepare the CIFAR-10 dataset.

data/load_CIFAR10.py

Step 2: Train VQ-VAE Tokenizer

Train the vector-quantized autoencoder to learn the discrete codebook.

Input: Raw images from data/
Output: Trained weights (tokenizer_vq/vqvae_cifar10.pth)

Open the Jupyter notebook and run all cells:

uv run jupyter notebook tokenizer_vq/vq-vae.ipynb

Step 3: Extract Latent Codes

Encode the entire CIFAR-10 dataset into discrete token sequences using the trained VQ-VAE.

tools/extract_latent_codes.py

Step 4: Train Autoregressive Model

Train the decoder-only Transformer on the extracted token sequences.

train_c2i.py
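
Training a decoder-only model on token sequences typically uses next-token prediction with teacher forcing. The sketch below assumes a class-conditional setup (suggested by the script name train_c2i.py, but not confirmed here) in which a class token is prepended to the sequence. The helper names, the class token id, and the toy probabilities are hypothetical and do not reflect the actual training code.

```python
import math
from typing import List, Sequence, Tuple


def teacher_forcing_pairs(
    class_token: int, tokens: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Build (inputs, targets) so that input position i predicts tokens[i]."""
    inputs = [class_token] + list(tokens[:-1])
    targets = list(tokens)
    return inputs, targets


def mean_nll(predicted: Sequence[Sequence[float]], targets: Sequence[int]) -> float:
    """Average negative log-likelihood of the targets under the predictions."""
    return -sum(math.log(p[t]) for p, t in zip(predicted, targets)) / len(targets)


# Toy example: vocabulary of 3 codes, sequence of 3 tokens.
# The class token id 3 is a hypothetical convention, not the repository's.
inputs, targets = teacher_forcing_pairs(class_token=3, tokens=[2, 0, 1])
uniform = [[1 / 3, 1 / 3, 1 / 3]] * len(targets)
loss = mean_nll(uniform, targets)  # equals log(3) for a uniform predictor
```

A model that has learned nothing scores the log of the vocabulary size, so a training loss below that value indicates the model is picking up structure in the token sequences.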

Step 5: Evaluation

Evaluate the trained AR model by generating samples and computing metrics.

eval_c2i.py
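
Sample generation with an autoregressive model follows a simple loop: predict a distribution over the next token, draw from it, and append the draw to the context. The following sketch shows that loop with a toy stand-in for the Transformer. It is an assumption-laden illustration, not the logic of eval_c2i.py: `sample_tokens` and `toy_model` are hypothetical names, and class conditioning, decoding tokens back to pixels with the VQ-VAE, and metric computation are all omitted.

```python
import random
from typing import Callable, List, Sequence

NextTokenFn = Callable[[List[int]], Sequence[float]]


def sample_tokens(next_token_probs: NextTokenFn, num_tokens: int, seed: int) -> List[int]:
    """Draw tokens one at a time, feeding each draw back as context."""
    rng = random.Random(seed)
    tokens: List[int] = []
    for _ in range(num_tokens):
        probs = next_token_probs(tokens)
        vocab = range(len(probs))
        tokens.append(rng.choices(vocab, weights=probs, k=1)[0])
    return tokens


def toy_model(context: List[int]) -> Sequence[float]:
    """Hypothetical stand-in for the Transformer over a 4-code vocabulary.

    Puts all probability mass on (last token + 1) mod 4, or on 0 at the start.
    """
    nxt = (context[-1] + 1) % 4 if context else 0
    return [1.0 if k == nxt else 0.0 for k in range(4)]


# 64 tokens correspond to one 8x8 latent grid.
sampled = sample_tokens(toy_model, num_tokens=64, seed=0)
```

In the real pipeline the 64 sampled indices would be arranged back into the $8 \times 8$ grid and passed through the VQ-VAE decoder to obtain an image.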

Original repo of RandAR

https://github.com/ziqipang/RandAR
