
feat(qwen_image): optimize VAE memory usage via dynamic offloading#668

Open
avan06 wants to merge 1 commit into ostris:main from avan06:qwen_image_dynamic-vae-offloading

Conversation

@avan06 avan06 commented Jan 24, 2026

This commit introduces memory optimization to the `encode_images` method in `qwen_image.py`.

Changes:

  • Dynamic Device Management: The VAE is now moved to the target device only for the duration of encoding, and is automatically offloaded back to the CPU if `low_vram` is enabled or if the VAE was originally on the CPU.
  • Memory Optimization: Added `torch.cuda.empty_cache()` after offloading so the GPU memory is released immediately rather than held in PyTorch's caching allocator.
  • Resource Efficiency: Wrapped the encoding process in `torch.no_grad()` to prevent unnecessary gradient computation and reduce the memory footprint.

The VAE offloading mechanism is currently enabled only when low_vram is active or the VAE is located on the CPU. This restriction minimizes the impact of the changes. These modifications have been tested and verified to effectively reduce VRAM usage during training.
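The pattern described above can be sketched as follows. This is a minimal illustration, not the actual ai-toolkit code: the function signature, the `vae` callable, and the `low_vram` flag handling are assumptions based on the PR description.

```python
import torch


def encode_images(vae, images, device, low_vram=False):
    """Hypothetical sketch of dynamic VAE offloading.

    Moves the VAE to the compute device only for the duration of
    encoding, then offloads it back to the CPU when low_vram is set
    or when the VAE originally lived on the CPU.
    """
    original_device = next(vae.parameters()).device
    # Restrict offloading to the cases named in the PR, so behavior
    # is unchanged for users who keep the VAE resident on the GPU.
    should_offload = low_vram or original_device.type == "cpu"

    vae.to(device)
    with torch.no_grad():  # latent caching needs no gradients
        latents = vae(images.to(device))

    if should_offload:
        vae.to("cpu")
        if torch.cuda.is_available():
            # Return freed blocks to the driver immediately instead of
            # leaving them in PyTorch's caching allocator.
            torch.cuda.empty_cache()
    return latents
```

On a CUDA machine the VAE weights occupy the GPU only during the `vae(...)` call; on a CPU-only machine `empty_cache()` is skipped and the function degrades to a plain no-grad forward pass.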

@avan06 avan06 force-pushed the qwen_image_dynamic-vae-offloading branch from 85732be to 60e5fa9 Compare February 7, 2026 05:34
@avan06 avan06 force-pushed the qwen_image_dynamic-vae-offloading branch from 60e5fa9 to c32220f Compare February 27, 2026 02:15
