ComfyUI-PiD

ComfyUI custom nodes for using NVIDIA PiD as a pixel diffusion decoder.

PiD is not a normal ComfyUI VAE. It needs a latent, a prompt/caption, a sigma value, and optionally a native decoder baseline image:

LATENT + caption + sigma + optional baseline IMAGE -> PiD -> IMAGE

For the official latent-conditioned PiD checkpoints, this node can infer the baseline size from the latent and skip the extra VAE/baseline image path to reduce VRAM use.

Features

Direct PiD Decode node that returns a ComfyUI IMAGE.
Staged low-VRAM workflow: PiD Prepare → PiD Sample → PiD Finalize.
PiD Sample runs in a subprocess so CUDA memory is released after sampling.
PiD KSampler Capture for grabbing an intermediate latent and matching sigma.
Lazy setup: PiD source, checkpoints, and required assets are prepared on first run when auto_download=true.
Optional sequential block offload for lower VRAM at the cost of speed.

Install

Clone into ComfyUI/custom_nodes:

cd ComfyUI/custom_nodes
git clone https://github.com/Merserk/ComfyUI-PiD.git
cd ComfyUI-PiD
python -m pip install -r requirements.txt

Restart ComfyUI.

Requirements:

Python >=3.10
NVIDIA CUDA GPU
Working ComfyUI install
Enough VRAM for PiD, especially for 2kto4k or large output scales

requirements.txt does not install PyTorch because ComfyUI usually provides it.

Nodes

| Node | Purpose |
| --- | --- |
| PiD Decode | One-node PiD decode from latent to image. |
| PiD Text Prompt | One prompt box with `text` for CLIP and `caption` for PiD. |
| PiD KSampler Capture | KSampler-compatible sampler that returns the final latent, the captured PiD latent, and sigma. |
| PiD Prepare | Prepares latent, caption, checkpoint, assets, and metadata on CPU. |
| PiD Sample | Runs the heavy PiD sampling step in a subprocess. |
| PiD Finalize | Converts sampled PiD output back to a ComfyUI `IMAGE`. |
| PiD Decode (Staged) | Convenience wrapper around the staged path. |

Supported backbones

| Value | Backbone | Latent channels | Checkpoints |
| --- | --- | --- | --- |
| `zimage` | Z-Image / Flux-compatible | 16 | `2k`, `2kto4k` |
| `flux` | Flux | 16 | `2k`, `2kto4k` |
| `flux2` | Flux2 | 128 | `2k`, `2kto4k` |
| `sd3` | Stable Diffusion 3 | 16 | `2k`, `2kto4k` |
| `dinov2` | DINOv2 RAE | 768 | `2k` |
| `siglip` | SigLIP Scale-RAE | 1152 | `2k` |

scale=0 uses NVIDIA's default scale for the selected checkpoint: usually 4x, or 8x for SigLIP Scale-RAE.
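
The backbone table and the scale=0 rule can be read as a simple lookup. The following is a minimal sketch, not the node's actual code: `BACKBONES`, `check_combo`, and `resolve_scale` are hypothetical names, and the defaults of 4 and 8 come from the "usually" wording above, so a specific checkpoint may differ.

```python
# Hypothetical lookup mirroring the table above; not the node's real data structure.
BACKBONES = {
    "zimage": {"channels": 16, "checkpoints": ("2k", "2kto4k")},
    "flux": {"channels": 16, "checkpoints": ("2k", "2kto4k")},
    "flux2": {"channels": 128, "checkpoints": ("2k", "2kto4k")},
    "sd3": {"channels": 16, "checkpoints": ("2k", "2kto4k")},
    "dinov2": {"channels": 768, "checkpoints": ("2k",)},
    "siglip": {"channels": 1152, "checkpoints": ("2k",)},
}


def check_combo(backbone, ckpt_type, latent_channels):
    """Raise ValueError if the combination does not appear in the table."""
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone!r}")
    spec = BACKBONES[backbone]
    if ckpt_type not in spec["checkpoints"]:
        raise ValueError(f"{backbone} has no {ckpt_type!r} checkpoint")
    if latent_channels != spec["channels"]:
        raise ValueError(
            f"{backbone} expects {spec['channels']} latent channels, "
            f"got {latent_channels}"
        )


def resolve_scale(backbone, scale):
    """Return the effective scale; 0 means 'use the default' (assumed 4x, 8x for siglip)."""
    if scale != 0:
        return scale
    return 8 if backbone == "siglip" else 4
```

For example, `resolve_scale("zimage", 0)` gives 4 under these assumptions, and `check_combo("dinov2", "2kto4k", 768)` raises because the table lists only a `2k` checkpoint for DINOv2.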

Basic workflow

For Z-Image / Flux-style workflows:

PiD Text Prompt text    -> CLIP Text Encode
PiD Text Prompt caption -> PiD Decode caption
KSampler latent         -> PiD Decode latent
PiD Decode image        -> Save Image

Recommended first test settings:

backbone = zimage
pid_ckpt_type = 2k
pid_steps = 4
scale = 1 or 2
cfg_scale = 1.0
sigma = 0.0
auto_download = true
unload_comfy_before_pid = true
aggressive_cleanup = true
sequential_offload = disabled

For official latent-conditioned checkpoints, leave vae and baseline_image disconnected unless you specifically need an external baseline size.

Lowest-VRAM staged workflow

Use the staged nodes when VRAM is tight:

PiD KSampler Capture pid_latent -> PiD Prepare latent
PiD Text Prompt caption         -> PiD Prepare caption
PiD Prepare                     -> PiD Sample
PiD Sample                      -> PiD Finalize
PiD Finalize image              -> Save Image

Recommended Z-Image capture settings:

steps = 50
sampler_name = euler
scheduler = beta
capture_step = 46

PiD Sample runs in a separate Python process, so its CUDA context is destroyed after the sample is finished.
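
The general pattern behind this can be sketched with the standard library. This is an illustration of process isolation only, not the contents of pid_subprocess_runner.py: `WORKER_CODE` and `run_in_subprocess` are hypothetical, and the worker does plain arithmetic instead of GPU work so the sketch runs anywhere.

```python
import subprocess
import sys

# Stand-in for the heavy sampling step. A real worker would load the model and
# run on the GPU; this one only doubles a number.
WORKER_CODE = "import sys; print(int(sys.argv[1]) * 2)"


def run_in_subprocess(value):
    """Run the worker in a fresh Python process and return its printed result."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_CODE, str(value)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the worker exits non-zero
    )
    # Once run() returns, the child process has exited and the operating system
    # has reclaimed its resources. For a CUDA worker that includes the context
    # and every allocation made inside it.
    return int(completed.stdout.strip())
```

The trade-off is that inputs and outputs have to cross a process boundary, which is why the staged path has separate Prepare and Finalize steps around the sample.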

Output size guide

512x512 base   + 2k     + scale 4 -> 2048x2048
1024x1024 base + 2kto4k + scale 4 -> 4096x4096
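
Both rows follow output = base × scale on each side. The sketch below reproduces that arithmetic and estimates the size of the resulting image tensor alone, assuming a float32 RGB tensor (the usual ComfyUI `IMAGE` layout). The function names are hypothetical, and the estimate says nothing about PiD's working memory during sampling, which is what usually dominates.

```python
def output_size(base_width, base_height, scale):
    """Return the (width, height) of the PiD output for a given base size and scale."""
    return base_width * scale, base_height * scale


def image_tensor_mib(width, height, channels=3, bytes_per_value=4):
    """Size in MiB of one float32 RGB image tensor; excludes all model memory."""
    return width * height * channels * bytes_per_value / (1024 * 1024)


width, height = output_size(1024, 1024, 4)
print(width, height)                    # 4096 4096
print(image_tensor_mib(width, height))  # 192.0
```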

Large outputs can require a lot of VRAM. If a run fails, try:

Lower scale.
Use a smaller base latent.
Keep cleanup options enabled.
Set sequential_offload to sequential_blocks, then to sequential_blocks_aggressive if that is not enough.
Restart ComfyUI after CUDA allocator crashes.

PiD source and weights

By default, the node uses:

ComfyUI/custom_nodes/ComfyUI-PiD/vendor/PiD

You can override the PiD source location with any of:

the pid_source_dir node input
the PID_REPO_DIR environment variable
the COMFYUI_PID_REPO_DIR environment variable
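
One way such an override chain can be resolved is sketched below. The precedence shown (node input first, then `PID_REPO_DIR`, then `COMFYUI_PID_REPO_DIR`, then the default) is an assumption for illustration; the text above does not say which override wins. The sketch also treats the two upper-case names as environment variables, and `resolve_pid_source` is a hypothetical helper.

```python
import os
from pathlib import Path

DEFAULT_PID_DIR = Path("ComfyUI/custom_nodes/ComfyUI-PiD/vendor/PiD")
# Assumed lookup order; the actual precedence is not documented above.
ENV_VARS = ("PID_REPO_DIR", "COMFYUI_PID_REPO_DIR")


def resolve_pid_source(pid_source_dir="", environ=None):
    """Pick the PiD source directory from the node input, the environment, or the default."""
    environ = os.environ if environ is None else environ
    # An explicit, non-empty node input is assumed to win over everything else.
    if pid_source_dir.strip():
        return Path(pid_source_dir.strip())
    # Otherwise use the first environment variable that is set and non-empty.
    for name in ENV_VARS:
        value = environ.get(name, "").strip()
        if value:
            return Path(value)
    return DEFAULT_PID_DIR
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.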

When auto_download=true, the node downloads missing PiD source/checkpoints/assets as needed.

Example workflow

A template workflow is included in:

example_workflows/image_z_image_pid.json

After restarting ComfyUI, open it from the workflow templates or load the JSON file manually.
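
Before loading a workflow it can help to see which node types it uses, for instance to confirm the PiD nodes are installed. The sketch below assumes the file is in ComfyUI's UI export format, with a top-level `nodes` list whose entries carry a `type` field. `count_node_types` is a hypothetical helper, and `sample` is a tiny stand-in rather than the contents of the bundled template.

```python
import json
from collections import Counter


def count_node_types(workflow):
    """Count nodes by type in a UI-format ComfyUI workflow dict."""
    return Counter(node["type"] for node in workflow.get("nodes", []))


# Stand-in for the parsed JSON file; the real template contains more nodes.
sample = json.loads(
    '{"nodes": ['
    '{"id": 1, "type": "CLIPTextEncode"}, '
    '{"id": 2, "type": "CLIPTextEncode"}, '
    '{"id": 3, "type": "KSampler"}, '
    '{"id": 4, "type": "SaveImage"}]}'
)

for node_type, count in sorted(count_node_types(sample).items()):
    print(node_type, count)
```

To inspect the real template, parse example_workflows/image_z_image_pid.json with `json.load` and pass the result to `count_node_types`.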

Notes

This is a community wrapper around NVIDIA's public PiD code, not an official NVIDIA or ComfyUI project.
The PiD nodes output an IMAGE; they do not provide a ComfyUI VAE.
NVIDIA's PiD weights may have separate license/usage terms. Check the model card before commercial use.
Final latents with sigma=0.0 can work, but captured intermediate latents usually better match the official PiD recipe.

License

This project is released under the MIT License.
