Fix lumina2 pad token shape mismatch for some GGUF conversions by vaclavmuller · Pull Request #392 · city96/ComfyUI-GGUF

vaclavmuller · 2025-12-24T16:04:50Z

This PR fixes a shape mismatch when loading some lumina2 / NextDiT GGUF models
(e.g. Z-Image Turbo GGUF builds).

Some GGUF conversions store x_pad_token and cap_pad_token as 1D vectors
([D]) instead of the expected 2D shape ([1, D]), which causes
load_state_dict to fail.

The loader now:

ensures a robust fallback shape when orig_shape metadata is missing
reshapes lumina2 pad tokens to (1, D) when needed
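
The two loader changes listed above can be sketched in isolation. This is only an illustration of the shape logic, not the PR's actual diff: `resolve_shape` and its parameters are hypothetical names, and the reversed-dimension fallback assumes GGUF's convention of storing dimensions innermost-first.

```python
from typing import Optional, Sequence, Tuple

LUMINA2_PAD_TOKENS = ("x_pad_token", "cap_pad_token")


def resolve_shape(
    arch: str,
    key: str,
    orig_shape: Optional[Sequence[int]],
    gguf_dims: Sequence[int],
) -> Tuple[int, ...]:
    """Pick the shape to report for one tensor (hypothetical helper)."""
    if orig_shape is not None:
        # Prefer the original shape recorded at conversion time.
        shape = tuple(int(d) for d in orig_shape)
    else:
        # Fallback when orig_shape metadata is missing (assumption:
        # GGUF dims are innermost-first, so reverse them).
        shape = tuple(int(d) for d in reversed(gguf_dims))

    # lumina2 pad tokens stored as [D] are promoted to [1, D].
    if arch == "lumina2" and key in LUMINA2_PAD_TOKENS and len(shape) == 1:
        shape = (1, shape[0])
    return shape


print(resolve_shape("lumina2", "x_pad_token", None, [3840]))  # (1, 3840)
```

Tensors from other architectures, and pad tokens that are already 2D, pass through unchanged.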

Tested with:
https://huggingface.co/leejet/Z-Image-Turbo-GGUF

Addresses #379

altarhex · 2026-01-04T23:30:08Z

Thank you! This fixed the issue I was having.

RubenGarcia · 2026-01-15T20:43:47Z

I'd also like this merged.

hinablue · 2026-02-06T09:01:50Z

Thank you! That solved my problem.

RubenGarcia · 2026-02-09T19:49:41Z

@city96 Is there anything else you need done here?

qskousen · 2026-02-21T13:55:27Z

I don't pretend to understand all of what is going on here, but I've found that this only causes an error if these two layers are in BF16; if they are F32 or F16 they seem to work fine. Not sure why that is. Or, I may be experiencing an unrelated issue.

patientx · 2026-02-24T19:01:48Z

This should be merged into main.

RubenGarcia · 2026-03-05T20:12:10Z

While this fixes loading for Z-Image Turbo, I am getting similar errors for Z-Image Base.
@vaclavmuller, can you have a look?
These are the affected models:
https://huggingface.co/babakarto/z-image-base-gguf/resolve/main/z_image_base_BF16.gguf
https://huggingface.co/babakarto/z-image-base-gguf/resolve/main/z_image_base_Q8_0.gguf

z_image_base_Q8_0.gguf:
The expanded size of the tensor (3840) must match the existing size (4080) at non-singleton dimension 1. Target sizes: [3840, 3840]. Tensor sizes: [3840, 4080]

z_image_base_BF16.gguf
The expanded size of the tensor (3840) must match the existing size (7680) at non-singleton dimension 1. Target sizes: [3840, 3840]. Tensor sizes: [3840, 7680]

82sound-jpg · 2026-03-12T17:52:51Z

While this fixes loading for zimage turbo, I am getting similar errors for zimage base. @vaclavmuller, can you have a look? These are the affected models: https://huggingface.co/babakarto/z-image-base-gguf/resolve/main/z_image_base_BF16.gguf https://huggingface.co/babakarto/z-image-base-gguf/resolve/main/z_image_base_Q8_0.gguf

z_image_base_Q8_0.gguf: The expanded size of the tensor (3840) must match the existing size (4080) at non-singleton dimension 1. Target sizes: [3840, 3840]. Tensor sizes: [3840, 4080]

z_image_base_BF16.gguf The expanded size of the tensor (3840) must match the existing size (7680) at non-singleton dimension 1. Target sizes: [3840, 3840]. Tensor sizes: [3840, 7680]

I have this problem too.

vaclavmuller · 2026-03-13T07:34:32Z

Thanks for testing and for reporting this.

The fix in this PR only addresses the pad token shape mismatch that occurs with some Z-Image Turbo GGUF conversions (x_pad_token / cap_pad_token stored as [D] instead of [1, D]). From the error messages you posted, the issue with the Z-Image Base models looks different.

These errors:

Q8_0:
The expanded size of the tensor (3840) must match the existing size (4080) at non-singleton dimension 1

BF16:
The expanded size of the tensor (3840) must match the existing size (7680) at non-singleton dimension 1

suggest that the tensor shapes stored in the GGUF file do not match what the current NextDiT / lumina2 implementation in ComfyUI expects. That is most likely related either to:

differences in the GGUF conversion script used for those models, or
missing / incorrect orig_shape metadata in the GGUF file.

In other words, this is probably not the same issue that this PR fixes.

For transparency: I am not a core developer of this repository. I ran into the Turbo issue while using ComfyUI and worked through the debugging process with the help of ChatGPT, which helped me understand the loader code and produce a minimal fix for that specific problem.

If you want to investigate the Z-Image Base issue further, I would recommend:

checking the tensor names and shapes inside the GGUF file,
comparing them with the shapes expected by the lumina2 / NextDiT model in ComfyUI,
verifying whether the conversion process added correct comfy.gguf.orig_shape.* metadata.
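
The three checks above could be scripted roughly as follows. This is a sketch under stated assumptions: it relies on the `gguf` Python package's `GGUFReader` (imported lazily so the comparison helper runs without it), the helper names are hypothetical, and the `expected` mapping would have to come from the model's own state dict.

```python
from typing import Dict, Iterable, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    stored: Iterable[Tuple[str, Shape]],
    expected: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, stored_shape, expected_shape) for every disagreement."""
    mismatches = []
    for name, shape in stored:
        want = expected.get(name)
        if want is not None and tuple(shape) != tuple(want):
            mismatches.append((name, tuple(shape), tuple(want)))
    return mismatches


def read_gguf_shapes(path: str) -> List[Tuple[str, Shape]]:
    """List tensor names and shapes from a GGUF file (needs `gguf`)."""
    from gguf import GGUFReader  # lazy import: optional dependency

    reader = GGUFReader(path)
    # Assumption: GGUF dims are innermost-first, so reverse them to get
    # the order PyTorch reports.
    return [
        (t.name, tuple(int(d) for d in reversed(t.shape)))
        for t in reader.tensors
    ]


def list_orig_shape_keys(path: str) -> List[str]:
    """List the comfy.gguf.orig_shape.* metadata keys present in the file."""
    from gguf import GGUFReader

    reader = GGUFReader(path)
    return [k for k in reader.fields if k.startswith("comfy.gguf.orig_shape.")]


# Small self-contained demonstration with made-up entries.
demo = find_shape_mismatches(
    [("x_pad_token", (3840,)), ("layer.weight", (2, 2))],
    {"x_pad_token": (1, 3840), "layer.weight": (2, 2)},
)
print(demo)  # [('x_pad_token', (3840,), (1, 3840))]
```

With a real file, `find_shape_mismatches(read_gguf_shapes(path), expected)` would list the tensors worth looking at first.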

If you are not familiar with the loader code, tools like ChatGPT can actually be very helpful for exploring the code and reasoning about shape mismatches like this. That’s essentially how I approached the Turbo issue as well.

If someone can identify which specific tensor is triggering the mismatch (the stack trace usually shows it), that would be a good starting point to determine whether this needs:

a loader-side workaround similar to the Turbo fix, or
a correction in the GGUF conversion process.
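
One observation that may help narrow this down: the sizes in the reported errors match the raw byte width of a 3840-element row in each format. The arithmetic is sketched below. Treat it as a hint, not a confirmed diagnosis; `raw_row_bytes` is a hypothetical helper, and reading the reported sizes as byte counts is an assumption.

```python
# (elements per block, bytes per block) for a few GGML tensor types.
GGML_TYPE_LAYOUT = {
    "F32": (1, 4),
    "F16": (1, 2),
    "BF16": (1, 2),
    "Q8_0": (32, 34),  # 32 int8 values plus one 2-byte fp16 scale
}


def raw_row_bytes(n_elements: int, tensor_type: str) -> int:
    """Bytes needed to store one row of n_elements in the given type."""
    block_size, block_bytes = GGML_TYPE_LAYOUT[tensor_type]
    if n_elements % block_size:
        raise ValueError("row length must be a multiple of the block size")
    return n_elements // block_size * block_bytes


for tensor_type in ("Q8_0", "BF16"):
    print(tensor_type, raw_row_bytes(3840, tensor_type))
# Q8_0 4080
# BF16 7680
```

Both values equal the "existing size" in the respective error message, which would be consistent with the tensors reaching `load_state_dict` as undecoded bytes.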

The errors/bugs fixed

1. NextDiT / Lumina Pad Token Shape Mismatch

```
RuntimeError: Error(s) in loading state_dict for NextDiT:
    size mismatch for x_pad_token: copying a param with shape torch.Size([3840]) from checkpoint, the shape in current model is torch.Size([1, 3840]).
    size mismatch for cap_pad_token: copying a param with shape torch.Size([3840]) from checkpoint, the shape in current model is torch.Size([1, 3840]).
```

The Cause: A recent update to the ComfyUI core changed the expected shape of NextDiT/Lumina pad tokens from 1D [3840] to 2D [1, 3840].

2. BF16 Raw Byte Mismatch

```
RuntimeError: Error(s) in loading state_dict for NextDiT:
    While copying the parameter named "x_pad_token", whose dimensions in the model are torch.Size([1, 3840]) and whose dimensions in the checkpoint are torch.Size([7680]), an exception occurred : ('The size of tensor a (3840) must match the size of tensor b (7680) at non-singleton dimension 1',)
```

The Cause: Models packed with uncompressed bfloat16 tokens load as raw bytes (3840×2=7680 bytes). Because the PR city96#392 patch forced these tokens to 2D before the BF16 dequantization step, it bypassed the len(shape) <= 1 safety check. The uncompressed bytes were passed directly to ComfyUI without being converted back to floats.

🛠️ The Fix

Modified loader.py to ensure that Lumina pad tokens (x_pad_token, cap_pad_token) are properly evaluated by the bfloat16 dequantization check, even when they are forced into a 2D shape.

Replaced the lumina2 shape logic in loader.py with:

```python
is_lumina_pad = (arch_str == "lumina2" and sd_key in ("x_pad_token", "cap_pad_token"))
if is_lumina_pad:
    if len(shape) == 1:
        shape = torch.Size((1, shape[0]))

# add to state dict
if tensor.tensor_type in {gguf.GGMLQuantizationType.F32, gguf.GGMLQuantizationType.F16}:
    torch_tensor = torch_tensor.view(*shape)
state_dict[sd_key] = GGMLTensor(torch_tensor, tensor_type=tensor.tensor_type, tensor_shape=shape)

# 1D tensors shouldn't be quantized, this is a fix for BF16
# Force the fix to run on lumina pad tokens as well
if (len(shape) <= 1 or is_lumina_pad) and tensor.tensor_type == gguf.GGMLQuantizationType.BF16:
    state_dict[sd_key] = dequantize_tensor(state_dict[sd_key], dtype=torch.float32)
```

This ensures that the 7680 bytes are converted back into 3840 true float values before being handed over to ComfyUI's new [1, 3840] structure.
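
To make the byte counts concrete, here is a standard-library sketch of turning 7680 raw BF16 bytes back into 3840 floats. It is not the repository's `dequantize_tensor`. The helper names are hypothetical, and the encoder truncates rather than rounds, which is a simplification.

```python
import struct
from typing import List


def float_to_bf16_bytes(x: float) -> bytes:
    """Encode a float as bfloat16 by keeping the upper 16 bits of its
    float32 representation (truncation, little-endian)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.pack("<H", bits >> 16)


def bf16_bytes_to_floats(raw: bytes) -> List[float]:
    """Decode little-endian bfloat16 bytes into Python floats."""
    if len(raw) % 2:
        raise ValueError("BF16 data must have an even number of bytes")
    halves = struct.unpack(f"<{len(raw) // 2}H", raw)
    # A bfloat16 value is the top half of a float32: shift it back up.
    return [struct.unpack("<f", struct.pack("<I", h << 16))[0] for h in halves]


D = 3840
raw = b"".join(float_to_bf16_bytes(0.5) for _ in range(D))
values = bf16_bytes_to_floats(raw)
pad_token = [values]  # nested list standing in for a [1, D] tensor

print(len(raw), len(values), len(pad_token), len(pad_token[0]))
# 7680 3840 1 3840
```

Decoding has to happen before the reshape to [1, D]. Viewing the undecoded bytes in that shape instead gives 7680 entries in the last dimension, which is the mismatch in the second error above.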

Fix lumina2 pad token shape mismatch

f10f6a7

RubenGarcia approved these changes Jan 15, 2026


ItsG33k approved these changes Feb 2, 2026


makisekurisu-jp mentioned this pull request Mar 3, 2026

Error(s) in loading state_dict for NextDiT #417

Open

vaclavmuller wants to merge 1 commit into city96:main from vaclavmuller:fix-lumina2-pad-token-shape


8 participants
