OOM error after commit 0a2dd86

### Custom Node Testing

- [ ] I have tried disabling custom nodes and the issue persists (see [how to disable custom nodes](https://docs.comfy.org/troubleshooting/custom-node-issues#step-1%3A-test-with-all-custom-nodes-disabled) if you need help)

### Expected Behavior

LTX 2.3 workflow was working perfectly fine until 04879a8113961cbc4e2ff20e9feeb737ba703f51


### Actual Behavior

After updating (to the latest version) I started having persistent memory errors. After testing, I discovered that the errors started with commit 0a2dd86e782dadfad43e4b995c12d1901ce48823. Reverting to version 04879a8113961cbc4e2ff20e9feeb737ba703f51. The workflow is working again. **The OOM error occurs after the first sampler step.**

<img width="2174" height="1709" alt="Image" src="https://github.com/user-attachments/assets/b2eef08f-b150-4160-84dc-dc4c3649b5cf" />

### Steps to Reproduce

I believe it's very hardware dependent so no way to reproduce.

### Debug Logs

```powershell
[ERROR] !!! Exception during processing !!! CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

[ERROR] Traceback (most recent call last):
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 536, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 336, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 310, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in process_inputs
    result = f(**inputs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\internal\__init__.py", line 149, in wrapped_func
    return method(locked_class, **inputs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\latest\_io.py", line 1833, in EXECUTE_NORMALIZED
    to_return = cls.execute(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_extras\nodes_custom_sampler.py", line 967, in execute
    samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1319, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 115, in execute
    return self.wrappers[self.idx](self, *args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-kjnodes\nodes\ltxv_nodes.py", line 916, in __call__
    out = executor(noise, latent_image, sampler, sigmas, denoise_mask, combined_callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 107, in __call__
    return new_executor.execute(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1257, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1232, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 34, in KSAMPLER_sample
    return orig_fn(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1002, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 218, in sample_euler_ancestral
    return sample_euler_ancestral_RF(model, x, sigmas, extra_args, callback, disable, eta, s_noise, noise_sampler)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 248, in sample_euler_ancestral_RF
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 642, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1205, in __call__
    return self.outer_predict_noise(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1212, in outer_predict_noise
    ).execute(x, timestep, model_options, seed)
      ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1215, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 622, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 210, in calc_cond_batch
    return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in _calc_cond_batch_outer
    return executor.execute(model, conds, x_in, timestep, model_options)
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 334, in _calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
             ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 182, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ...<2 lines>...
        comfy.patcher_extension.get_all_wrappers(comfy.patcher_extension.WrappersMP.APPLY_MODEL, transformer_options)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ).execute(x, t, c_concat, c_crossattn, control, transformer_options, **kwargs)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 226, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 1069, in forward
    return super().forward(
           ~~~~~~~~~~~~~~~^
        x,
        ^^
    ...<6 lines>...
        **kwargs,
        ^^^^^^^^^
    )
    ^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 936, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ...<4 lines>...
        ),
        ~~
    ).execute(x, timestep, context, attention_mask, frame_rate, transformer_options, keyframe_idxs, denoise_mask=denoise_mask, **kwargs)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 989, in _forward
    x = self._process_transformer_blocks(
        x, context, attention_mask, timestep, pe,
    ...<2 lines>...
        **merged_args,
    )
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 917, in _process_transformer_blocks
    comfy.model_prefetch.prefetch_queue_pop(prefetch_queue, vx.device, block)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_prefetch.py", line 53, in prefetch_queue_pop
    offload_stream = comfy.ops.cast_modules_with_vbar(comfy_modules, None, device, None, True)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 214, in cast_modules_with_vbar
    handle_pin(s, pin, xfer_source, xfer_dest, size=dest_size)
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 202, in handle_pin
    comfy.model_management.cast_to_gathered(source, pin, non_blocking=non_blocking, stream=offload_stream, r2=dest)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1447, in cast_to_gathered
    if comfy.memory_management.read_tensor_file_slice_into(tensor, dest_view, stream=stream, destination2=dest2_view):
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\memory_management.py", line 59, in read_tensor_file_slice_into
    hostbuf.read_file_slice(file_obj, info.offset, info.size,
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                            offset=destination.data_ptr() - hostbuf.get_raw_address(),
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                            stream=stream_ptr,
                            ^^^^^^^^^^^^^^^^^^
                            device_ptr=device_ptr,
                            ^^^^^^^^^^^^^^^^^^^^^^
                            device=None if destination2 is None else destination2.device.index)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfy_aimdo\host_buffer.py", line 85, in read_file_slice
    raise RuntimeError("HostBuffer.read_file_slice failed")
RuntimeError: HostBuffer.read_file_slice failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 541, in execute
    comfy.model_management.reset_cast_buffers()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
    offload_stream.synchronize()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
    super().synchronize()
    ~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.


[INFO] Memory summary:
|===========================================================================|
|                  PyTorch CUDA memory summary, device ID 0                 |
|---------------------------------------------------------------------------|
|            CUDA OOMs: 0            |        cudaMalloc retries: 0         |
|===========================================================================|
|        Metric         | Cur Usage  | Peak Usage | Tot Alloc  | Tot Freed  |
|---------------------------------------------------------------------------|
| Allocated memory      | 331278 KiB |   1776 MiB |      0 B   |      0 B   |
|       from large pool |      0 KiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 KiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Active memory         | 331278 KiB |   1776 MiB |      0 B   |      0 B   |
|       from large pool |      0 KiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 KiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Requested memory      |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| GPU reserved memory   |   1536 MiB |   2720 MiB |      0 B   |      0 B   |
|       from large pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|       from small pool |      0 MiB |      0 MiB |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Non-releasable memory |      0 B   |      0 B   |      0 B   |      0 B   |
|       from large pool |      0 B   |      0 B   |      0 B   |      0 B   |
|       from small pool |      0 B   |      0 B   |      0 B   |      0 B   |
|---------------------------------------------------------------------------|
| Allocations           |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Active allocs         |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| GPU reserved segments |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Non-releasable allocs |       0    |       0    |       0    |       0    |
|       from large pool |       0    |       0    |       0    |       0    |
|       from small pool |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize allocations  |       0    |       0    |       0    |       0    |
|---------------------------------------------------------------------------|
| Oversize GPU segments |       0    |       0    |       0    |       0    |
|===========================================================================|

[ERROR] Got an OOM, unloading all loaded models.
Exception in thread Thread-20 (prompt_worker):
Traceback (most recent call last):
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 536, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 336, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 310, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in process_inputs
    result = f(**inputs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\internal\__init__.py", line 149, in wrapped_func
    return method(locked_class, **inputs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\latest\_io.py", line 1833, in EXECUTE_NORMALIZED
    to_return = cls.execute(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_extras\nodes_custom_sampler.py", line 967, in execute
    samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1319, in sample
    output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 115, in execute
    return self.wrappers[self.idx](self, *args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-kjnodes\nodes\ltxv_nodes.py", line 916, in __call__
    out = executor(noise, latent_image, sampler, sigmas, denoise_mask, combined_callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 107, in __call__
    return new_executor.execute(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1257, in outer_sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1232, in inner_sample
    samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 34, in KSAMPLER_sample
    return orig_fn(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1002, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 218, in sample_euler_ancestral
    return sample_euler_ancestral_RF(model, x, sigmas, extra_args, callback, disable, eta, s_noise, noise_sampler)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 248, in sample_euler_ancestral_RF
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 642, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1205, in __call__
    return self.outer_predict_noise(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1212, in outer_predict_noise
    ).execute(x, timestep, model_options, seed)
      ~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1215, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 622, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 210, in calc_cond_batch
    return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in _calc_cond_batch_outer
    return executor.execute(model, conds, x_in, timestep, model_options)
           ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 334, in _calc_cond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
             ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 182, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ...<2 lines>...
        comfy.patcher_extension.get_all_wrappers(comfy.patcher_extension.WrappersMP.APPLY_MODEL, transformer_options)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ).execute(x, t, c_concat, c_crossattn, control, transformer_options, **kwargs)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 226, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 1069, in forward
    return super().forward(
           ~~~~~~~~~~~~~~~^
        x,
        ^^
    ...<6 lines>...
        **kwargs,
        ^^^^^^^^^
    )
    ^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 936, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ...<4 lines>...
        ),
        ~~
    ).execute(x, timestep, context, attention_mask, frame_rate, transformer_options, keyframe_idxs, denoise_mask=denoise_mask, **kwargs)
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
    return self.original(*args, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 989, in _forward
    x = self._process_transformer_blocks(
        x, context, attention_mask, timestep, pe,
    ...<2 lines>...
        **merged_args,
    )
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 917, in _process_transformer_blocks
    comfy.model_prefetch.prefetch_queue_pop(prefetch_queue, vx.device, block)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_prefetch.py", line 53, in prefetch_queue_pop
    offload_stream = comfy.ops.cast_modules_with_vbar(comfy_modules, None, device, None, True)
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 214, in cast_modules_with_vbar
    handle_pin(s, pin, xfer_source, xfer_dest, size=dest_size)
    ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 202, in handle_pin
    comfy.model_management.cast_to_gathered(source, pin, non_blocking=non_blocking, stream=offload_stream, r2=dest)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1447, in cast_to_gathered
    if comfy.memory_management.read_tensor_file_slice_into(tensor, dest_view, stream=stream, destination2=dest2_view):
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\memory_management.py", line 59, in read_tensor_file_slice_into
    hostbuf.read_file_slice(file_obj, info.offset, info.size,
    ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                            offset=destination.data_ptr() - hostbuf.get_raw_address(),
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                            stream=stream_ptr,
                            ^^^^^^^^^^^^^^^^^^
                            device_ptr=device_ptr,
                            ^^^^^^^^^^^^^^^^^^^^^^
                            device=None if destination2 is None else destination2.device.index)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfy_aimdo\host_buffer.py", line 85, in read_file_slice
    raise RuntimeError("HostBuffer.read_file_slice failed")
RuntimeError: HostBuffer.read_file_slice failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 541, in execute
    comfy.model_management.reset_cast_buffers()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
    offload_stream.synchronize()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
    super().synchronize()
    ~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "threading.py", line 1043, in _bootstrap_inner
  File "threading.py", line 994, in run
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\main.py", line 327, in prompt_worker
    e.execute(item[2], prompt_id, extra_data, item[4])
    ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 714, in execute
    asyncio.run(self.execute_async(prompt, prompt_id, extra_data, execute_outputs))
    ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "asyncio\runners.py", line 195, in run
  File "asyncio\runners.py", line 118, in run
  File "asyncio\base_events.py", line 725, in run_until_complete
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 774, in execute_async
    result, error, ex = await execute(self.server, dynamic_prompt, self.caches, node_id, extra_data, executed, prompt_id, execution_list, pending_subgraph_results, pending_async_nodes, ui_node_outputs)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 632, in execute
    comfy.model_management.unload_all_models()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1964, in unload_all_models
    free_memory(1e30, device)
    ~~~~~~~~~~~^^^^^^^^^^^^^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 790, in free_memory
    cleanup_models_gc()
    ~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 941, in cleanup_models_gc
    reset_cast_buffers()
    ~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
    offload_stream.synchronize()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
    super().synchronize()
    ~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
```

### Other

<img width="1089" height="365" alt="Image" src="https://github.com/user-attachments/assets/5fdad534-5337-4d76-b3c5-22ca66b2ff31" />

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OOM error after commit 0a2dd86 #14126

Custom Node Testing

Expected Behavior

Actual Behavior

Steps to Reproduce

Debug Logs

Other

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

OOM error after commit 0a2dd86 #14126

Description

Custom Node Testing

Expected Behavior

Actual Behavior

Steps to Reproduce

Debug Logs

Other

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions