After updating (to the latest version) I started having persistent memory errors. After testing, I discovered that the errors started with commit 0a2dd86. Reverting to version 04879a8. The workflow is working again. The OOM error occurs after the first sampler step.
I believe it's very hardware dependent so no way to reproduce.
[ERROR] !!! Exception during processing !!! CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[ERROR] Traceback (most recent call last):
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 536, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 336, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 310, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in process_inputs
result = f(**inputs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\internal\__init__.py", line 149, in wrapped_func
return method(locked_class, **inputs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\latest\_io.py", line 1833, in EXECUTE_NORMALIZED
to_return = cls.execute(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_extras\nodes_custom_sampler.py", line 967, in execute
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1319, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 115, in execute
return self.wrappers[self.idx](self, *args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-kjnodes\nodes\ltxv_nodes.py", line 916, in __call__
out = executor(noise, latent_image, sampler, sigmas, denoise_mask, combined_callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 107, in __call__
return new_executor.execute(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1257, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1232, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 34, in KSAMPLER_sample
return orig_fn(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1002, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 218, in sample_euler_ancestral
return sample_euler_ancestral_RF(model, x, sigmas, extra_args, callback, disable, eta, s_noise, noise_sampler)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 248, in sample_euler_ancestral_RF
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 642, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1205, in __call__
return self.outer_predict_noise(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1212, in outer_predict_noise
).execute(x, timestep, model_options, seed)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1215, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 622, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 210, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 334, in _calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 182, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...<2 lines>...
comfy.patcher_extension.get_all_wrappers(comfy.patcher_extension.WrappersMP.APPLY_MODEL, transformer_options)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
).execute(x, t, c_concat, c_crossattn, control, transformer_options, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 226, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 1069, in forward
return super().forward(
~~~~~~~~~~~~~~~^
x,
^^
...<6 lines>...
**kwargs,
^^^^^^^^^
)
^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 936, in forward
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...<4 lines>...
),
~~
).execute(x, timestep, context, attention_mask, frame_rate, transformer_options, keyframe_idxs, denoise_mask=denoise_mask, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 989, in _forward
x = self._process_transformer_blocks(
x, context, attention_mask, timestep, pe,
...<2 lines>...
**merged_args,
)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 917, in _process_transformer_blocks
comfy.model_prefetch.prefetch_queue_pop(prefetch_queue, vx.device, block)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_prefetch.py", line 53, in prefetch_queue_pop
offload_stream = comfy.ops.cast_modules_with_vbar(comfy_modules, None, device, None, True)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 214, in cast_modules_with_vbar
handle_pin(s, pin, xfer_source, xfer_dest, size=dest_size)
~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 202, in handle_pin
comfy.model_management.cast_to_gathered(source, pin, non_blocking=non_blocking, stream=offload_stream, r2=dest)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1447, in cast_to_gathered
if comfy.memory_management.read_tensor_file_slice_into(tensor, dest_view, stream=stream, destination2=dest2_view):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\memory_management.py", line 59, in read_tensor_file_slice_into
hostbuf.read_file_slice(file_obj, info.offset, info.size,
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
offset=destination.data_ptr() - hostbuf.get_raw_address(),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stream=stream_ptr,
^^^^^^^^^^^^^^^^^^
device_ptr=device_ptr,
^^^^^^^^^^^^^^^^^^^^^^
device=None if destination2 is None else destination2.device.index)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfy_aimdo\host_buffer.py", line 85, in read_file_slice
raise RuntimeError("HostBuffer.read_file_slice failed")
RuntimeError: HostBuffer.read_file_slice failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 541, in execute
comfy.model_management.reset_cast_buffers()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
offload_stream.synchronize()
~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
super().synchronize()
~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
[INFO] Memory summary:
|===========================================================================|
| PyTorch CUDA memory summary, device ID 0 |
|---------------------------------------------------------------------------|
| CUDA OOMs: 0 | cudaMalloc retries: 0 |
|===========================================================================|
| Metric | Cur Usage | Peak Usage | Tot Alloc | Tot Freed |
|---------------------------------------------------------------------------|
| Allocated memory | 331278 KiB | 1776 MiB | 0 B | 0 B |
| from large pool | 0 KiB | 0 MiB | 0 B | 0 B |
| from small pool | 0 KiB | 0 MiB | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Active memory | 331278 KiB | 1776 MiB | 0 B | 0 B |
| from large pool | 0 KiB | 0 MiB | 0 B | 0 B |
| from small pool | 0 KiB | 0 MiB | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Requested memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| GPU reserved memory | 1536 MiB | 2720 MiB | 0 B | 0 B |
| from large pool | 0 MiB | 0 MiB | 0 B | 0 B |
| from small pool | 0 MiB | 0 MiB | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Non-releasable memory | 0 B | 0 B | 0 B | 0 B |
| from large pool | 0 B | 0 B | 0 B | 0 B |
| from small pool | 0 B | 0 B | 0 B | 0 B |
|---------------------------------------------------------------------------|
| Allocations | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Active allocs | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| GPU reserved segments | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Non-releasable allocs | 0 | 0 | 0 | 0 |
| from large pool | 0 | 0 | 0 | 0 |
| from small pool | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Oversize allocations | 0 | 0 | 0 | 0 |
|---------------------------------------------------------------------------|
| Oversize GPU segments | 0 | 0 | 0 | 0 |
|===========================================================================|
[ERROR] Got an OOM, unloading all loaded models.
Exception in thread Thread-20 (prompt_worker):
Traceback (most recent call last):
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 536, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 336, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 310, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 298, in process_inputs
result = f(**inputs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\internal\__init__.py", line 149, in wrapped_func
return method(locked_class, **inputs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_api\latest\_io.py", line 1833, in EXECUTE_NORMALIZED
to_return = cls.execute(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy_extras\nodes_custom_sampler.py", line 967, in execute
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1319, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 115, in execute
return self.wrappers[self.idx](self, *args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfyui-kjnodes\nodes\ltxv_nodes.py", line 916, in __call__
out = executor(noise, latent_image, sampler, sigmas, denoise_mask, combined_callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 107, in __call__
return new_executor.execute(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1257, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1232, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-TiledDiffusion\utils.py", line 34, in KSAMPLER_sample
return orig_fn(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1002, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 218, in sample_euler_ancestral
return sample_euler_ancestral_RF(model, x, sigmas, extra_args, callback, disable, eta, s_noise, noise_sampler)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\utils\_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\k_diffusion\sampling.py", line 248, in sample_euler_ancestral_RF
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 642, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1205, in __call__
return self.outer_predict_noise(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1212, in outer_predict_noise
).execute(x, timestep, model_options, seed)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 1215, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 622, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 210, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 218, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\samplers.py", line 334, in _calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 182, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...<2 lines>...
comfy.patcher_extension.get_all_wrappers(comfy.patcher_extension.WrappersMP.APPLY_MODEL, transformer_options)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
).execute(x, t, c_concat, c_crossattn, control, transformer_options, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_base.py", line 226, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 1069, in forward
return super().forward(
~~~~~~~~~~~~~~~^
x,
^^
...<6 lines>...
**kwargs,
^^^^^^^^^
)
^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 936, in forward
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...<4 lines>...
),
~~
).execute(x, timestep, context, attention_mask, frame_rate, transformer_options, keyframe_idxs, denoise_mask=denoise_mask, **kwargs)
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\patcher_extension.py", line 114, in execute
return self.original(*args, **kwargs)
~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\model.py", line 989, in _forward
x = self._process_transformer_blocks(
x, context, attention_mask, timestep, pe,
...<2 lines>...
**merged_args,
)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ldm\lightricks\av_model.py", line 917, in _process_transformer_blocks
comfy.model_prefetch.prefetch_queue_pop(prefetch_queue, vx.device, block)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_prefetch.py", line 53, in prefetch_queue_pop
offload_stream = comfy.ops.cast_modules_with_vbar(comfy_modules, None, device, None, True)
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 214, in cast_modules_with_vbar
handle_pin(s, pin, xfer_source, xfer_dest, size=dest_size)
~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\ops.py", line 202, in handle_pin
comfy.model_management.cast_to_gathered(source, pin, non_blocking=non_blocking, stream=offload_stream, r2=dest)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1447, in cast_to_gathered
if comfy.memory_management.read_tensor_file_slice_into(tensor, dest_view, stream=stream, destination2=dest2_view):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\memory_management.py", line 59, in read_tensor_file_slice_into
hostbuf.read_file_slice(file_obj, info.offset, info.size,
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
offset=destination.data_ptr() - hostbuf.get_raw_address(),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stream=stream_ptr,
^^^^^^^^^^^^^^^^^^
device_ptr=device_ptr,
^^^^^^^^^^^^^^^^^^^^^^
device=None if destination2 is None else destination2.device.index)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\comfy_aimdo\host_buffer.py", line 85, in read_file_slice
raise RuntimeError("HostBuffer.read_file_slice failed")
RuntimeError: HostBuffer.read_file_slice failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 541, in execute
comfy.model_management.reset_cast_buffers()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
offload_stream.synchronize()
~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
super().synchronize()
~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "threading.py", line 1043, in _bootstrap_inner
File "threading.py", line 994, in run
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\main.py", line 327, in prompt_worker
e.execute(item[2], prompt_id, extra_data, item[4])
~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 714, in execute
asyncio.run(self.execute_async(prompt, prompt_id, extra_data, execute_outputs))
~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "asyncio\runners.py", line 195, in run
File "asyncio\runners.py", line 118, in run
File "asyncio\base_events.py", line 725, in run_until_complete
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 774, in execute_async
result, error, ex = await execute(self.server, dynamic_prompt, self.caches, node_id, extra_data, executed, prompt_id, execution_list, pending_subgraph_results, pending_async_nodes, ui_node_outputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\execution.py", line 632, in execute
comfy.model_management.unload_all_models()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1964, in unload_all_models
free_memory(1e30, device)
~~~~~~~~~~~^^^^^^^^^^^^^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 790, in free_memory
cleanup_models_gc()
~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 941, in cleanup_models_gc
reset_cast_buffers()
~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\ComfyUI\comfy\model_management.py", line 1366, in reset_cast_buffers
offload_stream.synchronize()
~~~~~~~~~~~~~~~~~~~~~~~~~~^^
File "G:\ComfyUI_windows_portable_nvidia\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\streams.py", line 102, in synchronize
super().synchronize()
~~~~~~~~~~~~~~~~~~~^^
torch.AcceleratorError: CUDA error: out of memory
Search for `cudaErrorMemoryAllocation' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Custom Node Testing
Expected Behavior
LTX 2.3 workflow was working perfectly fine until 04879a8
Actual Behavior
After updating (to the latest version) I started having persistent memory errors. After testing, I discovered that the errors started with commit 0a2dd86. Reverting to version 04879a8. The workflow is working again. The OOM error occurs after the first sampler step.
Steps to Reproduce
I believe it's very hardware dependent so no way to reproduce.
Debug Logs
Other