load/offload with mmap #6

Closed
strint wants to merge 40 commits into master from refine_offload

@strint
Collaborator

@strint strint commented Oct 22, 2025

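When launching ComfyUI, add the environment variable MMAP_MEM_THRESHOLD_GB=5 to the command. It means that when CPU memory is below 5 GB, an offload goes to mmap instead, which avoids exhausting CPU memory.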
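
The threshold behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: apart from the MMAP_MEM_THRESHOLD_GB variable, every name here (mmap_threshold_bytes, should_offload_to_mmap, offload_bytes_to_mmap) is hypothetical. It also assumes that "CPU memory" means currently available host memory, and it uses a plain bytes payload in place of real model weights.

```python
import mmap
import os
import tempfile

GIB = 1024 ** 3


def mmap_threshold_bytes(env=os.environ):
    """Read MMAP_MEM_THRESHOLD_GB; 0 (unset) means mmap offload is disabled."""
    raw = env.get("MMAP_MEM_THRESHOLD_GB", "0")
    return int(float(raw) * GIB)


def should_offload_to_mmap(available_bytes, threshold_bytes):
    """Hypothetical policy: use mmap only when free host memory is below the threshold."""
    return threshold_bytes > 0 and available_bytes < threshold_bytes


def offload_bytes_to_mmap(payload):
    """Copy a non-empty payload into a file-backed mapping instead of anonymous RAM."""
    f = tempfile.TemporaryFile()
    f.truncate(len(payload))              # size the backing file first
    mm = mmap.mmap(f.fileno(), len(payload))
    mm[:] = payload                       # pages can be written back to disk under pressure
    f.close()                             # the mapping holds its own file handle
    return mm


threshold = mmap_threshold_bytes({"MMAP_MEM_THRESHOLD_GB": "5"})
if should_offload_to_mmap(available_bytes=4 * GIB, threshold_bytes=threshold):
    mapped = offload_bytes_to_mmap(b"weights")
```

The point of the file-backed mapping is that the kernel can evict its pages to the backing file under memory pressure, whereas an ordinary CPU copy of the weights has to stay resident.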

@ccssu
Collaborator

ccssu commented Nov 14, 2025

Problem description

ComfyUI CPU OOM
ComfyUI execute prompt error, please check the prompt (Wait ComfyUI prompt execution got unexpected error: ConnectionClosedError(no close frame received or sent))

Reproduction

Cloud function URL: https://cloud.siliconflow.cn/sft-d1s6t1r3jrms73f3ltpg/dedicated/functions/fniglisnqx?tab=definition

Public API endpoint: https://fniglisnqx.fn.6scloud.com
Workflows:
cpu_oom_workflow.json
cpu_oom_workflow_api.json

Environment

  • cce image (comfyui): hub.6scloud.com/d1s6t1r3jrms73f3ltpg/comfyui-gpu-torch2dot5:v202511141302-test-on-latest-cpu-offload-default-607450b, associated comfyui commit: dc7c77e78cb219f149c448cb961ae5122be7ce6b
  • comfyagent: hub.6scloud.com/d1r7umcsfi9c73b4drdg/comfyagent:ComfyAgent-20251112-9a1c1b18

@strint
Collaborator Author

strint commented Nov 14, 2025

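I looked at the workflow. It contains fairly large LoRAs that total more than 5 GB, so that part is very likely what causes the OOM.

You can try this version first, which adds mmap for LoRAs: #8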

@ccssu
Collaborator

ccssu commented Nov 14, 2025

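I looked at the workflow. It contains fairly large LoRAs that total more than 5 GB, so that part is very likely what causes the OOM.

It is probably not a LoRA problem. I removed the LoRAs and still get a CPU OOM.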

@strint
Collaborator Author

strint commented Nov 20, 2025

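In the test workflow, only CLIP reaches ComfyUI's model_unload logic. Most of the nodes in the workflow are third-party nodes, and their offloading appears to be done inside those nodes, so it is outside ComfyUI's control.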
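
A toy sketch of why that matters: a central manager can only unload what was registered with it. This is not ComfyUI code. OffloadManager, load, free_memory and third_party_cpu_cache are all hypothetical names, chosen only to illustrate the observation above.

```python
class OffloadManager:
    """Hypothetical stand-in for a central model manager."""

    def __init__(self):
        self.loaded = []       # models the manager knows about
        self.unload_log = []   # models it was able to unload

    def load(self, name):
        self.loaded.append(name)

    def free_memory(self):
        # Only models registered through load() can be unloaded here.
        self.unload_log.extend(self.loaded)
        self.loaded.clear()


manager = OffloadManager()
manager.load("clip")  # core node: registered with the manager

# A custom node that keeps its own CPU copy and never registers it.
third_party_cpu_cache = {"big_model": bytearray(1024)}

manager.free_memory()
print(manager.unload_log)                    # only "clip" went through the manager
print("big_model" in third_party_cpu_cache)  # the private copy is still resident
```

In this picture, an mmap-based offload inside the manager's unload path would help "clip" but would not reduce the memory held by the custom node's private cache.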

doombeaker and others added 10 commits November 26, 2025 16:58
* allow offload quant

* rm cuda

* refine and pass test
* Update workflow templates to v0.7.62 (Comfy-Org#11467)

* Make denoised output on custom sampler nodes work with nested tensors. (Comfy-Org#11471)

* api-nodes: use new custom endpoint for Nano Banana (Comfy-Org#11311)

* chore: update workflow templates to v0.7.63 (Comfy-Org#11482)

* ComfyUI v0.6.0

* Bump comfyui-frontend-package to 1.35.9 (Comfy-Org#11470)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* chore: update workflow templates to v0.7.64 (Comfy-Org#11496)

* Add a ManualSigmas node. (Comfy-Org#11499)

Can be used to manually set the sigmas for a model.

This node accepts a list of integer and floating point numbers separated
with any non numeric character.

* Specify in readme that we only support pytorch 2.4 and up. (Comfy-Org#11512)

* bump comfyui_manager version to the 4.0.4 (Comfy-Org#11521)

* Fix noise with ancestral samplers when inferencing on cpu. (Comfy-Org#11528)

* feat(api-nodes): add Kling Motion Control node (Comfy-Org#11493)

* [V3] converted nodes_images.py to V3 schema (Comfy-Org#11206)

* converted nodes_images.py to V3 schema

* fix test

* fix(api-nodes-gemini): always force enhance_prompt to be True (Comfy-Org#11503)

* chore(api-nodes): switch to credits instead of $ (Comfy-Org#11489)

* Enable async offload by default for AMD. (Comfy-Org#11534)

* Comment out unused norm_final in lumina/z image model. (Comfy-Org#11545)

* mm: discard async errors from pinning failures (Comfy-Org#10738)

Pretty much every error cudaHostRegister can throw also queues the same
error on the async GPU queue. This was fixed for repinning error case,
but there is the bad mmap and just enomem cases that are harder to
detect.

Do some dummy GPU work to clean the error state.

* Add some warnings for pin and unpin errors. (Comfy-Org#11561)

* ResizeByLongerSide: support video (Comfy-Org#11555)

(cherry picked from commit 98c6840aa4e5fd5407ba9ab113d209011e474bf6)

* chore(api-nodes-bytedance): mark "seededit" as deprecated, adjust display name of Seedream (Comfy-Org#11490)

* Add handling for vace_context in context windows (Comfy-Org#11386)

Co-authored-by: ozbayb <17261091+ozbayb@users.noreply.github.com>

* ComfyUI version v0.7.0

* Add support for sage attention 3 in comfyui, enable via new cli arg (Comfy-Org#11026)

* Add support for sage attention 3 in comfyui, enable via new cli arg
--use-sage-attiention3

* Fix some bugs found in PR review. The N dimension at which Sage
Attention 3 takes effect is reduced to 1024 (although the improvement is
not significant at this scale).

* Remove the Sage Attention3 switch, but retain the attention function
registration.

* Fix a ruff check issue in attention.py

* V3 Improvements + DynamicCombo + Autogrow exposed in public API (Comfy-Org#11345)

* Support Combo outputs in a more sane way

* Remove test validate_inputs function on test node

* Make curr_prefix be a list of strings instead of string for easier parsing as keys get added to dynamic types

* Start to account for id prefixes from frontend, need to fix bug with nested dynamics

* Ensure inputs/outputs/hidden are lists in schema finalize function, remove no longer needed 'is not None' checks

* Add raw_link and extra_dict to all relevant Inputs

* Make nested DynamicCombos work properly with prefixed keys on latest frontend; breaks old Autogrow, but is pretty much ready for upcoming Autogrow keys

* Replace ... usage with a MISSING sentinel for clarity in nodes_logic.py

* Added CustomCombo node in backend to reflect frontend node

* Prepare Autogrow's expand_schema_for_dynamic to work with upcoming frontend changes

* Prepare for look up table for dynamic input stuff

* More progress towards dynamic input lookup function stuff

* Finished converting _expand_schema_for_dynamic to be done via lookup instead of OOP to guarantee working with process isolation, did refactoring to remove old implementation + cleaning INPUT_TYPES definition including v3 hidden definition

* Change order of functions

* Removed some unneeded functions after dynamic refactor

* Make MatchType's output default displayname "MATCHTYPE"

* Fix DynamicSlot get_all

* Removed redundant code - dynamic stuff no longer happens in OOP way

* Natively support AnyType (*) without __ne__ hacks

* Remove stray code that made it in

* Remove expand_schema_for_dynamic left over on DynamicInput class

* get_dynamic() on DynamicInput/Output was not doing anything anymore, so removed it

* Make validate_inputs validate combo input correctly

* Temporarily comment out conversion to 'new' (9 month old) COMBO format in get_input_info

* Remove references to resources feature scrapped from V3

* Expose DynamicCombo in public API

* satisfy ruff after some code got commented out

* Make missing input error prettier for dynamic types

* Created a Switch2 node as a side-by-side test, will likely go with Switch2 as the initial switch node

* Figured out Switch situation

* Pass in v3_data in IsChangedCache.get function's fingerprint_inputs, add a from_v3_data helper method to HiddenHolder

* Switch order of Switch and Soft Switch nodes in file

* Temp test node for MatchType

* Fix missing v3_data for v1 nodes in validation

* For now, remove checking duplicate id's for dynamic types

* Add Resize Image/Mask node that thanks to MatchType+DynamicCombo is 16-nodes-in-1

* Made DynamicCombo references in DCTestNode use public interface

* Add an AnyTypeTestNode

* Make lazy status for specific inputs on DynamicInputs work by having the values of the dictionary for check_lazy_status be a tuple, where the second element is the key of the input that can be returned

* Comment out test logic nodes

* Make primitive float's step make more sense

* Add (and leave commented out) some potential logic nodes

* Change default crop option to "center" on Resize Image/Mask node

* Changed copy.copy(d) to d.copy()

* Autogrow is available in stable  frontend, so exposing it in public API

* Use outputs id as display_name if no display_name present, remove v3 outputs id restriction that made them have to have unique IDs from the inputs

* Enable Custom Combo node as stable frontend now supports it

* Make id properly act like display_name on outputs

* Add Batch Images/Masks/Latents node

* Comment out Batch Images/Masks/Latents node for now, as Autogrow has a bug with MatchType where top connection is disconnected upon refresh

* Removed code for a couple test nodes in nodes_logic.py

* Add Batch Images, Batch Masks, and Batch Latents nodes with Autogrow, deprecate old Batch Images + LatentBatch nodes

* fix(api-nodes-vidu): preserve percent-encoding for signed URLs (Comfy-Org#11564)

* chore: update workflow templates to v0.7.65 (Comfy-Org#11579)

* Refactor: move clip_preprocess to comfy.clip_model (Comfy-Org#11586)

* Remove duplicate import of model_management (Comfy-Org#11587)

* New Year ruff cleanup. (Comfy-Org#11595)

* Ignore all frames except the first one for MPO format. (Comfy-Org#11569)

* Give Mahiro CFG a more appropriate display name (Comfy-Org#11580)

* Tripo3D: pass face_limit parameter only when it differs from default (Comfy-Org#11601)

* Remove leftover scaled_fp8 key. (Comfy-Org#11603)

* Print memory summary on OOM to help with debugging. (Comfy-Org#11613)

* feat(api-nodes): add support for 720p resolution for Kling Omni nodes (Comfy-Org#11604)

* Fix case where upscale model wouldn't be moved to cpu. (Comfy-Org#11633)

* Support the LTXV 2 model. (Comfy-Org#11632)

* Add LTXAVTextEncoderLoader node. (Comfy-Org#11634)

* Refactor module_size function. (Comfy-Org#11637)

* Fix name. (Comfy-Org#11638)

---------

Co-authored-by: ComfyUI Wiki <contact@comfyui-wiki.com>
Co-authored-by: comfyanonymous <121283862+comfyanonymous@users.noreply.github.com>
Co-authored-by: Alexander Piskun <13381981+bigcat88@users.noreply.github.com>
Co-authored-by: comfyanonymous <comfyanonymous@protonmail.com>
Co-authored-by: Comfy Org PR Bot <snomiao+comfy-pr@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dr.Lt.Data <128333288+ltdrdata@users.noreply.github.com>
Co-authored-by: rattus <46076784+rattus128@users.noreply.github.com>
Co-authored-by: Tavi Halperin <tavi@lightricks.com>
Co-authored-by: drozbay <17261091+drozbay@users.noreply.github.com>
Co-authored-by: ozbayb <17261091+ozbayb@users.noreply.github.com>
Co-authored-by: mengqin <mengqin@gmail.com>
Co-authored-by: Jedrzej Kosinski <kosinkadink1@gmail.com>
Co-authored-by: throttlekitty <throttlekitty@gmail.com>
@strint strint closed this Jan 9, 2026
@strint strint deleted the refine_offload branch January 9, 2026 09:12

4 participants