Way to skip self.model.to(device_to) inside ModelPatcher.unpatch_model

### Feature Idea

`ModelPatcher.unpatch_model(unpatch_weights=True)` ends its weight-unpatch block with `self.model.to(device_to)` (master `comfy/model_patcher.py` around the `current_weight_patches_uuid = None` / `backup.clear()` lines). That call goes through `nn.Module.to`, which visits every parameter in the model regardless of whether a custom layer sets `comfy_cast_weights = True`.

For custom ops that intentionally keep weights on `offload_device` and only cast on the fly through `comfy_cast_weights`, the walk hits parameters that are not safe to `.to(...)` (mmap-backed, lazily quantized, etc.) and faults. ComfyUI-GGUF hits this with Windows `EXCEPTION_ACCESS_VIOLATION` on mmap-backed quantized tensors (city96/ComfyUI-GGUF#444).

The only existing escape hatch is `unpatch_weights=False`, but that skips the entire weight-unpatch block, which also runs `unpatch_hooks`, `unpin_all_weights`, lowvram cleanup, non-quantized backup restoration, `model_loaded_weight_memory = 0`, `model_offload_buffer_memory = 0`, and `comfy_patched_weights` deletion. Custom patchers that want to keep all of that but skip the device walk end up mirroring the whole block by hand. city96/ComfyUI-GGUF#445 is the latest iteration of that workaround there.

Two shapes that would work:

1. Add `unpatch_device: bool = True` to `unpatch_model`, gating just the `self.model.to(device_to)` line. Defaults preserve behavior.
2. Replace `self.model.to(device_to)` with a per-module walk that skips modules with `comfy_cast_weights = True` (their weights are managed by the op layer, not by `ModelPatcher`).

Either would let custom patchers drop their `unpatch_model` overrides entirely.
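A rough sketch of both shapes, using pure-Python stand-ins rather than the real classes. `FakeLayer`, `FakeModel`, `SketchPatcher`, and `move_unmanaged` are hypothetical names for illustration only; the `device_to is not None` guard and the elided cleanup steps are assumptions about the surrounding code, not a copy of `comfy/model_patcher.py`.

```python
class FakeLayer:
    """Hypothetical stand-in for a leaf nn.Module; records every .to() call."""

    def __init__(self, name, comfy_cast_weights=False):
        self.name = name
        self.comfy_cast_weights = comfy_cast_weights
        self.moves = []

    def to(self, device):
        self.moves.append(device)
        return self


class FakeModel:
    """Hypothetical stand-in for the wrapped model."""

    def __init__(self, layers):
        self.layers = layers

    def modules(self):
        return list(self.layers)

    def to(self, device):
        # Mirrors the current behavior: every layer is moved, flag or not.
        for layer in self.layers:
            layer.to(device)
        return self


def move_unmanaged(model, device_to):
    """Option 2: skip layers whose weights are managed by the op layer."""
    for module in model.modules():
        if getattr(module, "comfy_cast_weights", False):
            continue
        module.to(device_to)


class SketchPatcher:
    """Option 1: gate only the device walk behind a new keyword."""

    def __init__(self, model):
        self.model = model
        self.backup = {"layer.weight": "original"}
        self.current_weight_patches_uuid = "some-uuid"

    def unpatch_model(self, device_to=None, unpatch_weights=True,
                      unpatch_device=True):
        if unpatch_weights:
            # Hooks, pinning, lowvram cleanup, etc. are elided here.
            self.backup.clear()
            self.current_weight_patches_uuid = None
            # Assumed guard; only this line is affected by the new flag.
            if device_to is not None and unpatch_device:
                self.model.to(device_to)


plain = FakeLayer("linear")
quantized = FakeLayer("gguf_linear", comfy_cast_weights=True)
patcher = SketchPatcher(FakeModel([plain, quantized]))

# Option 1: all cleanup runs, but no layer is touched.
patcher.unpatch_model("cpu", unpatch_device=False)

# Option 2: only the layer without comfy_cast_weights is moved.
move_unmanaged(patcher.model, "cpu")
```

With a real `nn.Module` tree, calling `.to` on a container recurses into its children, so an actual implementation of option 2 would probably need to move only the parameters and buffers each module owns directly rather than call `module.to(...)` as the stand-in does.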

### Existing Solutions

- ComfyUI-GGUF currently maintains a multi-iteration downstream workaround in `nodes.py:GGUFModelPatcher.unpatch_model` (see history at `6dbb4ba` -> `717a0e1` -> #355 -> #357 -> #445). The maintainer flagged this as needing an upstream fix in `717a0e1`'s commit message in September 2024.
- Other quantization custom nodes (e.g. bitsandbytes-based loaders) avoid the issue because their weights survive `.to`, but they would still benefit from the cleaner contract.

### Other

The `nn.Module.to` walk is also redundant on the `partially_load` call path (`comfy/model_patcher.py` `partially_load` calls `unpatch_model(self.offload_device, ...)` and then immediately re-loads), so option (2) would also avoid wasted work on that hot path. The `detach` path is the only caller that genuinely needs the weights on `offload_device` after unpatch.

Way to skip self.model.to(device_to) inside ModelPatcher.unpatch_model #14142
