Disable dynamic_vram when using torch compiler #12612
comfyanonymous merged 3 commits into Comfy-Org:master
Conversation
When making a ModelPatcher from a UNet or checkpoint, attach a callable that can be invoked to replay the model construction. This can be used to deep-clone a ModelPatcher with respect to the actual model. Originally written by Kosinkadink (Comfy-Org@f4b99bc).
Add a clone argument that lets a caller clone a ModelPatcher while disabling dynamic behavior, demoting the clone to a regular ModelPatcher. This is useful for legacy features where dynamic_vram support is missing or TBD.
This is a bigger feature; disable it for the interim to preserve functionality.
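The construction-replay idea in these commits can be sketched in isolation. This is a hypothetical illustration, not ComfyUI's actual API: a factory records the callable and arguments used to build a patcher, and demotion replays them with dynamic behavior disabled.

```python
# Hypothetical sketch of caching a patcher's construction so it can be
# replayed later; names here are illustrative, not ComfyUI's real API.

class Patcher:
    def __init__(self, model, dynamic=True):
        self.model = model
        self.dynamic = dynamic
        self.cached_patcher_init = None  # (callable, args), set by the factory

def make_patcher(model, dynamic=True):
    """Build a patcher and record how it was built for later replay."""
    p = Patcher(model, dynamic=dynamic)
    p.cached_patcher_init = (make_patcher, (model,))
    return p

def demote(p):
    """Replay the cached construction with dynamic behavior disabled."""
    factory, args = p.cached_patcher_init
    return factory(*args, dynamic=False)

dyn = make_patcher("unet-weights")
regular = demote(dyn)
```

Because the replay rebuilds the object from scratch rather than copying live state, the demoted clone is a fully independent object that shares only what the factory itself shares.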
📝 Walkthrough
The PR adds infrastructure to optionally disable dynamic model patching behavior.
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
comfy/sd.py (1)
1599-1600: Consider renaming the local variable to avoid shadowing the class name.
`ModelPatcher` as a local variable shadows `comfy.model_patcher.ModelPatcher`, which can be confusing when reading the surrounding code. A name like `patcher_cls` would be clearer.

Proposed rename:

```diff
- ModelPatcher = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
- model_patcher = ModelPatcher(model, load_device=load_device, offload_device=model_management.unet_offload_device())
+ patcher_cls = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
+ model_patcher = patcher_cls(model, load_device=load_device, offload_device=model_management.unet_offload_device())
```

The same applies to lines 1735–1736 in `load_diffusion_model_state_dict`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@comfy/sd.py` around lines 1599 - 1600, Rename the local variable currently named ModelPatcher to avoid shadowing the class name: choose a clearer name like patcher_cls (or similar) where you select between comfy.model_patcher.ModelPatcher and comfy.model_patcher.CoreModelPatcher, then instantiate with patcher_cls(model, load_device=load_device, offload_device=model_management.unet_offload_device()) and keep the instance as model_patcher; apply the same rename in load_diffusion_model_state_dict where the same shadowing occurs (lines around the selection/instantiation of ModelPatcher/CoreModelPatcher) so class names are not overwritten by local variables.
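The shadowing the nitpick describes is easy to reproduce in isolation: a local variable named after a class hides that class for the rest of the scope. A minimal sketch with stand-in classes (not the real ComfyUI definitions):

```python
class ModelPatcher:
    """Stand-in for comfy.model_patcher.ModelPatcher."""

class CoreModelPatcher(ModelPatcher):
    """Stand-in for comfy.model_patcher.CoreModelPatcher."""

def load(disable_dynamic):
    # A distinct name like patcher_cls keeps ModelPatcher / CoreModelPatcher
    # referring to the classes instead of being shadowed by a local variable.
    patcher_cls = ModelPatcher if disable_dynamic else CoreModelPatcher
    return patcher_cls()

# disable_dynamic=True selects the plain patcher; otherwise the dynamic-capable one.
regular = load(True)
dynamic = load(False)
```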
```python
def clone(self, disable_dynamic=False):
    class_ = self.__class__
    model = self.model
    if self.is_dynamic() and disable_dynamic:
        class_ = ModelPatcher
        temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
        model = temp_model_patcher.model

    n = class_(model, self.load_device, self.offload_device, self.model_size(), weight_inplace_update=self.weight_inplace_update)
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Find all ModelPatcherDynamic instantiations
rg -n "ModelPatcherDynamic\(" --type=py -C3
```

🏁 Script executed:

```shell
#!/bin/bash
# Find all cached_patcher_init assignments
rg -n "cached_patcher_init\s*=" --type=py -C2
```

🏁 Script executed:

```shell
#!/bin/bash
# Check the ModelPatcherDynamic class definition and __new__ method
rg -n "class ModelPatcherDynamic" --type=py -A 20
```

🏁 Script executed:

```shell
#!/bin/bash
# Look for any direct instantiation patterns that might bypass sd.py loading
rg -n "ModelPatcher.*\(" --type=py | grep -E "(cached_patcher_init|__init__|__new__)" | head -20
```
Add defensive guard for cached_patcher_init in new disable_dynamic code path.
The new disable_dynamic parameter introduces a code path that accesses self.cached_patcher_init[0] at line 316 without verifying it's not None. While the current design ensures cached_patcher_init is set for all dynamically-loaded patchers, an explicit guard would prevent an opaque TypeError if this assumption is violated:
```diff
 def clone(self, disable_dynamic=False):
     class_ = self.__class__
     model = self.model
     if self.is_dynamic() and disable_dynamic:
+        if self.cached_patcher_init is None:
+            raise RuntimeError("Cannot demote dynamic ModelPatcher to non-dynamic: cached_patcher_init was not set during model loading.")
         class_ = ModelPatcher
         temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
         model = temp_model_patcher.model
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@comfy/model_patcher.py` around lines 311 - 319, The new disable_dynamic
branch in ModelPatcher.clone accesses self.cached_patcher_init[0] without
validation; add a defensive guard before using it: check that
self.cached_patcher_init is not None and has the expected contents (e.g., is a
sequence with at least one element) and if not, raise a clear error
(RuntimeError/ValueError) indicating cached_patcher_init is missing for dynamic
patcher, otherwise proceed to construct temp_model_patcher =
self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
and continue as before; update any error text to reference clone,
cached_patcher_init, and temp_model_patcher so the failure is explicit.
Implement ModelPatcher dynamic -> regular demotion on clone and use it to disable dynamic_vram for the torch compiler.
Stay tuned for dynamic_vram + torch compile! For the moment, though, this gets master into better shape by using the existing interoperability of dynamic and non-dynamic patchers to simply demote torch-compiled models to a regular ModelPatcher.
This borrows a trick from @Kosinkadink in the WIP worksplit multi-GPU branch, which caches the construction stage of a ModelPatcher; we can then replay it to reconstruct a non-dynamic ModelPatcher from a dynamic one. Note that demotion in this direction costs no resources, since dynamic model patchers are a no-op on load. (Going the other way would be bad: a regular ModelPatcher loads the model to RAM up front, so promotion would leave a redundant copy of the model in RAM.)
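The demotion-on-clone flow described above can be sketched roughly as follows (illustrative class names, not the actual ComfyUI implementation): a torch.compile wrapper clones the incoming patcher with `disable_dynamic=True`, so the compiler sees a regular, fully resident model instead of dynamically streamed weights.

```python
class RegularPatcher:
    def __init__(self, model):
        self.model = model
    def is_dynamic(self):
        return False
    def clone(self, disable_dynamic=False):
        return RegularPatcher(self.model)

class DynamicPatcher(RegularPatcher):
    def is_dynamic(self):
        return True
    def clone(self, disable_dynamic=False):
        if disable_dynamic:
            # Demote: rebuild as a regular patcher. Cheap in this direction,
            # because the dynamic patcher has not loaded weights up front.
            return RegularPatcher(self.model)
        return DynamicPatcher(self.model)

def torch_compile_node(patcher):
    """Demote before compiling so dynamic weight streaming can't break graphs."""
    return patcher.clone(disable_dynamic=True)

compiled = torch_compile_node(DynamicPatcher("flux1-dev"))
```

A regular patcher passed through the same wrapper is unaffected, which is what makes demotion a safe interim default for the compile path.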
Example test conditions:
RTX 5090, 96 GB RAM
TORCH_LOGS="recompiles,graph_breaks" TORCH_COMPILE_DEBUG=1 TORCHDYNAMO_VERBOSE=1
--fast dynamic_vram fp8_matrix_mult
Flux1-dev, 1024x1024, with the native torch compiler node.
Before:
Large numbers of graph breaks and guard failures; on the second run, the graph keeps breaking.
After: