Disable dynamic_vram when using torch compiler#12612

Merged
comfyanonymous merged 3 commits intoComfy-Org:masterfrom
rattus128:prs/dynamic-vram-fixes/disable-compile
Feb 25, 2026
Conversation

rattus128 (Contributor) commented Feb 24, 2026

Implement ModelPatcher dynamic -> regular demotion on clone and use it to disable dynamic_vram for the torch compiler.

Stay tuned for dynamic_vram + torch compile! For the moment, this gets master into better shape by using the existing interoperability of dynamic and non-dynamic patchers to simply demote torch-compiled models to a regular ModelPatcher.

This borrows a trick from @Kosinkadink's WIP worksplit multi-GPU branch: the construction stage of a ModelPatcher is cached so it can be replayed later to reconstruct a non-dynamic ModelPatcher from a dynamic one. Demoting in this direction costs no extra resources, since dynamic model patches are a no-op on load. (Going the other way would be bad: the regular ModelPatcher's upfront load to RAM would leave a redundant copy of the model in RAM.)
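The replay trick above can be sketched as follows. This is a minimal illustration with hypothetical names (Patcher, load_model, demote) that are not the actual ComfyUI API: the loader caches the factory callable and its arguments on the patcher, and demotion replays them with dynamic behavior disabled.

```python
class Patcher:
    def __init__(self, model, dynamic=True):
        self.model = model
        self.dynamic = dynamic
        self.cached_patcher_init = None  # (factory, args), set by the loader

def load_model(model, dynamic=True):
    patcher = Patcher(model, dynamic=dynamic)
    # Remember how this patcher was built so construction can be replayed.
    patcher.cached_patcher_init = (load_model, (model,))
    return patcher

def demote(patcher):
    # Replay the original construction, this time without dynamic behavior.
    factory, args = patcher.cached_patcher_init
    return factory(*args, dynamic=False)

dynamic_patcher = load_model({"weights": "..."})
regular_patcher = demote(dynamic_patcher)
# regular_patcher.dynamic is False; the original patcher is untouched
```

Because the dynamic patcher never loaded weights upfront, replaying construction here creates no redundant copy of the model.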

Example Test Conditions:

RTX5090, 96GB RAM
TORCH_LOGS="recompiles,graph_breaks" TORCH_COMPILE_DEBUG=1 TORCHDYNAMO_VERBOSE=1
--fast dynamic_vram fp8_matrix_mult
Flux1-dev 1024x1024 w/ native torch compiler node.


Before:

Large numbers of graph breaks and guard failures:

...
0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks] Graph break in user code at /home/rattus/venv/lib/python3.12/site-packages/comfy_aimdo/model_vbar.py:71
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks] Graph Break Reason: Unsupported method call
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks]   Explanation: Dynamo does not know how to trace method `__mul__` of class `PyCSimpleType`
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks]   Hint: Avoid calling `PyCSimpleType.__mul__` in your code.
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks]   Hint: Please report an issue to PyTorch.
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks] 
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks]   Developer debug context: call_method UserDefinedClassVariable(<class 'ctypes.c_uint'>) __mul__ [ConstantVariable(int: 3)] {}
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks] 
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/0] [__graph_breaks]  For more details about this graph break, please visit: https://meta-pytorch.github.io/compile-graph-break-site/gb/gb0156.html
V0225 01:24:18.487000 93128 /disk/sync/venv/lib/python3.12/site-packages/torch/_dynamo/symbolic_convert.py:611] [50/
...
100%|██████████| 6/6 [00:02<00:00,  2.02it/s]
Requested to load AutoencodingEngine
Model AutoencodingEngine prepared for dynamic VRAM loading. 159MB Staged. 0 patches attached.
Prompt executed in 36.12 seconds

Second run: the graph keeps breaking.

100%|██████████| 6/6 [00:02<00:00,  2.98it/s]
Model AutoencodingEngine prepared for dynamic VRAM loading. 159MB Staged. 0 patches attached.
Prompt executed in 2.46 seconds

After:

Model FluxClipModel_ prepared for dynamic VRAM loading. 9319MB Staged. 0 patches attached.
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLUX
Using pytorch attention in VAE
Using pytorch attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
model_type FLUX
Requested to load Flux
loaded completely; 20521.69 MB usable, 11350.07 MB loaded, full load: True
  0%|          | 0/6 [00:00<?, ?it/s,   Model Initializing ...  ]W0225 01:37:18.198000 96663 /disk/sync/venv/lib/python3.12/site-packages/torch/_inductor/debug.py:507] [0/0] model__0_inference_0 debug trace: /home/rattus/torch_compile_debug/run_2026_02_25_01_35_57_568404-pid_96663/torchinductor/model__0_inference_0.0
100%|██████████| 6/6 [00:01<00:00,  4.48it/s]                                   
Requested to load AutoencodingEngine
Model AutoencodingEngine prepared for dynamic VRAM loading. 159MB Staged. 0 patches attached.
Prompt executed in 95.93 seconds
got prompt
100%|██████████| 6/6 [00:01<00:00,  5.16it/s]                                   
Model AutoencodingEngine prepared for dynamic VRAM loading. 159MB Staged. 0 patches attached.
Prompt executed in 1.94 seconds

When making a ModelPatcher from a unet or ckpt, attach a callable that
can be invoked to replay the model construction. This can be used to
deep-clone a ModelPatcher with respect to the actual model.

Originally written by Kosinkadink
Comfy-Org@f4b99bc

Add a clone argument that lets a caller clone a ModelPatcher with
dynamic disabled, demoting the clone to a regular ModelPatcher. This is
useful for legacy features where dynamic_vram support is missing or TBD.

dynamic_vram + torch compile is a bigger feature. Disable dynamic_vram
for torch compile in the interim to preserve functionality.
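The interim behavior described above can be sketched as follows. The clone(disable_dynamic=...) shape follows the diff in this PR, but the classes here are simplified stand-ins for the real ModelPatcher/CoreModelPatcher, and torch_compile_prepare is a hypothetical name for what the compile node does.

```python
class ModelPatcher:
    def __init__(self, model):
        self.model = model

    def is_dynamic(self):
        return False

    def clone(self, disable_dynamic=False):
        return self.__class__(self.model)

class CoreModelPatcher(ModelPatcher):
    def is_dynamic(self):
        return True

    def clone(self, disable_dynamic=False):
        if self.is_dynamic() and disable_dynamic:
            return ModelPatcher(self.model)  # demote to the plain patcher
        return self.__class__(self.model)

def torch_compile_prepare(patcher):
    # The compile node asks for a non-dynamic clone before compiling, so
    # the dynamic VRAM hooks never appear inside the traced graph.
    return patcher.clone(disable_dynamic=True)

compiled = torch_compile_prepare(CoreModelPatcher(object()))
# compiled.is_dynamic() returns False
```

This keeps the two patcher types interoperable: callers that never pass disable_dynamic=True see no behavior change.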
coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

The PR adds infrastructure to optionally disable dynamic model patching behavior. It introduces a cached_patcher_init attribute to ModelPatcher to cache initialization state for reuse. The clone method gains a disable_dynamic parameter to control whether cloned instances use dynamic or static patcher implementations. The checkpoint and model loading pipeline is extended with disable_dynamic parameters that propagate through various loading functions, allowing selection between ModelPatcher and CoreModelPatcher implementations. A new model-only loader function is introduced. The torch compile node is updated to use non-dynamic cloning.
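The parameter propagation the walkthrough describes can be illustrated with a short sketch. The function and class names below are simplified stand-ins for the real loaders in comfy/sd.py, not the actual code: the flag threads through the pipeline and selects which patcher class to instantiate.

```python
class ModelPatcher:
    def __init__(self, model):
        self.model = model

class CoreModelPatcher(ModelPatcher):
    pass

def load_model_state_dict(sd, disable_dynamic=False):
    # Select the implementation; the dynamic class is the default path.
    patcher_cls = ModelPatcher if disable_dynamic else CoreModelPatcher
    return patcher_cls(model=sd)

def load_model_from_file(path, disable_dynamic=False):
    sd = {"path": path}  # stand-in for actually reading the weights
    return load_model_state_dict(sd, disable_dynamic=disable_dynamic)

static_patcher = load_model_from_file("model.safetensors", disable_dynamic=True)
dynamic_patcher = load_model_from_file("model.safetensors")
```

Defaulting disable_dynamic to False keeps every existing call site on the dynamic path without modification.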

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

- Docstring Coverage ⚠️ Warning: Docstring coverage is 9.09%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

- Title check ✅ Passed: The title accurately reflects the main change: disabling dynamic_vram when using the torch compiler through ModelPatcher demotion on clone.
- Description check ✅ Passed: The description is highly detailed and directly related to the changeset, explaining the mechanism, rationale, and performance testing results.


coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
comfy/sd.py (1)

1599-1600: Consider renaming the local variable to avoid shadowing the class name.

ModelPatcher as a local variable shadows comfy.model_patcher.ModelPatcher, which can be confusing when reading the surrounding code. A name like patcher_cls would be clearer.

Proposed rename
-        ModelPatcher = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
-        model_patcher = ModelPatcher(model, load_device=load_device, offload_device=model_management.unet_offload_device())
+        patcher_cls = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
+        model_patcher = patcher_cls(model, load_device=load_device, offload_device=model_management.unet_offload_device())

Same applies to lines 1735–1736 in load_diffusion_model_state_dict.


ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 048dd2f and ea6de0c.

📒 Files selected for processing (3)
  • comfy/model_patcher.py
  • comfy/sd.py
  • comfy_extras/nodes_torch_compile.py

Comment on lines +311 to +319

    def clone(self, disable_dynamic=False):
        class_ = self.__class__
        model = self.model
        if self.is_dynamic() and disable_dynamic:
            class_ = ModelPatcher
            temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
            model = temp_model_patcher.model

        n = class_(model, self.load_device, self.offload_device, self.model_size(), weight_inplace_update=self.weight_inplace_update)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all ModelPatcherDynamic instantiation
rg -n "ModelPatcherDynamic\(" --type=py -C3

Repository: Comfy-Org/ComfyUI

Length of output: 571


🏁 Script executed:

#!/bin/bash
# Find all cached_patcher_init assignments
rg -n "cached_patcher_init\s*=" --type=py -C2

Repository: Comfy-Org/ComfyUI

Length of output: 1175


🏁 Script executed:

#!/bin/bash
# Check the ModelPatcherDynamic class definition and __new__ method
rg -n "class ModelPatcherDynamic" --type=py -A 20

Repository: Comfy-Org/ComfyUI

Length of output: 1683


🏁 Script executed:

#!/bin/bash
# Look for any direct instantiation patterns that might bypass sd.py loading
rg -n "ModelPatcher.*\(" --type=py | grep -E "(cached_patcher_init|__init__|__new__)" | head -20

Repository: Comfy-Org/ComfyUI

Length of output: 43


Add defensive guard for cached_patcher_init in new disable_dynamic code path.

The new disable_dynamic parameter introduces a code path that accesses self.cached_patcher_init[0] at line 316 without verifying it's not None. While the current design ensures cached_patcher_init is set for all dynamically-loaded patchers, an explicit guard would prevent an opaque TypeError if this assumption is violated:

    def clone(self, disable_dynamic=False):
        class_ = self.__class__
        model = self.model
        if self.is_dynamic() and disable_dynamic:
+           if self.cached_patcher_init is None:
+               raise RuntimeError("Cannot demote dynamic ModelPatcher to non-dynamic: cached_patcher_init was not set during model loading.")
            class_ = ModelPatcher
            temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
            model = temp_model_patcher.model

@comfyanonymous comfyanonymous merged commit 3ebe1ac into Comfy-Org:master Feb 25, 2026
13 checks passed