Disable dynamic_vram when using torch compiler #12612
comfyanonymous merged 3 commits into Comfy-Org:master
Conversation
When making a ModelPatcher from a UNet or checkpoint, attach a callable that can be invoked to replay the model construction. This can be used to deep-clone a ModelPatcher with respect to the actual model. Originally written by Kosinkadink (Comfy-Org@f4b99bc).
Add a clone argument that lets a caller clone a ModelPatcher while disabling dynamic behavior, demoting the clone to a regular ModelPatcher. This is useful for legacy features where dynamic_vram support is missing or TBD.
This is a bigger feature; disable it for the interim to preserve functionality.
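The construction-replay idea in these commits can be sketched in isolation. This is a hypothetical illustration, not ComfyUI's actual API: a factory records the callable and arguments used to build a patcher, and demotion replays them with dynamic behavior disabled.

```python
# Hypothetical sketch of caching a patcher's construction so it can be
# replayed later; names here are illustrative, not ComfyUI's real API.

class Patcher:
    def __init__(self, model, dynamic=True):
        self.model = model
        self.dynamic = dynamic
        self.cached_patcher_init = None  # (callable, args), set by the factory

def make_patcher(model, dynamic=True):
    """Build a patcher and record how it was built for later replay."""
    p = Patcher(model, dynamic=dynamic)
    p.cached_patcher_init = (make_patcher, (model,))
    return p

def demote(p):
    """Replay the cached construction with dynamic behavior disabled."""
    factory, args = p.cached_patcher_init
    return factory(*args, dynamic=False)

dyn = make_patcher("unet-weights")
regular = demote(dyn)
```

Because the replay rebuilds the object from scratch rather than copying live state, the demoted clone is a fully independent object that shares only what the factory itself shares.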
📝 Walkthrough
The PR adds infrastructure to optionally disable dynamic model patching behavior.
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
comfy/sd.py (1)
1599-1600: Consider renaming the local variable to avoid shadowing the class name.
`ModelPatcher` as a local variable shadows `comfy.model_patcher.ModelPatcher`, which can be confusing when reading the surrounding code. A name like `patcher_cls` would be clearer.

Proposed rename:

```diff
- ModelPatcher = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
- model_patcher = ModelPatcher(model, load_device=load_device, offload_device=model_management.unet_offload_device())
+ patcher_cls = comfy.model_patcher.ModelPatcher if disable_dynamic else comfy.model_patcher.CoreModelPatcher
+ model_patcher = patcher_cls(model, load_device=load_device, offload_device=model_management.unet_offload_device())
```

The same applies to lines 1735–1736 in `load_diffusion_model_state_dict`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@comfy/sd.py` around lines 1599 - 1600, Rename the local variable currently named ModelPatcher to avoid shadowing the class name: choose a clearer name like patcher_cls (or similar) where you select between comfy.model_patcher.ModelPatcher and comfy.model_patcher.CoreModelPatcher, then instantiate with patcher_cls(model, load_device=load_device, offload_device=model_management.unet_offload_device()) and keep the instance as model_patcher; apply the same rename in load_diffusion_model_state_dict where the same shadowing occurs (lines around the selection/instantiation of ModelPatcher/CoreModelPatcher) so class names are not overwritten by local variables.
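The shadowing the nitpick describes is easy to reproduce in isolation: a local variable named after a class hides that class for the rest of the scope. A minimal sketch with stand-in classes (not the real ComfyUI definitions):

```python
class ModelPatcher:
    """Stand-in for comfy.model_patcher.ModelPatcher."""

class CoreModelPatcher(ModelPatcher):
    """Stand-in for comfy.model_patcher.CoreModelPatcher."""

def load(disable_dynamic):
    # A distinct name like patcher_cls keeps ModelPatcher / CoreModelPatcher
    # referring to the classes instead of being shadowed by a local variable.
    patcher_cls = ModelPatcher if disable_dynamic else CoreModelPatcher
    return patcher_cls()

# disable_dynamic=True selects the plain patcher; otherwise the dynamic-capable one.
regular = load(True)
dynamic = load(False)
```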
```python
def clone(self, disable_dynamic=False):
    class_ = self.__class__
    model = self.model
    if self.is_dynamic() and disable_dynamic:
        class_ = ModelPatcher
        temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
        model = temp_model_patcher.model

    n = class_(model, self.load_device, self.offload_device, self.model_size(), weight_inplace_update=self.weight_inplace_update)
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Find all ModelPatcherDynamic instantiations
rg -n "ModelPatcherDynamic\(" --type=py -C3
```

🏁 Script executed:

```shell
#!/bin/bash
# Find all cached_patcher_init assignments
rg -n "cached_patcher_init\s*=" --type=py -C2
```

🏁 Script executed:

```shell
#!/bin/bash
# Check the ModelPatcherDynamic class definition and __new__ method
rg -n "class ModelPatcherDynamic" --type=py -A 20
```

🏁 Script executed:

```shell
#!/bin/bash
# Look for any direct instantiation patterns that might bypass sd.py loading
rg -n "ModelPatcher.*\(" --type=py | grep -E "(cached_patcher_init|__init__|__new__)" | head -20
```
Add defensive guard for cached_patcher_init in new disable_dynamic code path.
The new disable_dynamic parameter introduces a code path that accesses self.cached_patcher_init[0] at line 316 without verifying it's not None. While the current design ensures cached_patcher_init is set for all dynamically-loaded patchers, an explicit guard would prevent an opaque TypeError if this assumption is violated:
```diff
 def clone(self, disable_dynamic=False):
     class_ = self.__class__
     model = self.model
     if self.is_dynamic() and disable_dynamic:
+        if self.cached_patcher_init is None:
+            raise RuntimeError("Cannot demote dynamic ModelPatcher to non-dynamic: cached_patcher_init was not set during model loading.")
         class_ = ModelPatcher
         temp_model_patcher = self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
         model = temp_model_patcher.model
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@comfy/model_patcher.py` around lines 311 - 319, The new disable_dynamic
branch in ModelPatcher.clone accesses self.cached_patcher_init[0] without
validation; add a defensive guard before using it: check that
self.cached_patcher_init is not None and has the expected contents (e.g., is a
sequence with at least one element) and if not, raise a clear error
(RuntimeError/ValueError) indicating cached_patcher_init is missing for dynamic
patcher, otherwise proceed to construct temp_model_patcher =
self.cached_patcher_init[0](*self.cached_patcher_init[1], disable_dynamic=True)
and continue as before; update any error text to reference clone,
cached_patcher_init, and temp_model_patcher so the failure is explicit.
Implement ModelPatcher dynamic -> regular demotion on clone and use it to disable dynamic_vram for the torch compiler.
Stay tuned for dynamic_vram + torch compile! For the moment, though, this gets master into better shape by using the existing interoperability of dynamic and non-dynamic patchers to simply demote torch-compiled models to a regular ModelPatcher.
This borrows a trick from @Kosinkadink in the WIP worksplit multi-GPU branch, which caches the construction stage of a ModelPatcher; we can then replay it to reconstruct a non-dynamic ModelPatcher from a dynamic one. Note that demotion in this direction costs no resources, since dynamic model patchers are a no-op on load. (Going the other way would be bad: a regular ModelPatcher loads the model to RAM up front, so promotion would leave a redundant copy of the model in RAM.)
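The demotion-on-clone flow described above can be sketched roughly as follows (illustrative class names, not the actual ComfyUI implementation): a torch.compile wrapper clones the incoming patcher with `disable_dynamic=True`, so the compiler sees a regular, fully resident model instead of dynamically streamed weights.

```python
class RegularPatcher:
    def __init__(self, model):
        self.model = model
    def is_dynamic(self):
        return False
    def clone(self, disable_dynamic=False):
        return RegularPatcher(self.model)

class DynamicPatcher(RegularPatcher):
    def is_dynamic(self):
        return True
    def clone(self, disable_dynamic=False):
        if disable_dynamic:
            # Demote: rebuild as a regular patcher. Cheap in this direction,
            # because the dynamic patcher has not loaded weights up front.
            return RegularPatcher(self.model)
        return DynamicPatcher(self.model)

def torch_compile_node(patcher):
    """Demote before compiling so dynamic weight streaming can't break graphs."""
    return patcher.clone(disable_dynamic=True)

compiled = torch_compile_node(DynamicPatcher("flux1-dev"))
```

A regular patcher passed through the same wrapper is unaffected, which is what makes demotion a safe interim default for the compile path.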
Example test conditions:
RTX 5090, 96 GB RAM
TORCH_LOGS="recompiles,graph_breaks" TORCH_COMPILE_DEBUG=1 TORCHDYNAMO_VERBOSE=1
--fast dynamic_vram fp8_matrix_mult
Flux1-dev, 1024x1024, with the native torch compiler node.
Before:
Large numbers of graph breaks and guard failures; on the second run, the graph keeps breaking.
After: