
ops: limit return of requants (Fix perf of some fp8 dynamic_vram workflows) #12506

Merged
comfyanonymous merged 1 commit into Comfy-Org:master from rattus128:prs/dynamic-vram-fixes/dont-requant
Feb 17, 2026

Conversation

@rattus128
Contributor

This check was far too broad: the dtype is not a reliable indicator of whether the caller wants the requant, since QT reports the compute dtype as its dtype. So explicitly plumb through whether fp8mm wants the requant or not.

Example Test Conditions:

Windows, RTX5060, 32GB RAM, --fast dynamic_vram
WAN2.1 fp8 scaled 14B + Lora

(attached workflow image: lion-burger)

Before:

Requested to load WAN21
0 models unloaded.
Model WAN21 prepared for dynamic VRAM loading. 13630MB Staged. 1053 patches attached.
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [03:38<00:00, 54.52s/it]
Requested to load WanVAE
Model WanVAE prepared for dynamic VRAM loading. 242MB Staged. 0 patches attached.
Prompt executed in 250.19 seconds

After:

Requested to load WAN21
0 models unloaded.
Model WAN21 prepared for dynamic VRAM loading. 13630MB Staged. 1053 patches attached.
100%|████████████████████████████████████████████████████████████████████████████████████| 4/4 [03:16<00:00, 49.10s/it]
Requested to load WanVAE
Model WanVAE prepared for dynamic VRAM loading. 242MB Staged. 0 patches attached.
Prompt executed in 229.95 seconds

comfyanonymous merged commit 58dcc97 into Comfy-Org:master on Feb 17, 2026
12 checks passed