1 change: 1 addition & 0 deletions comfy/cli_args.py
@@ -167,6 +167,7 @@ class PerformanceFeature(enum.Enum):

parser.add_argument("--mmap-torch-files", action="store_true", help="Use mmap when loading ckpt/pt files.")
parser.add_argument("--disable-mmap", action="store_true", help="Don't use mmap when loading safetensors.")
+parser.add_argument("--sft-alt-loader", action="store_true", help="Use alternate method to load safetensors.")
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "sft_alt_loader|sft-alt-loader|USE_ALT_SFT_LOADER" --type py -A2 -B2

Repository: Comfy-Org/ComfyUI

Length of output: 1160


Help text lacks clarity; sft abbreviation is inconsistent with neighboring flags.

The flag --sft-alt-loader with help text "Use alternate method to load safetensors." is too vague—users encountering Windows crashes with large models won't know this is what they need. Additionally, the abbreviated sft is inconsistent with neighboring flags like --disable-mmap and --mmap-torch-files, which use full words.

Renaming to --safetensors-alt-loader is feasible because this is a newly added flag with no existing users. If renamed, update the attribute reference in comfy/utils.py line 40 from args.sft_alt_loader to args.safetensors_alt_loader.
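The attribute rename follows mechanically from the flag rename: argparse derives `dest` from the long option by stripping the leading dashes and converting the remaining dashes to underscores. A minimal sketch of that behavior with the proposed flag (standalone illustration, not the ComfyUI parser):

```python
import argparse

# argparse turns "--safetensors-alt-loader" into the attribute
# "safetensors_alt_loader" automatically, which is why the reference in
# comfy/utils.py must be updated in the same change.
parser = argparse.ArgumentParser()
parser.add_argument("--safetensors-alt-loader", action="store_true",
                    help="Use an alternate (non-mmap) loader for safetensors files.")

args = parser.parse_args(["--safetensors-alt-loader"])
print(args.safetensors_alt_loader)  # → True
```

Passing no arguments leaves the attribute `False`, so existing workflows that never set the flag are unaffected.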

✏️ Suggested improvement
-parser.add_argument("--sft-alt-loader", action="store_true", help="Use alternate method to load safetensors.")
+parser.add_argument("--safetensors-alt-loader", action="store_true", help="Use an alternate (non-mmap) loader for safetensors files. Recommended on Windows when loading large models causes crashes or when --disable-mmap alone is insufficient.")

Then in comfy/utils.py line 40:

-USE_ALT_SFT_LOADER = args.sft_alt_loader
+USE_ALT_SFT_LOADER = args.safetensors_alt_loader
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@comfy/cli_args.py` at line 170, rename the CLI flag and its internal
attribute from sft to safetensors for clarity: change the
parser.add_argument("--sft-alt-loader", ...) to use "--safetensors-alt-loader"
and update any uses of args.sft_alt_loader to args.safetensors_alt_loader
(notably the reference in comfy/utils.py where args.sft_alt_loader is read).
Ensure the help text is expanded to something explicit like "Use alternate
loader for safetensors (fixes Windows crashes with large models)" and update any
tests or docs that reference the old flag name.


parser.add_argument("--dont-print-server", action="store_true", help="Don't print server output.")
parser.add_argument("--quick-test-for-ci", action="store_true", help="Quick test for CI.")
14 changes: 10 additions & 4 deletions comfy/utils.py
@@ -37,6 +37,7 @@

MMAP_TORCH_FILES = args.mmap_torch_files
DISABLE_MMAP = args.disable_mmap
+USE_ALT_SFT_LOADER = args.sft_alt_loader


if True: # ckpt/pt file whitelist for safe loading of old sd files
@@ -80,7 +81,7 @@ def encode(*args, **kwargs): # no longer necessary on newer torch
"U16": torch.uint16,
}

-def load_safetensors(ckpt):
+def load_safetensors(ckpt, device):
f = open(ckpt, "rb")
mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
mv = memoryview(mapping)
@@ -102,7 +103,12 @@ def load_safetensors(ckpt):
with warnings.catch_warnings():
#We are working with read-only RAM by design
warnings.filterwarnings("ignore", message="The given buffer is not writable")
-sd[name] = torch.frombuffer(mv[start:end], dtype=_TYPES[info["dtype"]]).view(info["shape"])
+tensor = torch.frombuffer(mv[start:end], dtype=_TYPES[info["dtype"]]).view(info["shape"])
+if DISABLE_MMAP:
+    tensor = tensor.to(device=device, copy=True)
+elif (device != 'cpu' if isinstance(device, str) else device.type != 'cpu'):
+    tensor = tensor.to(device)
+sd[name] = tensor

return sd, header.get("__metadata__", {}),
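For context, the mmap loader above reads the safetensors on-disk layout directly: an 8-byte little-endian header length, then a JSON header mapping tensor names to dtype/shape/data_offsets (offsets relative to the first byte after the header), then the raw tensor bytes. A torch-free sketch of the header parsing (`parse_safetensors_header` is a hypothetical helper, not the ComfyUI implementation):

```python
import json
import struct

def parse_safetensors_header(buf: bytes):
    # First 8 bytes: little-endian u64 giving the JSON header length.
    (header_len,) = struct.unpack_from("<Q", buf, 0)
    header = json.loads(buf[8:8 + header_len])
    # "__metadata__" sits alongside the tensor entries in the same JSON dict.
    metadata = header.pop("__metadata__", {})
    data_start = 8 + header_len  # tensor data_offsets are relative to here
    return header, metadata, data_start

# Build a tiny in-memory "file" with one fp32 tensor of shape [2].
payload = struct.pack("<2f", 1.0, 2.0)
header_json = json.dumps({
    "w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(payload)]},
    "__metadata__": {"note": "example"},
}).encode()
blob = struct.pack("<Q", len(header_json)) + header_json + payload

tensors, meta, start = parse_safetensors_header(blob)
print(tensors["w"]["shape"], meta, start)
```

`load_safetensors` does the same parse over an mmap'd `memoryview` and then wraps each `data_offsets` slice with `torch.frombuffer`, which is what makes it zero-copy.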

@@ -113,8 +119,8 @@ def load_torch_file(ckpt, safe_load=False, device=None, return_metadata=False):
metadata = None
if ckpt.lower().endswith(".safetensors") or ckpt.lower().endswith(".sft"):
try:
-if enables_dynamic_vram():
-    sd, metadata = load_safetensors(ckpt)
+if USE_ALT_SFT_LOADER or enables_dynamic_vram():
+    sd, metadata = load_safetensors(ckpt, device)
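The inline conditional in the new device branch (`device != 'cpu' if isinstance(device, str) else device.type != 'cpu'`) is easy to misread. One way to untangle it, sketched with a hypothetical `DeviceLike` stand-in for `torch.device` so the snippet runs without torch:

```python
# Hypothetical stand-in for torch.device, used only to keep this sketch
# self-contained; the real code receives torch.device objects or strings.
class DeviceLike:
    def __init__(self, type_: str):
        self.type = type_

def is_non_cpu(device) -> bool:
    """Normalize to a type string first, then test once."""
    dev_type = device if isinstance(device, str) else device.type
    # Strip any index suffix so "cuda:0" compares like "cuda".
    return dev_type.split(":")[0] != "cpu"

print(is_non_cpu("cpu"), is_non_cpu("cuda:0"), is_non_cpu(DeviceLike("cuda")))
# → False True True
```

Normalizing before testing also gives one place to handle edge cases such as `device=None`, which `load_torch_file`'s signature permits.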
Contributor:
This seems a little strange, in that load_safetensors is deliberately mmap-based. The main difference between load_safetensors and the sft package is the mmap open flags, so what you are doing here is just a different kind of mmap.

You need to be careful here: you are adding a path where the non-dynamic case uses this zero-copy safetensors loader, which is incompatible with the non-dynamic memory pinner.

Try a workflow with --disable-mmap --novram; I'm suspicious you will see "Pin error."
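A small standalone illustration of the incompatibility described above (an assumption-level sketch, not ComfyUI code): the loader maps the checkpoint with `ACCESS_READ`, so every tensor built via `torch.frombuffer` is backed by read-only, file-backed memory, whereas a memory pinner expects ordinary writable host allocations it can page-lock.

```python
import mmap
import os
import tempfile

# Create a small file and map it read-only, the way load_safetensors does.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\x00" * 16)
    path = f.name

with open(path, "rb") as f:
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

mv = memoryview(mapping)
readonly = mv.readonly   # this is the buffer torch.frombuffer would wrap
try:
    mapping[0] = 1       # writing to an ACCESS_READ mapping raises
    writable = True
except TypeError:
    writable = False

mv.release()
mapping.close()
os.unlink(path)
print(readonly, writable)  # → True False
```

This read-only property is also why the loader suppresses torch's "The given buffer is not writable" warning in the diff above.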

Author:
Yes, "Pin error." messages are shown, but unfortunately ComfyUI then reports that the default CPU allocator cannot allocate memory after I add `tensor.to(device=device, copy=True)`. Here is a process-manager comparison.

No copy tensor:
(screenshot: 2026-2-25 13-54-7)

Copy tensor:
(screenshot: 2026-2-25 13-52-4)

Author:
Log (copy tensor):

got prompt

E:\AI\ComfyUI>

Log (no copy tensor):

got prompt
Found quantization metadata version 1
Using MixedPrecisionOps for text encoder
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load LTXAVTEModel_

Pin error. repeat N times

loaded partially; 8902.00 MB usable, 1649.94 MB loaded, 10073.95 MB offloaded, 7252.06 MB buffer reserved, lowvram patches: 0
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.bfloat16, manual cast: torch.bfloat16
model_type FLUX
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load LTXAV

Pin error. repeat N times

loaded partially; 9551.67 MB usable, 9463.08 MB loaded, 12428.51 MB offloaded, 112.04 MB buffer reserved, lowvram patches: 0
100%|████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:33<00:00,  4.18s/it]
Unloaded partially: 976.66 MB freed, 8486.42 MB remains loaded, 112.04 MB buffer reserved, lowvram patches: 0
0 models unloaded.
Unloaded partially: 1508.48 MB freed, 6977.94 MB remains loaded, 368.54 MB buffer reserved, lowvram patches: 0
100%|████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:22<00:00,  7.44s/it]
Requested to load AudioVAE
loaded completely; 2571.74 MB usable, 415.20 MB loaded, full load: True
Requested to load VideoVAE
0 models unloaded.
loaded partially; 0.00 MB usable, 0.00 MB loaded, 2331.69 MB offloaded, 648.02 MB buffer reserved, lowvram patches: 0
Prompt executed in 95.03 seconds

if not return_metadata:
metadata = None
else: