Fix safetensor cause python crash on some windows platform #12611

roj234 wants to merge 3 commits into Comfy-Org:master
Conversation
📝 Walkthrough: Added CLI flag.
🚥 Pre-merge checks: ✅ 4 passed, ❌ 1 failed (warning).
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
comfy/utils.py (1)
116-123: ⚠️ Potential issue | 🟠 Major

`DISABLE_MMAP` still uses mmap internally and ignores the `device` parameter.

When `DISABLE_MMAP` is set, `load_safetensors()` is called, which explicitly uses `mmap.mmap()` to read the file and creates CPU tensors via `torch.frombuffer()`. This means the flag contradicts its intent (it doesn't actually disable mmap) and also bypasses the `device` parameter entirely. The else branch respects `device=device.type`, so tensors loaded with `DISABLE_MMAP` stay on CPU and require an extra move to reach the target device. Consider either implementing a true non-mmap path for `DISABLE_MMAP` that respects device placement, or documenting that this flag has different semantics than its name suggests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@comfy/utils.py` around lines 116 - 123, DISABLE_MMAP currently calls load_safetensors (which uses mmap and creates CPU tensors) and ignores the requested device; update the DISABLE_MMAP branch so it either (A) implements a true non-mmap read path that loads tensors directly onto the requested device (respecting the device parameter) or (B) clearly documents the differing semantics and moves tensors to device after load — modify the DISABLE_MMAP branch around load_safetensors(ckpt) in comfy/utils.py to avoid mmap-backed reads and to honor device (or add an explicit torch.tensor(..., device=device) placement step) and ensure return_metadata handling remains correct when return_metadata is False; reference DISABLE_MMAP, load_safetensors, safetensors.safe_open, device, and return_metadata when making the change.
This problem persists in safetensors 0.7.0. By the way, to describe the 'crash' behavior: the Python process terminates abruptly without a Python-level backtrace. It triggers the Windows WerFault.exe handler and often generates a python.exe.dmp file.
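Since a native crash like this bypasses Python's exception machinery, one general debugging aid (not something this PR uses) is the standard-library `faulthandler` module, which can sometimes print a Python-level stack before WerFault takes over:

```python
import faulthandler
import sys

# Enable the fault handler early, before loading the model. On a fatal
# native error (e.g. an access violation inside an extension module),
# Python will try to dump each thread's traceback to stderr before the
# process dies, which is often the only Python-level clue you get.
faulthandler.enable(file=sys.stderr, all_threads=True)

# ...then run the code that crashes, e.g. (hypothetical call):
# sd = comfy.utils.load_torch_file("ltx-2-19b-distilled-fp8.safetensors")
```

This only helps when the faulting thread can still run the handler; for hard crashes the python.exe.dmp file opened in a debugger remains the authoritative source.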
⚠️ Outside diff range comments (1)
comfy/utils.py (1)
99-107: ⚠️ Potential issue | 🟠 Major

Empty tensors are never moved to the target device, producing a device mismatch.

The `start == end` branch (lines 99-100) creates a CPU tensor unconditionally and has no corresponding `.to(device)`. When `device` is non-CPU (e.g., `cuda`), non-empty tensors are moved to the device while empty tensors stay on CPU, producing a mixed-device state dict that will fault at inference time.

Additionally, the `sd[name].to(device)` call (line 107) is nested inside `warnings.catch_warnings()`, whose sole purpose is suppressing the `torch.frombuffer` "not writable" warning. The `.to()` call doesn't trigger that warning and can be moved outside.

🐛 Proposed fix

```diff
     if start == end:
-        sd[name] = torch.empty(info["shape"], dtype=_TYPES[info["dtype"]])
+        t = torch.empty(info["shape"], dtype=_TYPES[info["dtype"]])
     else:
         with warnings.catch_warnings():
             # We are working with read-only RAM by design
             warnings.filterwarnings("ignore", message="The given buffer is not writable")
-            sd[name] = torch.frombuffer(mv[start:end], dtype=_TYPES[info["dtype"]]).view(info["shape"])
-            if (device != 'cpu' if isinstance(device, str) else device.type != 'cpu'):
-                sd[name] = sd[name].to(device)
+            t = torch.frombuffer(mv[start:end], dtype=_TYPES[info["dtype"]]).view(info["shape"])
+    if (device != 'cpu' if isinstance(device, str) else device.type != 'cpu'):
+        t = t.to(device)
+    sd[name] = t
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@comfy/utils.py` around lines 99 - 107, The empty-tensor branch creates CPU tensors with torch.empty and never moves them to the target device, and the sd[name].to(device) call is unnecessarily inside the warnings.catch_warnings block; update the code so that after creating sd[name] in both branches (the torch.empty(...) path and the torch.frombuffer(...) path) you apply the same device move using the existing device-check conditional ((device != 'cpu' if isinstance(device, str) else device.type != 'cpu')) and move the sd[name] = sd[name].to(device) call out of the warnings.catch_warnings block so both empty and non-empty tensors end up on the correct device.
🧹 Nitpick comments (1)
comfy/utils.py (1)
85-85: `load_safetensors` uses Python's `mmap.mmap` internally and may not solve all Windows crash scenarios.

The condition change (line 119) bypasses `safetensors.safe_open`'s Rust mmap, but `load_safetensors` itself calls `mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)` (line 85) using Python's own mmap module. For the CPU device path (the default), returned tensors remain mmap-backed via this Python mmap and are not cloned to regular memory.

The fix works for the specific Windows crash reported, and the TODO comment already flags the approach's uncertainty. However, Python's mmap on Windows also lacks structured exception handling for certain storage configurations and may crash on unusual hardware setups.

For guaranteed mmap-free loading on fragile systems (per issue #10896), the CPU device branch would need `sd[name].clone()` after `torch.frombuffer` to fully detach from the memory mapping.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@comfy/utils.py` at line 85, load_safetensors currently uses Python's mmap (mapping = mmap.mmap(...)) and returns mmap-backed tensors when using the CPU device path; to guarantee mmap-free tensors on fragile Windows setups, after creating tensors from the buffer (via torch.frombuffer) in the CPU branch, explicitly detach by calling .clone() on each tensor (e.g., sd[name] = sd[name].clone()) so returned tensors are not memory-mapped; update the CPU device path logic in load_safetensors to clone tensors after torch.frombuffer creation to fully remove mmap backing.
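The suggested `.clone()` can be sketched in isolation. The snippet below is an illustrative stand-alone example (the temp file and variable names are made up, not ComfyUI code) showing that after cloning, the tensor owns its own storage and the mapping can be closed safely:

```python
import mmap
import os
import struct
import tempfile
import warnings

import torch

# Write 4 float32 values to a temp file so we have something to map.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0))
tmp.close()

with open(tmp.name, "rb") as f:
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    mv = memoryview(mapping)
    with warnings.catch_warnings():
        # torch.frombuffer warns because the read-only mapping is not writable
        warnings.filterwarnings("ignore", message="The given buffer is not writable")
        t_mapped = torch.frombuffer(mv, dtype=torch.float32).view([4])
    t = t_mapped.clone()  # copy into ordinary RAM, detaching from the mmap
    del t_mapped          # drop the last reference to the mapped buffer
    mv.release()
    mapping.close()       # safe now: t no longer touches the mapping

os.remove(tmp.name)
print(t.tolist())
```

Without the `clone()`, every later read of `t` would fault back into the page cache through the mapping, which is exactly the path suspected of crashing on fragile Windows storage setups.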
```python
sd, metadata = load_safetensors(ckpt)
# TODO: Not sure if this is the best way to bypass the mmap issues
if DISABLE_MMAP or enables_dynamic_vram():
    sd, metadata = load_safetensors(ckpt, device)
```
There was a problem hiding this comment.
This seems a little strange, in that `load_safetensors` is very deliberately an mmap: the main difference between `load_safetensors` and the sft package is the mmap open flags, so what you are doing here is a different kind of mmap.

You need to be careful here in that you are adding a path where non-dynamic is using this zero-copy sft loader, which is incompatible with the non-dynamic memory pinner.

Try a workflow with `--disable-mmap --novram` and I'm suspicious you will see "Pin error".
Log: Copy tensor:

```
got prompt
E:\AI\ComfyUI>
```

Log: No copy tensor:

```
got prompt
Found quantization metadata version 1
Using MixedPrecisionOps for text encoder
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load LTXAVTEModel_
Pin error. [repeated N times]
loaded partially; 8902.00 MB usable, 1649.94 MB loaded, 10073.95 MB offloaded, 7252.06 MB buffer reserved, lowvram patches: 0
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.bfloat16, manual cast: torch.bfloat16
model_type FLUX
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
Requested to load LTXAV
Pin error. [repeated N times]
loaded partially; 9551.67 MB usable, 9463.08 MB loaded, 12428.51 MB offloaded, 112.04 MB buffer reserved, lowvram patches: 0
100%|████████████████████████████████████████| 8/8 [00:33<00:00, 4.18s/it]
Unloaded partially: 976.66 MB freed, 8486.42 MB remains loaded, 112.04 MB buffer reserved, lowvram patches: 0
0 models unloaded.
Unloaded partially: 1508.48 MB freed, 6977.94 MB remains loaded, 368.54 MB buffer reserved, lowvram patches: 0
100%|████████████████████████████████████████| 3/3 [00:22<00:00, 7.44s/it]
Requested to load AudioVAE
loaded completely; 2571.74 MB usable, 415.20 MB loaded, full load: True
Requested to load VideoVAE
0 models unloaded.
loaded partially; 0.00 MB usable, 0.00 MB loaded, 2331.69 MB offloaded, 648.02 MB buffer reserved, lowvram patches: 0
Prompt executed in 95.03 seconds
```
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@comfy/cli_args.py`:
- Line 170: Rename the CLI flag and its internal attribute from sft to
safetensors for clarity: change the parser.add_argument("--sft-alt-loader", ...)
to use "--safetensors-alt-loader" and update any uses of args.sft_alt_loader to
args.safetensors_alt_loader (notably the reference in comfy/utils.py where
args.sft_alt_loader is read). Ensure the help text is expanded to something
explicit like "Use alternate loader for safetensors (fixes Windows crashes with
large models)" and update any tests or docs that reference the old flag name.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (2)
- comfy/cli_args.py
- comfy/utils.py

🚧 Files skipped from review as they are similar to previous changes (1)
- comfy/utils.py
```python
parser.add_argument("--mmap-torch-files", action="store_true", help="Use mmap when loading ckpt/pt files.")
parser.add_argument("--disable-mmap", action="store_true", help="Don't use mmap when loading safetensors.")
parser.add_argument("--sft-alt-loader", action="store_true", help="Use alternate method to load safetensors.")
```
There was a problem hiding this comment.
🧩 Analysis chain

🏁 Script executed:

```shell
rg -n "sft_alt_loader|sft-alt-loader|USE_ALT_SFT_LOADER" --type py -A2 -B2
```

Repository: Comfy-Org/ComfyUI
Length of output: 1160
Help text lacks clarity; the `sft` abbreviation is inconsistent with neighboring flags.

The flag `--sft-alt-loader` with help text "Use alternate method to load safetensors." is too vague: users encountering Windows crashes with large models won't know this is what they need. Additionally, the abbreviated `sft` is inconsistent with neighboring flags like `--disable-mmap` and `--mmap-torch-files`, which use full words.

Renaming to `--safetensors-alt-loader` is feasible because this is a newly added flag with no existing users. If renamed, update the attribute reference in `comfy/utils.py` line 40 from `args.sft_alt_loader` to `args.safetensors_alt_loader`.
✏️ Suggested improvement

```diff
-parser.add_argument("--sft-alt-loader", action="store_true", help="Use alternate method to load safetensors.")
+parser.add_argument("--safetensors-alt-loader", action="store_true", help="Use an alternate (non-mmap) loader for safetensors files. Recommended on Windows when loading large models causes crashes or when --disable-mmap alone is insufficient.")
```

Then in comfy/utils.py line 40:

```diff
-USE_ALT_SFT_LOADER = args.sft_alt_loader
+USE_ALT_SFT_LOADER = args.safetensors_alt_loader
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.


Problem

Python always crashes when loading `ltx-2-19b-distilled-fp8.safetensors`, even though I have enough (48 GB) memory. Then I found `utils.load_torch_file` and enabled the `--disable-mmap` option, but the problem persisted. Later, I discovered that Python crashed before the first tensor was loaded (or, before the first `f.get_tensor` call returned).

Solution

By simply calling `utils.load_safetensors`, I am able to load and run inference without a crash. However, without changing the code, you must sacrifice inference speed by specifying `--fast dynamic_vram`. So I made this PR, changing the `enables_dynamic_vram()` check to `DISABLE_MMAP or enables_dynamic_vram()`, and hopefully this will fix #10896.

Tested with `ltx-2-19b-distilled-fp8.safetensors` on Python 3.11, safetensors 0.6.2, Windows 10, with `--disable-mmap`.
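The control-flow change the PR describes can be sketched with stubs. Everything below except the condition itself is a stand-in: `DISABLE_MMAP`, `enables_dynamic_vram`, and `load_safetensors` are real ComfyUI names, but the stub bodies and `load_with_safe_open` are invented just to make the gating runnable:

```python
# Stubs standing in for ComfyUI internals, only to illustrate the gating.
DISABLE_MMAP = True  # set by the --disable-mmap CLI flag

def enables_dynamic_vram():
    return False  # set by --fast dynamic_vram (stub)

def load_safetensors(ckpt, device="cpu"):
    return {"loader": "alt"}, None  # alternate Python-mmap loader (stub)

def load_with_safe_open(ckpt, device="cpu"):
    return {"loader": "safe_open"}, None  # safetensors.safe_open path (stub)

def load_torch_file(ckpt, device="cpu"):
    # Before the PR the condition was just enables_dynamic_vram(); the PR
    # widens it so --disable-mmap also routes to the alternate loader.
    if DISABLE_MMAP or enables_dynamic_vram():
        return load_safetensors(ckpt, device)
    return load_with_safe_open(ckpt, device)

sd, metadata = load_torch_file("model.safetensors")
print(sd["loader"])  # alt
```

With this change, `--disable-mmap` alone selects the alternate loader, without also paying the `--fast dynamic_vram` inference-speed cost.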