Open
Labels
Potential Bug: User is reporting a bug. This should be tested.
Description
Custom Node Testing
- I have tried disabling custom nodes and the issue persists (see how to disable custom nodes if you need help)
Expected Behavior
GGUF Qwen models (e.g., Q4_K_M) should run with the --fast argument and not crash.
Actual Behavior
Even smaller GGUF Qwen models (e.g., Q4_K_M) that previously ran without problems now fail with the following error when ComfyUI is launched with the --fast argument or the --fast pinned_memory argument:
KSampler
CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I'm aware the --fast argument "enables some untested and potentially quality deteriorating optimizations". The culprit appears to be the pinned_memory optimization.
Steps to Reproduce
Launch ComfyUI with the --fast or --fast pinned_memory argument. Run a simple workflow that includes a GGUF Unet loader node. The KSampler node will (likely) fail with the CUDA error shown under Actual Behavior.
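For anyone trying to reproduce this, the launch step can be sketched as a small Python helper that builds the command line and sets CUDA_LAUNCH_BLOCKING=1, as the error message suggests, so that the stack trace points at the failing call. This is only a sketch: the helper name build_repro_command is hypothetical, and it assumes ComfyUI is started through main.py from the repository root.

```python
import os
import sys


def build_repro_command(fast_args=("pinned_memory",), entry_point="main.py"):
    """Build (argv, env) for launching ComfyUI with --fast.

    Assumptions (not confirmed by the report): ComfyUI is started via
    `main.py` in the current directory, using the current interpreter.
    """
    # "--fast" alone enables the default set of optimizations; extra values
    # such as "pinned_memory" select specific ones, as in the report.
    argv = [sys.executable, entry_point, "--fast", *fast_args]

    # Copy the current environment and make CUDA kernel launches synchronous,
    # so errors are reported at the call that actually caused them.
    env = dict(os.environ)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return argv, env
```

Passing the result to subprocess.run(argv, env=env) from the ComfyUI directory should start the server with the same flags as in the report. Calling build_repro_command(fast_args=()) gives the plain --fast variant for comparison.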
Debug Logs
--
Other
No response