Description
Expected Behavior
Related issue: #7390
Apologies in advance: English is not my first language, but I hope the problem is clear.

After reinstalling Windows, I installed drivers and my usual programs, then installed ComfyUI and tested it. The basic default workflow (the glass-bottle example) ran with no problem. But that was misleading: I should not have trusted the install until I had tested all my workflows, because my full, previously working workflow no longer runs correctly.

Summary: VRAM is not released after a prompt finishes.

I recorded the whole history below.
With the computer otherwise idle, I checked the VRAM state with `nvidia-smi`. (My GPU is an RTX 3080 Ti.) As expected, VRAM usage was almost nothing, under 2%.
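For reference, this is how I was reading the VRAM percentage. A small Python sketch around the standard `nvidia-smi` query flags; the `vram_percent_from_csv` helper and the sample line are hypothetical, just for illustration:

```python
import subprocess

def vram_percent_from_csv(csv_line: str) -> float:
    """Parse one line of `nvidia-smi --query-gpu=memory.used,memory.total
    --format=csv,noheader,nounits` output, e.g. "246, 12288" (MiB)."""
    used, total = (float(x) for x in csv_line.split(","))
    return 100.0 * used / total

def current_vram_percent(gpu_index: int = 0) -> float:
    # Requires the NVIDIA driver; these nvidia-smi flags are standard.
    out = subprocess.check_output(
        ["nvidia-smi", f"--id={gpu_index}",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return vram_percent_from_csv(out.strip())

# Hypothetical sample line for a 12 GiB card at idle:
print(round(vram_percent_from_csv("246, 12288"), 1))  # → 2.0
```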
Then I started ComfyUI. VRAM still looked fine. So I clicked the Queue button.
That's when something went wrong. (The red line in my CMD window marks the boundary between the end of startup and the start of the new task.) VRAM was at 12% when the job began, which always worked fine before, but now every job starts with this warning:
"
K:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\memory.py:391: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats. warnings.warn( ..........
"
In eight months of using ComfyUI to learn AI image and video generation, I had never seen that torch memory warning. (Reading it closely, it appears to be only a deprecation notice: `torch.cuda.reset_max_memory_allocated` now delegates to `torch.cuda.reset_peak_memory_stats`, so it may not be the cause of the problem by itself.) This first run finished in about 84 seconds. If that were the whole story, I would not be posting here.
The real problem never goes away, no matter what I try: cleaning VRAM, purging VRAM, clearing caches, even installing a RAM-cleaning Windows app. I know that VRAM can legitimately hold the previous task's models, so high usage is not unusual by itself, but it used to drop back to about 12-15% whenever I started a new task, effectively resetting to default. Now it stays around 50% full all the time and makes every subsequent job much slower.
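For context, the manual cleanup I tried amounts to the standard PyTorch calls. Here is a minimal sketch of that kind of cleanup; this is not ComfyUI's actual code, and `try_free_vram` is a name made up for illustration:

```python
import gc

def try_free_vram() -> bool:
    """Best-effort VRAM cleanup, similar in spirit to a 'free memory'
    action. Returns True if a CUDA cache flush was attempted."""
    gc.collect()  # drop unreferenced Python-side objects first
    try:
        import torch
    except ImportError:
        return False  # torch not installed; nothing to flush
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # release cached allocator blocks
        torch.cuda.ipc_collect()  # reclaim inter-process memory handles
        return True
    return False
```

Note that `torch.cuda.empty_cache()` only releases the allocator's *cached* blocks; tensors still referenced by loaded models stay resident, which is why `nvidia-smi` can keep showing high usage even after a flush.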
What I am sure of:

- A freshly installed ComfyUI works fine at first.
- The torch module now always shows this memory warning.
- Beyond that, I don't know what is happening.

Please help!
Actual Behavior
After one prompt execution, VRAM stays around 50% allocated and is never released, which slows down all subsequent jobs.
Steps to Reproduce
1. Start ComfyUI and confirm low VRAM usage with `nvidia-smi`.
2. Queue a full workflow (mine uses InstantID, ControlNet, and two KSampler passes; see the log below).
3. After the prompt finishes, check `nvidia-smi` again: VRAM remains about 50% allocated.
Debug Logs
got prompt
K:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\cuda\memory.py:391: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
warnings.warn(
model weight dtype torch.float16, manual cast: None
model_type EPS
Using xformers attention in VAE
Using xformers attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
end_vram - start_vram: 0 - 0 = 0
#62 [CheckpointLoaderSimple]: 20.39s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#100 [Anything Everywhere]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#180 [Power Lora Loader (rgthree)]: 3.60s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#99 [Anything Everywhere]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#98 [Anything Everywhere]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#21 [ttN text]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#20 [ttN text]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#18 [ttN text]: 0.00s - vram 0b
=============================================================
Portrait Master positive prompt:
portrait of a girl. looking at viewer. background is white. close up eye <lora:fix\background\white_1_0.safetensors:1.0>, a plain white background, highly detailed, (woman 18-years-old:1.5), (almond eyes shape:1.05), (wavy cut hairstyle:1.05), (long:1.05), (professional photo, balanced photo, balanced exposure:1.2)
Portrait Master negative prompt:
(watermark:1.2), (text:1.2), (logo:1.2), (3d render:1.2), drawing, painting, crayon, cartoon, painting, illustration, (worst quality, low quality, normal quality:2) busy background, dark background, patterned background, (shinny skin, shiny skin, reflections on the skin, skin reflections:1.35)
=============================================================
end_vram - start_vram: 0 - 0 = 0
#16 [PortraitMaster]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#17 [easy showAnything]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#178 [SeargeIntegerPair]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#12 [EmptyLatentImage]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#90 [LoadImage]: 0.05s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#192:0 [Load Image Batch]: 0.00s - vram 0b
end_vram - start_vram: 0 - 0 = 0
#192:1 [easy imageSwitch]: 0.00s - vram 0b
Requested to load SDXLClipModel
loaded completely 9652.8 1560.802734375 True
end_vram - start_vram: 1679529340 - 0 = 1679529340
#10 [CLIPTextEncode]: 4.45s - vram 1679529340b
end_vram - start_vram: 1679529340 - 1645139976 = 34389364
#9 [CLIPTextEncode]: 0.12s - vram 34389364b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#194:0 [LoadImage]: 0.01s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#194:1 [Load Image Batch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#194:2 [easy imageSwitch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#193:0 [LoadImage]: 0.02s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#193:1 [Load Image Batch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#193:2 [easy imageSwitch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#7 [ImageBatch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#94 [easy imageSwitch]: 0.00s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#4 [ControlNetLoader]: 6.97s - vram 0b
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1', 'sdpa_kernel': '0', 'fuse_conv_bias': '0'}, 'CPUExecutionProvider': {}}
find model: K:\ComfyUI_windows_portable\ComfyUI\models\insightface\models\antelopev2\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1', 'sdpa_kernel': '0', 'fuse_conv_bias': '0'}, 'CPUExecutionProvider': {}}
find model: K:\ComfyUI_windows_portable\ComfyUI\models\insightface\models\antelopev2\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1', 'sdpa_kernel': '0', 'fuse_conv_bias': '0'}, 'CPUExecutionProvider': {}}
find model: K:\ComfyUI_windows_portable\ComfyUI\models\insightface\models\antelopev2\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1', 'sdpa_kernel': '0', 'fuse_conv_bias': '0'}, 'CPUExecutionProvider': {}}
find model: K:\ComfyUI_windows_portable\ComfyUI\models\insightface\models\antelopev2\glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CUDAExecutionProvider': {'device_id': '0', 'has_user_compute_stream': '0', 'cudnn_conv1d_pad_to_nc1d': '0', 'user_compute_stream': '0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'enable_cuda_graph': '0', 'gpu_external_free': '0', 'gpu_external_empty_cache': '0', 'arena_extend_strategy': 'kNextPowerOfTwo', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'do_copy_in_default_stream': '1', 'cudnn_conv_use_max_workspace': '1', 'tunable_op_enable': '0', 'tunable_op_tuning_enable': '0', 'tunable_op_max_tuning_duration_ms': '0', 'enable_skip_layer_norm_strict_mode': '0', 'prefer_nhwc': '0', 'use_ep_level_unified_stream': '0', 'use_tf32': '1', 'sdpa_kernel': '0', 'fuse_conv_bias': '0'}, 'CPUExecutionProvider': {}}
find model: K:\ComfyUI_windows_portable\ComfyUI\models\insightface\models\antelopev2\scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
end_vram - start_vram: 1645139976 - 1645139976 = 0
#5 [InstantIDFaceAnalysis]: 5.26s - vram 0b
end_vram - start_vram: 1645139976 - 1645139976 = 0
#3 [InstantIDModelLoader]: 5.54s - vram 0b
K:\ComfyUI_windows_portable\python_embeded\Lib\site-packages\insightface\utils\transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
INFO: InsightFace detection resolution lowered to (448, 448).
INFO: InsightFace detection resolution lowered to (448, 448).
end_vram - start_vram: 2491896840 - 1645139976 = 846756864
#6 [ApplyInstantID]: 1.52s - vram 846756864b
end_vram - start_vram: 2490802696 - 2490802696 = 0
#199 [JWFloat]: 0.00s - vram 0b
end_vram - start_vram: 2490802696 - 2490802696 = 0
#198 [Sampler Scheduler Settings (JPS)]: 0.00s - vram 0b
end_vram - start_vram: 2490802696 - 2490802696 = 0
#203 [easy float]: 0.00s - vram 0b
end_vram - start_vram: 2490802696 - 2490802696 = 0
#202 [easy int]: 0.00s - vram 0b
Requested to load SDXL
Requested to load ControlNet
loaded completely 7626.188103485108 4897.0483474731445 True
loaded completely 2729.091843414307 2386.120147705078 True
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00, 1.70it/s]
end_vram - start_vram: 9053826552 - 2490802696 = 6563023856
#11 [KSampler]: 23.94s - vram 6563023856b
end_vram - start_vram: 8507551936 - 8507551936 = 0
#103 [LatentUpscale]: 0.00s - vram 0b
Requested to load SDXL
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:10<00:00, 1.82it/s]
end_vram - start_vram: 9070143992 - 8507551936 = 562592056
#102 [KSampler]: 11.05s - vram 562592056b
Requested to load AutoencoderKL
loaded completely 299.7635269165039 159.55708122253418 True
end_vram - start_vram: 8523915456 - 8523915456 = 0
#13 [VAEDecode]: 1.33s - vram 0b
end_vram - start_vram: 3494359654 - 3494359654 = 0
#119 [LoadImage]: 0.01s - vram 0b
end_vram - start_vram: 3494359654 - 3494359654 = 0
#158:0 [ColorMatch]: 0.25s - vram 0b
# 😺dzNodes: LayerStyle -> Brightness Contrast V2 Processed 1 image(s).
end_vram - start_vram: 3494359654 - 3494359654 = 0
#158:1 [LayerColor: BrightnessContrastV2]: 0.04s - vram 0b
end_vram - start_vram: 3561321010 - 3494359654 = 66961356
#158:2 [ImageSharpen]: 0.02s - vram 66961356b
end_vram - start_vram: 3494359654 - 3494359654 = 0
#158:3 [ColorAdjust(FaceParsing)]: 0.01s - vram 0b
end_vram - start_vram: 3494359654 - 3494359654 = 0
#14 [PreviewImage]: 0.11s - vram 0b
end_vram - start_vram: 3494359654 - 3494359654 = 0
#82 [SaveImage]: 0.13s - vram 0b
Prompt executed in 84.92 seconds
[LogConsole] client [211a37f61f034e1aa507d14511128d35], console [a651a6c5-46d6-4ac6-9f27-abdacc82c3bb], connected
[LogConsole] client [211a37f61f034e1aa507d14511128d35], console [fa5e287e-9b7c-4c78-95f9-de88357328c9], disconnected
Other
Any help would be appreciated.





