-
I often play games while a1111 is still running in the background. It eats ~6 GB of my VRAM while loaded with whichever checkpoint. So is there a way to eject the loaded model somehow, without killing the process?
-
Settings > Actions > Unload SD checkpoint to RAM
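If you'd rather not click through the UI every time, recent versions of the webui also expose unload/reload endpoints over the API (when launched with `--api`). A minimal sketch, assuming the default local address and that your webui version ships these endpoints:

```python
import requests

BASE_URL = "http://127.0.0.1:7860"  # default local webui address; adjust if yours differs

# Ask the webui to unload the currently loaded SD checkpoint from VRAM.
resp = requests.post(f"{BASE_URL}/sdapi/v1/unload-checkpoint", timeout=60)
resp.raise_for_status()
print("Checkpoint unloaded")

# Before generating again, reload it:
# requests.post(f"{BASE_URL}/sdapi/v1/reload-checkpoint", timeout=60).raise_for_status()
```

This can be wired into a hotkey or a small script you run before launching a game.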
-
Unloading the checkpoint to RAM still eats up RAM if we change models frequently. How can we unload it completely, from both RAM and VRAM?
-
The Ollama framework has a really handy environment variable and API-accessible parameter: OLLAMA_KEEP_ALIVE=[# of seconds] | [xM] | 0. I think it's mostly used by people who want the last loaded chat model to stay loaded longer, but I set it to zero to keep the GPU VRAM as empty as possible, as soon as possible. This is because I have many users who mostly use the GPU for chat and occasionally for text-to-speech and SD image creation, which loads up the GPU VRAM. Unfortunately SDWeb keeps its last model loaded indefinitely. It would be great if SDWeb had a similar keep-alive option to let us decide how long to keep the last model loaded.
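For reference, the per-request keep_alive parameter in Ollama overrides the environment default on each call. A rough sketch against Ollama's /api/generate endpoint (the model name here is just an example):

```python
import requests

OLLAMA_URL = "http://127.0.0.1:11434"  # default Ollama address

# keep_alive=0 tells Ollama to drop the model from VRAM as soon as this
# request finishes, instead of keeping it resident (default is a few minutes).
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={
        "model": "llama3",   # example model name
        "prompt": "Hello!",
        "stream": False,
        "keep_alive": 0,     # unload immediately after this response
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

Until SDWeb grows a comparable option, calling its unload endpoint (as sketched earlier in this thread) from a wrapper script or a timer is one way to approximate the same behavior.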