
Conversation

@jeffbolznv
Collaborator

For #17605, though I'm not sure whether it'll help.

@0cc4m
Collaborator

0cc4m commented Nov 30, 2025

Even if it does, I don't think it's always preferable to keep models in VRAM, over other data. So at the very least we'd have to make it configurable.

@netrunnereve do you know if the extension makes a difference for RADV memory eviction behaviour? Maybe it's a way to keep models loaded without the RADV-side flag.

@github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Nov 30, 2025
@netrunnereve
Collaborator

> Even if it does, I don't think it's always preferable to keep models in VRAM, over other data. So at the very least we'd have to make it configurable.
>
> @netrunnereve do you know if the extension makes a difference for RADV memory eviction behaviour? Maybe it's a way to keep models loaded without the RADV-side flag.

I'm pretty sure RADV doesn't care about this extension and just handles memory like it usually does. By that I mean it tries to put allocations in VRAM if possible (which doesn't always happen), and once memory gets swapped out of VRAM it doesn't know how to bring it back.

The only way to deal with this is my nogttspill flag, and #17605 is a textbook example of why we need it. I suppose we could ask Mesa whether they're interested in supporting VK_EXT_memory_priority, though, and basically have it disable GTT allocations when the priority is set high enough.
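
For reference, the extension only adds a per-allocation hint: a float priority in [0, 1] chained onto VkMemoryAllocateInfo via VkMemoryPriorityAllocateInfoEXT, which the driver may use when deciding what to evict. A minimal sketch of how that hint is typically attached (not the code from this PR), assuming VK_EXT_memory_priority and its memoryPriority feature are enabled on the device:

```cpp
#include <vulkan/vulkan.h>

// Allocate device memory with an explicit priority hint.
// priority: 0.0f = lowest, 1.0f = highest, 0.5f = implementation default.
static VkResult allocate_with_priority(VkDevice device, VkDeviceSize size,
                                       uint32_t mem_type_index, float priority,
                                       VkDeviceMemory * out_memory) {
    VkMemoryPriorityAllocateInfoEXT priority_info = {};
    priority_info.sType    = VK_STRUCTURE_TYPE_MEMORY_PRIORITY_ALLOCATE_INFO_EXT;
    priority_info.priority = priority;

    VkMemoryAllocateInfo alloc_info = {};
    alloc_info.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    alloc_info.pNext           = &priority_info; // chain the priority hint
    alloc_info.allocationSize  = size;
    alloc_info.memoryTypeIndex = mem_type_index;

    return vkAllocateMemory(device, &alloc_info, nullptr, out_memory);
}
```

Whether this changes eviction behaviour is entirely up to the driver; the spec describes it only as a hint, which matches the observation above that RADV may simply ignore it.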

@jeffbolznv
Collaborator Author

I'm not too surprised this didn't help, and I'm OK with abandoning it. This does get hooked up to WDDM priorities on Windows, and it might help more there.

@netrunnereve
Collaborator

> I'm not too surprised this didn't help, and I'm OK with abandoning it. This does get hooked up to WDDM priorities on Windows, and it might help more there.

Personally I don't mind this as long as it's optional and disabled by default. Like you said, there are other systems that can probably make use of this.

@0cc4m
Collaborator

0cc4m commented Dec 2, 2025

Since you already implemented it, let's just keep it disabled by default and enable it with an environment variable.
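
A sketch of that kind of opt-in gate, using a hypothetical variable name (GGML_VK_MEMORY_PRIORITY below; the real name would be whatever the PR settles on):

```cpp
#include <cstdlib>

// Hypothetical opt-in: the feature stays off unless the environment variable
// is set to a non-empty value other than "0". The variable name is illustrative.
static bool vk_memory_priority_requested() {
    const char * val = std::getenv("GGML_VK_MEMORY_PRIORITY");
    return val != nullptr && val[0] != '\0' && val[0] != '0';
}
```

The priority struct would then only be chained into allocations when this returns true and the device actually advertises VK_EXT_memory_priority.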
