
Conversation

@jeffbolznv
Collaborator

For #17605, though I'm not sure whether it'll help.

@0cc4m
Collaborator

0cc4m commented Nov 30, 2025

Even if it does, I don't think it's always preferable to keep models in VRAM, over other data. So at the very least we'd have to make it configurable.

@netrunnereve do you know if the extension makes a difference for RADV memory eviction behaviour? Maybe it's a way to keep models loaded without the RADV-side flag.

@github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Nov 30, 2025
@netrunnereve
Collaborator

> Even if it does, I don't think it's always preferable to keep models in VRAM, over other data. So at the very least we'd have to make it configurable.
>
> @netrunnereve do you know if the extension makes a difference for RADV memory eviction behaviour? Maybe it's a way to keep models loaded without the RADV-side flag.

I'm pretty sure RADV doesn't care about this extension and just handles memory like it usually does. By that I mean it tries to put allocations in VRAM if possible (which doesn't always happen), and once memory gets swapped out of VRAM it doesn't know how to bring it back.

The only way to deal with this is my nogttspill flag, and #17605 is a textbook example of why we need it. I suppose we could ask Mesa whether they're interested in supporting VK_EXT_memory_priority, though, and basically have it disable GTT allocations when the priority is set high enough.
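
For reference, the extension only adds a per-allocation hint: a float priority in [0, 1] chained onto VkMemoryAllocateInfo via VkMemoryPriorityAllocateInfoEXT, which the driver may use when deciding what to evict. A minimal sketch of how that hint is typically attached (not the code from this PR), assuming VK_EXT_memory_priority and its memoryPriority feature are enabled on the device:

```cpp
#include <vulkan/vulkan.h>

// Allocate device memory with an explicit priority hint.
// priority: 0.0f = lowest, 1.0f = highest, 0.5f = implementation default.
static VkResult allocate_with_priority(VkDevice device, VkDeviceSize size,
                                       uint32_t mem_type_index, float priority,
                                       VkDeviceMemory * out_memory) {
    VkMemoryPriorityAllocateInfoEXT priority_info = {};
    priority_info.sType    = VK_STRUCTURE_TYPE_MEMORY_PRIORITY_ALLOCATE_INFO_EXT;
    priority_info.priority = priority;

    VkMemoryAllocateInfo alloc_info = {};
    alloc_info.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    alloc_info.pNext           = &priority_info; // chain the priority hint
    alloc_info.allocationSize  = size;
    alloc_info.memoryTypeIndex = mem_type_index;

    return vkAllocateMemory(device, &alloc_info, nullptr, out_memory);
}
```

Whether this changes eviction behaviour is entirely up to the driver; the spec describes it only as a hint, which matches the observation above that RADV may simply ignore it.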

@jeffbolznv
Collaborator Author

I'm not too surprised this didn't help, and I'm OK with abandoning it. This does get hooked up to WDDM priorities on Windows, and it might help more there.

@netrunnereve
Collaborator

> I'm not too surprised this didn't help, and I'm OK with abandoning it. This does get hooked up to WDDM priorities on Windows, and it might help more there.

Personally I don't mind this as long as it's optional and disabled by default. Like you said, there are other systems that can probably make use of this.

@0cc4m
Collaborator

0cc4m commented Dec 2, 2025

Since you already implemented it, let's just keep it disabled by default and enable it with an environment variable.
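
A sketch of that kind of opt-in gate, using a hypothetical variable name (GGML_VK_MEMORY_PRIORITY below; the real name would be whatever the PR settles on):

```cpp
#include <cstdlib>

// Hypothetical opt-in: the feature stays off unless the environment variable
// is set to a non-empty value other than "0". The variable name is illustrative.
static bool vk_memory_priority_requested() {
    const char * val = std::getenv("GGML_VK_MEMORY_PRIORITY");
    return val != nullptr && val[0] != '\0' && val[0] != '0';
}
```

The priority struct would then only be chained into allocations when this returns true and the device actually advertises VK_EXT_memory_priority.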
