vulkan: Reduce temporary memory usage for TOP_K #17623

jeffbolznv · 2025-11-30T15:52:47Z

- Compute the row size of the temp buffer from the output of the first pass.
- Update the shader addressing math to use the output row size.
- Pass the output row size as "ncols_output"; the value that used to be passed as "ncols_output" is now "k".

For the common case of K=40 and src0=(200000,1,1,1), this reduces the temporary buffer from about 3.2MB to 500KB.
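
To make the sizing idea concrete, here is a minimal sketch of how a temp-buffer row could be sized from the first pass's output rather than from the input width. It is not the backend's actual code: the helper names, the per-workgroup element count, the bytes per element, and the buffer count are all hypothetical parameters chosen for illustration. Only `k` and `ncols_output` are names taken from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: upper bound on the number of candidates left in one row
// after a first pass in which each workgroup reads `elems_per_workgroup` inputs
// and keeps at most `k` of them. This plays the role of "ncols_output".
inline uint64_t first_pass_output_cols(uint64_t ncols, uint64_t k,
                                       uint64_t elems_per_workgroup) {
    assert(elems_per_workgroup > 0);
    const uint64_t num_workgroups =
        (ncols + elems_per_workgroup - 1) / elems_per_workgroup;  // ceil division
    // A workgroup cannot keep more elements than it read, and the row
    // cannot grow beyond its original width.
    const uint64_t survivors = num_workgroups * std::min(k, elems_per_workgroup);
    return std::min(survivors, ncols);
}

// Hypothetical helper: bytes needed for `num_buffers` temp buffers that each
// hold `nrows` rows of `row_elems` elements.
inline uint64_t temp_bytes(uint64_t row_elems, uint64_t nrows,
                           uint64_t bytes_per_elem, uint64_t num_buffers) {
    return row_elems * nrows * bytes_per_elem * num_buffers;
}

// Hypothetical helper: flat index of (row, col) when rows are packed using the
// output row size instead of the input row size.
inline uint64_t temp_index(uint64_t row, uint64_t col, uint64_t ncols_output) {
    assert(col < ncols_output);
    return row * ncols_output + col;
}
```

With assumed values of 128 elements per workgroup and 8 bytes per element, `first_pass_output_cols(200000, 40, 128)` gives a row of 62520 elements, far narrower than the 200000-element input row. The real workgroup size and element layout are not stated in this description, so these numbers are illustrative only.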


jeffbolznv requested a review from 0cc4m as a code owner November 30, 2025 15:52

github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels Nov 30, 2025

loci-dev mentioned this pull request Nov 30, 2025

UPSTREAM PR #17623: vulkan: Reduce temporary memory usage for TOP_K auroralabs-loci/llama.cpp#374

Open

jeffbolznv mentioned this pull request Dec 1, 2025

vulkan: fix top_k bug when there are ties in the input #17659

Open

loci-dev mentioned this pull request Dec 1, 2025

UPSTREAM PR #17659: vulkan: fix top_k bug when there are ties in the input auroralabs-loci/llama.cpp#391

Open
