Name and Version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6600 (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 32768 | matrix cores: none
version: 4932 (9ffcc9e)
built with MSVC 19.43.34808.0 for x64
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
llama-server
Command line
I use the following .bat to start the server, then interact with it only through llama-server's built-in web UI:
echo Running Mistral Small 3.1 2503 24B, 12288 context
llama-server.exe ^
--model "D:\LLMs\mistralai_Mistral-Small-3.1-24B-Instruct-2503-Q4_K_M.gguf" ^
--gpu-layers 14 ^
--ctx-size 12288 ^
--temp 0.2
Problem description & steps to reproduce
Going from release b4916 (llama-b4916-bin-win-vulkan-x64) to b4932 (llama-b4932-bin-win-vulkan-x64), I noticed that models (not just this Mistral) still load partially into VRAM exactly as before, but inference is significantly slower and the CPU is being hammered. This never happened with any of the many previous releases.
Checking VRAM usage in Task Manager shows the model appears to be loaded correctly (exactly like before), yet performance is almost as if the GPU were not being used at all. It was rock solid before and is now terrible.
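To quantify the regression, the same workload can be timed with llama-bench from each release archive. This is just a sketch using llama-bench's standard flags (-m, -ngl, -p, -n) with my model path and offload settings; run it once with the b4916 binaries and once with b4932:

rem same model, same 14 offloaded layers; compare reported t/s between releases
llama-bench.exe -m "D:\LLMs\mistralai_Mistral-Small-3.1-24B-Instruct-2503-Q4_K_M.gguf" -ngl 14 -p 512 -n 128

The prompt-processing and generation t/s it prints should make the slowdown concrete rather than anecdotal.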
First Bad Commit
No response