[Regression] Vulkan v1.52.0 does not use GPU/VRAM on gfx1151 (Ryzen AI MAX+ 395), x3 worse performance #1048

@IgnatBeresnev

Description

Which version of LM Studio?
LM Studio 0.3.27

Which operating system?
I use arch btw

What is the bug?
Vulkan v1.50.2 uses VRAM to load models as expected. When loading gpt-oss 120b, I can clearly see that ~60 GB of VRAM gets allocated, and GPU load sits around 50% while text is being generated (observed with btop).

Vulkan v1.52.0 doesn't use VRAM: it loads the model into RAM + swap and then runs compute on the CPU. VRAM usage sits at 1/96 GB and GPU usage at 1-2%.

This can be reproduced consistently just by switching the engine between v1.50.2 and v1.52.0 and back. VRAM usage can be checked with btop, radeontop, or amdgpu_top --smi.
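For a quick check without extra tools, the amdgpu kernel driver also exposes VRAM counters through sysfs. A minimal sketch (the `mem_info_vram_*` file names are provided by amdgpu; the card index varies per system, so this iterates over all of them):

```shell
#!/bin/sh
# Print used/total VRAM for every amdgpu device found under sysfs.

to_gib() {
    # Convert a byte count to GiB with one decimal place.
    awk -v b="$1" 'BEGIN { printf "%.1f", b / (1024*1024*1024) }'
}

for dev in /sys/class/drm/card*/device; do
    used="$dev/mem_info_vram_used"
    total="$dev/mem_info_vram_total"
    # Skip devices (e.g. non-AMD GPUs) that don't expose these counters.
    [ -r "$used" ] && [ -r "$total" ] || continue
    echo "$dev: $(to_gib "$(cat "$used")") / $(to_gib "$(cat "$total")") GiB VRAM"
done
```

With a model loaded on v1.50.2 this should report tens of GiB used; on v1.52.0 it stays near zero, matching what btop and radeontop show.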

This bug is presumably also present in v1.51.0, as per #1041.

To Reproduce
Steps to reproduce the behavior:

  1. Set the engine to Vulkan v1.50.2, load any model, confirm VRAM is used (commands above), generate some text, and note the TPS.
  2. Set the engine to Vulkan v1.52.0, load the same model, and regenerate the response. Observe ~3x worse performance, no VRAM usage, and higher CPU load than before.
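To compare TPS between the two engine versions without eyeballing the UI, the generation can be timed against LM Studio's local OpenAI-compatible server (default port 1234; this assumes the server is enabled and a model is loaded, and `jq` is available). A minimal sketch:

```shell
#!/bin/sh
# Time one completion against LM Studio's local server and report tokens/sec.

tps() {
    # tokens_per_second <n_tokens> <elapsed_seconds>
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f", (t > 0) ? n / t : 0 }'
}

measure() {
    url="${1:-http://localhost:1234/v1/chat/completions}"
    start=$(date +%s)
    body=$(curl -s "$url" -H "Content-Type: application/json" -d '{
        "messages": [{"role": "user", "content": "Write a short story."}],
        "max_tokens": 256
    }')
    elapsed=$(( $(date +%s) - start ))
    # The usage block of an OpenAI-compatible response carries the token count.
    n_tokens=$(printf '%s' "$body" | jq '.usage.completion_tokens')
    echo "$(tps "$n_tokens" "$elapsed") tok/s"
}
```

Run `measure` once per engine version; the v1.52.0 number should come out roughly a third of the v1.50.2 one.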

TLDR fix

Set the engine to Vulkan v1.50.2

Labels
fixed-in-next-update: Will work in the next version of the software