Closed
Description
Name and Version
With -dev CUDA0, llama-server mostly uses device 0 but still allocates 496 MiB on device 1.

With -dev CUDA1, only device 1 is used, as expected.
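
One way to confirm the stray allocation is to compare per-device memory use before and after starting the server. The sketch below is not part of llama.cpp. It assumes nvidia-smi is on PATH and uses its --query-gpu=index,memory.used CSV output. The helper names (parse_used_mib, memory_delta, snapshot) are made up for illustration. Because memory.used counts every process on the GPU, the delta is only meaningful if nothing else starts or stops in between.

```python
import subprocess

# nvidia-smi query that prints one "index, used_MiB" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_used_mib(csv_text):
    """Parse 'index, used' CSV lines into {device_index: used_mib}."""
    used = {}
    for line in csv_text.strip().splitlines():
        index, mib = (field.strip() for field in line.split(","))
        used[int(index)] = int(mib)
    return used


def memory_delta(before, after):
    """Return the per-device change in used memory (MiB) between two snapshots."""
    return {dev: after[dev] - before.get(dev, 0) for dev in after}


def snapshot():
    """Run nvidia-smi once and return the current {device_index: used_mib}."""
    out = subprocess.run(QUERY, check=True, capture_output=True, text=True).stdout
    return parse_used_mib(out)
```

Usage would be: call snapshot() once, start llama-server with -dev CUDA0, wait for the model to load, call snapshot() again, and pass both results to memory_delta. A nonzero entry for device 1 would correspond to the allocation described above.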

Commit e60f01, driver 575.64.03, CUDA 12.8
./llama-server --list-devices
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
Device 1: NVIDIA GeForce RTX 5090, compute capability 12.0, VMM: yes
Available devices:
CUDA0: NVIDIA GeForce RTX 5090 (32109 MiB, 3752 MiB free)
CUDA1: NVIDIA GeForce RTX 5090 (32109 MiB, 3251 MiB free)
Operating systems
Linux l1 6.11.0-26-generic #26~24.04.1-Ubuntu