I have set up llama-server successfully so that it uses my RTX 4000 via CUDA (v11), both via Docker and running locally.
But when I use the Python bindings (llama-cpp-python), they do not seem to utilize the GPU at all; everything runs on the CPU only, which takes much longer.
I installed the library with
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
What else do I need in order to enable GPU support?
Code:
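A minimal sketch of what GPU offloading usually requires on the Python side, assuming a recent llama-cpp-python and a local GGUF model (the model path below is hypothetical, not from the question): the Llama constructor's n_gpu_layers parameter defaults to 0, so even a wheel built with CUDA runs entirely on the CPU unless layers are explicitly offloaded.

    # Sketch, not the poster's code. Assumes the wheel was actually rebuilt with CUDA:
    #   CMAKE_ARGS="-DGGML_CUDA=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
    # (--force-reinstall and --no-cache-dir keep pip from reusing a cached CPU-only wheel)
    from llama_cpp import Llama

    llm = Llama(
        model_path="./model.gguf",  # hypothetical local GGUF model
        n_gpu_layers=-1,            # offload all layers to the GPU; the default of 0 means CPU only
        verbose=True,               # startup log should then mention layers offloaded to CUDA
    )

    out = llm("Q: What is the capital of France? A:", max_tokens=16)
    print(out["choices"][0]["text"])

If the verbose startup log shows no CUDA device and no offloaded layers, the installed wheel was most likely a cached CPU-only build, and the forced source rebuild shown in the comment above is the first thing to try.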
Replies: 2 comments

- Try this:

- (ubuntu 24.04)