Hi all, I recently read the overview of the RPC feature in llama.cpp, and I may have a wrong understanding of how it is implemented, so please correct me. Assume I have 3 devices, A, B and C, all with GPUs that llama.cpp supports, with A as the main host and the other two as nearby hosts on the same network. According to my current understanding, I only need to install the LLM on A and run llama-cli there, not on B and C; B and C would then do work for the LLM installed on A (i.e., A, B and C would all contribute compute for the LLM). Please let me know whether that is correct or definitely wrong, thank you!
Replies: 1 comment
Your understanding is correct: you should start `rpc-server` on B and C and refer to them when running `llama-cli` on host A.
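To make this concrete, here is a minimal sketch of the commands involved. The hostnames, port, and model path are placeholders, and the exact flags may differ between builds, so check `rpc-server --help` and `llama-cli --help` on your version:

```sh
# On B and C: start the RPC backend server, listening on all interfaces
# (50052 is a commonly used port; any free port works)
rpc-server --host 0.0.0.0 --port 50052

# On A: run llama-cli with the model and point it at the remote rpc-server instances.
# The model file only needs to exist on A; work is offloaded to B and C over RPC.
llama-cli -m ./model.gguf -p "Hello" -ngl 99 \
  --rpc 192.168.1.11:50052,192.168.1.12:50052
```

Note that host A can also contribute its own GPU alongside the RPC backends, so all three machines end up sharing the load.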