Hi all, I recently read the overview of the RPC feature in llama.cpp, and I may have a wrong understanding of how it is implemented, so please correct me. Assume I have 3 devices, A, B and C, all with GPUs that llama.cpp supports, with A as the main host and the other two as nearby hosts on the same network. According to my current understanding, I only need to install the LLM on A and run llama-cli there, not on B and C; B and C would then do work for the LLM installed on A (i.e., A, B and C would all contribute compute for the LLM). Please let me know whether that is correct or definitely wrong, thank you!
Replies: 1 comment
Your understanding is correct: you should start `rpc-server` on B and C and refer to them when running `llama-cli` on host A.
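To make this concrete, here is a minimal sketch of the commands involved. The hostnames, port, and model path are placeholders, and the exact flags may differ between builds, so check `rpc-server --help` and `llama-cli --help` on your version:

```sh
# On B and C: start the RPC backend server, listening on all interfaces
# (50052 is a commonly used port; any free port works)
rpc-server --host 0.0.0.0 --port 50052

# On A: run llama-cli with the model and point it at the remote rpc-server instances.
# The model file only needs to exist on A; work is offloaded to B and C over RPC.
llama-cli -m ./model.gguf -p "Hello" -ngl 99 \
  --rpc 192.168.1.11:50052,192.168.1.12:50052
```

Note that host A can also contribute its own GPU alongside the RPC backends, so all three machines end up sharing the load.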