ggml : add RPC backend #6829

Merged

merged 19 commits on May 14, 2024
Changes from 1 commit
implement llama_max_devices() for RPC
rgerganov committed May 14, 2024
commit 654c1cc2796a1c6181e37e69058747df04781d7b
2 changes: 2 additions & 0 deletions ggml-rpc.h
@@ -37,6 +37,8 @@ enum rpc_cmd {
     GRAPH_COMPUTE,
 };
 
+#define GGML_RPC_MAX_SERVERS 16
+
 // backend API
 GGML_API GGML_CALL ggml_backend_t ggml_backend_rpc_init(const std::string & endpoint);
 GGML_API GGML_CALL bool ggml_backend_is_rpc(ggml_backend_t backend);
4 changes: 3 additions & 1 deletion llama.cpp
@@ -15480,7 +15480,9 @@ struct llama_model_quantize_params llama_model_quantize_default_params() {
 }
 
 size_t llama_max_devices(void) {
-#if defined(GGML_USE_METAL)
+#if defined(GGML_USE_RPC)
+    return GGML_RPC_MAX_SERVERS;
+#elif defined(GGML_USE_METAL)
     return 1;
 #elif defined(GGML_USE_CUDA)
     return GGML_CUDA_MAX_DEVICES;
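
For context, a minimal sketch of how llama_max_devices() is typically consumed on the caller side; this is hypothetical illustration code, not part of the PR. Per-device parameters such as llama_model_params::tensor_split hold one entry per device, so a build with GGML_USE_RPC should account for up to GGML_RPC_MAX_SERVERS (16) entries.

// Hypothetical caller code (not from this PR): sizing a per-device
// array to llama_max_devices(), which returns GGML_RPC_MAX_SERVERS
// when the library is built with GGML_USE_RPC.
#include <cstdio>
#include <vector>
#include "llama.h"

int main(void) {
    const size_t n_dev = llama_max_devices();
    std::printf("max devices: %zu\n", n_dev);

    // tensor_split expects one entry per device; here, all on device 0
    std::vector<float> tensor_split(n_dev, 0.0f);
    tensor_split[0] = 1.0f;

    llama_model_params mparams = llama_model_default_params();
    mparams.tensor_split = tensor_split.data();
    // ... pass mparams to llama_load_model_from_file(path, mparams)
    return 0;
}

The design choice here mirrors the existing backends: llama_max_devices() is a compile-time dispatch over backend macros, so the RPC case simply takes precedence in the #if chain and reports the fixed server cap instead of a probed device count.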