
rpc: nicer error message for RPC server crash #14076

Merged
merged 1 commit into ggml-org:master on Jun 10, 2025

Conversation

isaac-mcfadyen (Contributor)

Changes

Currently, the RPC backend throws a cryptic-looking error if the remote side crashes, because the generic GGML_ASSERT macro is used.

This PR just adds a simple, RPC-server-specific macro that has the same behavior as GGML_ASSERT but prints a slightly nicer message.

Note that the error message assumes that the only reason an invalid status would be returned is an RPC server crash or a malformed payload. I'm happy to change the message if there are other edge cases we want to cover.

Before:

/Users/isaac/Documents/llama.cpp/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:764: GGML_ASSERT(status) failed

After:

/Users/isaac/Documents/llama.cpp/llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp:782: Remote RPC server crashed or returned malformed response
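
For context, here is a minimal sketch of what such a macro could look like, assuming it wraps the existing GGML_ABORT helper (which prints the file and line before aborting). The name RPC_STATUS_ASSERT and the exact shape are illustrative assumptions, not necessarily the code merged here:

```cpp
// Hypothetical sketch: same behavior as GGML_ASSERT, but with an
// RPC-specific message. GGML_ABORT prints file:line before aborting,
// matching the "After" output shown above.
#define RPC_STATUS_ASSERT(x)                                                        \
    do {                                                                            \
        if (!(x)) {                                                                 \
            GGML_ABORT("Remote RPC server crashed or returned malformed response"); \
        }                                                                           \
    } while (0)
```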

Testing

Tested by specifying only the -ctk q4_0 flag (without -ctv q4_0 and without GGML_CUDA_FA_ALL_QUANTS) and running an example prompt.
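
For reference, a repro along these lines exercises the failing path; the binary, model path, server address, and prompt are placeholders, and it assumes the remote rpc-server was built with CUDA but without GGML_CUDA_FA_ALL_QUANTS:

```sh
# Hypothetical repro (placeholder paths and address): a q4_0 K cache with the
# default f16 V cache hits an unsupported kernel combination on the remote
# server, which aborts and triggers the client-side error shown above.
./llama-cli -m model.gguf --rpc 192.168.1.2:50052 -ctk q4_0 -p "Hello"
```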

Accidentally omitting this flag and getting the cryptic error is what motivated this PR.

@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Jun 9, 2025
@rgerganov merged commit 2bb0467 into ggml-org:master on Jun 10, 2025
87 of 88 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request on Jun 10, 2025
* origin/master:
llama : support GEGLU for jina-bert-v2 (ggml-org#14090)
vulkan: force device 0 in CI (ggml-org#14106)
Fixed spec timings to: accepted/tested instead of accepted/drafted (ggml-org#14104)
sync : ggml
ggml : fix weak alias win32 (whisper/0)
Vulkan: Don't default to CPU device (like llvmpipe), even if no other device is available, to allow fallback to CPU backend (ggml-org#14099)
rpc : nicer error messages for RPC server crash (ggml-org#14076)
sync : ggml
Add in-build ggml::ggml ALIAS library (ggml/1260)
metal : use less stack memory in FA kernel (ggml-org#14088)
kv-cache : fix shift and defrag logic (ggml-org#14081)
llama : allow building all tests on windows when not using shared libs (ggml-org#13980)
Labels
ggml: changes relating to the ggml tensor library for machine learning