[Frontend] Add /collective_rpc API endpoint
#23075
Conversation
Signed-off-by: 22quinn <33176974+22quinn@users.noreply.github.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only the fastcheck subset runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This pull request introduces a new /collective_rpc API endpoint. The changes include adding an abstract collective_rpc method to the EngineClient protocol and implementing the corresponding endpoint in the API server.
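For orientation, here is a minimal sketch of what such an abstract protocol method could look like. This is an illustration inferred from the PR description, not the PR's exact code; the parameter names simply mirror the JSON fields listed in the PR body.

```python
from typing import Any, Optional


class EngineClient:  # illustrative stub of the protocol this PR extends
    async def collective_rpc(
        self,
        method: str,
        timeout: Optional[float] = None,
        args: tuple = (),
        kwargs: Optional[dict[str, Any]] = None,
    ) -> list[Any]:
        """Invoke `method` on every worker and collect one result per worker."""
        raise NotImplementedError
```

Any concrete engine client would need to override this method, which is what the first review point below is about.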
My review has identified a few issues:
- A critical issue where the new abstract method collective_rpc appears to be unimplemented in the concrete engine classes provided in the context, which would lead to runtime errors.
- A high-severity issue in the API endpoint implementation related to unsafe access to the request body, which could cause unhandled exceptions.
- Another high-severity issue with how the results from the RPC call are serialized into the JSON response, which would lead to incorrect output for structured data.
Please see the detailed comments for suggestions on how to address these points.
```python
elif isinstance(result, pydantic.BaseModel):
    response.append(result.model_dump())
else:
    response.append(str(result))
```
Using str(result) for serialization is problematic for structured data types like dictionaries or lists. It converts them into a string representation (e.g., "{'a': 1}") instead of a proper JSON object. This can lead to unexpected behavior on the client side. It's better to append the result directly and let JSONResponse handle the serialization. This will correctly serialize JSON-compatible types and raise an error for non-serializable types, which is more explicit.
Suggested change:

```diff
-    response.append(str(result))
+    response.append(result)
```
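To illustrate the point, here is a small self-contained sketch (not the PR's code) comparing the two approaches; Starlette's `JSONResponse` serializes its content with `json.dumps`, so the effect on the payload is the same as shown here:

```python
import json

worker_results = [{"a": 1}, [1, 2, 3]]  # structured results returned by workers

# Appending str(result): dicts and lists become opaque strings in the JSON payload.
as_strings = [str(r) for r in worker_results]
print(json.dumps(as_strings))  # ["{'a': 1}", "[1, 2, 3]"]

# Appending the result directly: the payload contains proper JSON objects/arrays.
as_objects = list(worker_results)
print(json.dumps(as_objects))  # [{"a": 1}, [1, 2, 3]]
```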
I think we should make the …

This endpoint is currently gated behind …

Ah I wasn't aware of that gate, that sounds sufficient to me.
DarkLight1337 left a comment
LGTM
Please fix the failing tests though

@DarkLight1337 The full version of entrypoints-test-api-server passed (only the fastcheck version failed) before your rebase. Same goes for openai-api-correctness. Probably safe to force merge?

I'll just retry the tests
Purpose
Currently most RL frameworks interact with vLLM via the `LLM` or `AsyncLLM` class, like the example here: vllm/examples/offline_inference/rlhf.py (line 112 in 8ea0c27).
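For context, a simplified sketch of this offline-style interaction (the worker method name below is hypothetical and used only for illustration, not taken from rlhf.py):

```python
from vllm import LLM

llm = LLM(model="facebook/opt-125m")

# Broadcast a method call to every worker process and gather one result per worker.
# "echo_rank" is a hypothetical worker-side method; an RL framework would invoke
# its own worker-side hooks (e.g. weight-update methods) here instead.
results = llm.collective_rpc("echo_rank")
print(results)
```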
In order for RL frameworks to interact with vLLM via the API server, a `/collective_rpc` endpoint is needed. This endpoint accepts a JSON body with the following fields. For security reasons, we do not handle (de-)serialization but accept only strings.
- `method: str`
- `args: list[str]`
- `kwargs: dict[str, str]`
- `timeout: Optional[float]`
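A hypothetical request against a locally running API server could then look like the following; only the endpoint path and field names come from this PR, while the port, worker method name, and values are illustrative:

```python
import requests

payload = {
    "method": "echo_rank",  # hypothetical worker-side method name
    "args": [],
    "kwargs": {},
    "timeout": 30.0,
}
resp = requests.post("http://localhost:8000/collective_rpc", json=payload)
print(resp.json())  # one serialized result per worker
```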
Test Plan

PYTHONPATH="/data/users/user/gitrepos/vllm" pytest tests/entrypoints/openai/test_collective_rpc.py
Test Result
(Optional) Documentation Update
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.