[misc] Spawn process when vLLM is used as Ray actor #14768
Conversation
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
vllm/executor/ray_utils.py (outdated):

```python
                "support ray.")

# Create placement group for worker processes
current_placement_group = ray.util.get_current_placement_group()
```
You won't be able to get the current placement group with spawn, right?
Right, we will need to pass it via PG_NAME, as in #14410.
It's better to not introduce more environment variables if possible. Why can't we just pass the placement group?
Env vars are automatically inherited, even if the process is spawned.
If we pass the placement group as an argument, then we need to explicitly pass it somewhere, and that place might vary depending on the entrypoint function of the new process.
I'm fine with adding new env vars, as long as they are well scoped.
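The inheritance point above can be demonstrated with a small, self-contained sketch (plain Python with `subprocess`; `VLLM_DEMO_FLAG` is a made-up variable name for illustration, not a real vLLM setting):

```python
import os
import subprocess
import sys

def child_sees_env() -> str:
    # Set an illustrative env var, then start a brand-new interpreter:
    # the child still sees it, because environment variables are inherited
    # by spawned processes just as they are by forked ones.
    os.environ["VLLM_DEMO_FLAG"] = "enabled"
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['VLLM_DEMO_FLAG'])"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

This is why an env var survives the `spawn` path where a Python object passed to `fork`-inherited state would not.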
One drawback of this approach is that we need the placement group to be named and in a namespace, while in general that is not the case for placement groups.
Is it possible to get some IDs from the placement group, and then directly construct the placement group from those IDs? It seems to be a pretty common use case for Ray, and I think there should be solutions other than named placement groups.
I think the comment makes sense; that would be ideal. On the other hand, the fundamental reason we needed a named placement group and a namespace is that we are in a new Ray job after the spawn: the namespace and name are used for isolation. The way to tackle that is to use Ray to manage process resources, rather than Python multiprocessing. That will be a somewhat longer-term solution, as it needs quite a few changes in vLLM. So unfortunately, using an ID instead of name/namespace is not really viable.
What @comaniac did in #14705 already implements this kind of functionality, I think: it passes the placement group object across processes. We should be able to do it via an environment variable so that we don't need to pass it to specific functions.
For example, inside the _check_multiproc_method function, if you detect a Ray placement group, not only set VLLM_WORKER_MULTIPROC_METHOD, but also set VLLM_RAY_PG_HANDLE to contain some information from the placement group.
Then, in ray_utils.py, if VLLM_RAY_PG_HANDLE is set, parse the information and create the placement group object again.
One issue is that we may not be able to pass the placement group object via an env var, especially when the placement group is not created by the user. For example, when using Ray Serve to launch vLLM, the placement group is created by Ray Serve, and Ray Serve won't pass the placement group it created to an env var. Instead, get_current_placement_group() is a more desirable pattern for sharing a placement group.
Also, regarding Kaichao's comment:
A placement group is a non-string object; I don't think we can pass it via environment variables. Did you mean to serialize/pickle it first?
Also, by VLLM_RAY_PG_HANDLE, did you mean the imaginary "ID"? There is no PG handle; the Ray API only supports namespace + name, and that's the only "handle" supported.
By default, `VLLM_WORKER_MULTIPROC_METHOD` is set to `fork`. However, forking a Ray actor causes undefined behavior, which leads to hangs in placement group methods when using V1. This PR fixes the issue by detecting whether vLLM is used as a Ray actor and, if so, using `spawn` instead of `fork` for new process creation.
This PR also skips Ray initialization if Ray is already initialized.
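The selection logic the PR describes could be sketched roughly as follows. The `in_ray_actor` flag stands in for whatever detection vLLM actually performs (a real implementation would query Ray's runtime context); the function name is made up for illustration:

```python
import os

def choose_multiproc_method(in_ray_actor: bool) -> str:
    """Sketch: forking from inside a Ray actor is unsafe, so force "spawn"
    there; otherwise honor VLLM_WORKER_MULTIPROC_METHOD (default "fork")."""
    if in_ray_actor:
        return "spawn"
    return os.environ.get("VLLM_WORKER_MULTIPROC_METHOD", "fork")
```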
Tested with `VLLM_USE_V1=1 python /home/ubuntu/vllm/examples/offline_inference/rlhf.py`; the server started and served requests.