-
Notifications
You must be signed in to change notification settings - Fork 6.9k
[Serve.llm] Adapt vLLM change, use enable_log_requests if disable_log_requests is missing
#55336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Serve.llm] Adapt vLLM change, use enable_log_requests if disable_log_requests is missing
#55336
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @lk-chen, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a breaking change in the vLLM library's API related to request logging. I've updated the vLLM model initialization logic within Ray's LLM module to gracefully handle the transition from a disable_log_requests parameter to an enable_log_requests parameter, ensuring compatibility with newer vLLM versions.
Highlights
- vLLM API Adaptation: I've implemented a compatibility layer for vLLM's logging configuration, specifically addressing a change from disable_log_requests to enable_log_requests.
- Conditional Logic: The code now checks for the presence of disable_log_requests on vllm.engine.arg_utils.AsyncEngineArgs.
- Parameter Conversion: If disable_log_requests is not found, the existing disable_log_requests value is popped from engine_kwargs and its boolean inverse is assigned to enable_log_requests.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm pending release tests - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request aims to adapt to a breaking change in the vLLM API, where disable_log_requests was renamed to enable_log_requests. The implementation correctly handles the backward compatibility. However, I've identified a bug where a user-provided enable_log_requests value would be overwritten by the default logic. I have provided a review comment with a suggested fix to address this issue, ensuring that user configurations are respected.
| disable_log_requests = engine_kwargs.pop("disable_log_requests") | ||
| engine_kwargs["enable_log_requests"] = not disable_log_requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic has a flaw that can cause a user's enable_log_requests setting to be overwritten. The issue stems from interaction with the preceding code (lines 96-100) which sets a default for disable_log_requests. This block then unconditionally uses that value to set enable_log_requests, overriding any user-provided value for the latter.
To fix this, you should only set enable_log_requests if it hasn't already been provided by the user.
| disable_log_requests = engine_kwargs.pop("disable_log_requests") | |
| engine_kwargs["enable_log_requests"] = not disable_log_requests | |
| disable_log_requests = engine_kwargs.pop("disable_log_requests") | |
| if "enable_log_requests" not in engine_kwargs: | |
| engine_kwargs["enable_log_requests"] = not disable_log_requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lk-chen +1? we don't want to overwrite what the user has configured right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right. Seems we need very careful check here, PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix makes sense
Signed-off-by: Linkun <github@lkchen.net>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lk-chen i don't thing it makes sense to do this hacky fixes due to backward compatibility issues introduced with vllm. We should only do forward fixes when we upgrade vllm in ray-llm image. Everytime a vllm is released and upgrade we make sure the interfaces work with that interface.
|
It makes sense to me. If ray has a release at t+1, vllm has a release at t+2. In case of non-backward-compatible change, we submit fix at t+4, and wait for next release at t+7 (or later). We are facing a case where user has no way to try new vllm release between t+2 and t+4 (potentially losing users). And between t+4 and t+7, we have to ask user to try nightly release which I assume most users don't want to / know how to (potentially losing users again). After t+7, user started trying, and hit an exception because we change config name. Some of them try to fix, others lose patience and leave ray. So if I am aware of these breaking change at t+0, I'll just try to be compatible and catch the t+1 release, such that users are happy between t+2 and t+7. |
|
Premerge failures cc @lk-chen
GPU Tests
|
| "Disabling request logging by default. To enable, set to False in engine_kwargs." | ||
| ) | ||
| engine_kwargs["disable_log_requests"] = True | ||
| from vllm.engine.arg_utils import AsyncEngineArgs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a TODO that this is temporary and what should be done to remove it and when (when 0.10.1 is released and upgraded)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| engine_kwargs["disable_log_requests"] = True | ||
| from vllm.engine.arg_utils import AsyncEngineArgs | ||
|
|
||
| both_log_flags_set = ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are over complicating the logic here.
simply do this:
Query the fileds of AynscEngineArgs and see if it is disable_log_requests or enable_log_requests. Then set the corresponding parameter correctly.
Also run the release tests + test on the gptoss wheel manually to make sure it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Linkun <github@lkchen.net>
|
release passed in https://buildkite.com/ray-project/release/builds/52804 |
| logger.info( | ||
| "Disabling request logging by default. To enable, set to False in engine_kwargs." | ||
| ) | ||
| engine_kwargs["disable_log_requests"] = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want to put this in the else below right?
if hasattr(AsyncEngineArgs, "enable_log_requests"):
if "disable_log_requests" in engine_kwargs:
logger.warning(
"disable_log_requests is set in engine_kwargs, but vLLM "
"does not support it. Converting to enable_log_requests."
)
engine_kwargs["enable_log_requests"] = not engine_kwargs.pop(
"disable_log_requests"
)
else:
engine_kwargs["enable_log_requests"] = False
elif "disable_log_requests" not in engine_kwargs:
logger.info(
"Disabling request logging by default. To enable, set to False in engine_kwargs."
)
engine_kwargs["disable_log_requests"] = True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
o.w. in <0.10.1 we won't get disable_log_requests = True at all so requests will be logged by vllm which is a UX change as it will clutter output std.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, fixed
enable_log_requests if disable_log_requests is missingenable_log_requests if disable_log_requests is missing
Signed-off-by: Linkun <github@lkchen.net>
…log_requests` is missing (#55336) Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: sampan <sampan@anyscale.com>
…log_requests` is missing (ray-project#55336) Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: Andrew Grosser <dioptre@gmail.com>
…log_requests` is missing (ray-project#55336) Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
…log_requests` is missing (#55336) Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>

Why are these changes needed?
Related issue number
FIX #55314
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.