
Conversation

@lk-chen (Contributor) commented Aug 7, 2025

Why are these changes needed?

Related issue number

FIX #55314

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Linkun <github@lkchen.net>
@lk-chen lk-chen requested a review from a team as a code owner August 7, 2025 01:53
@lk-chen lk-chen added the go add ONLY when ready to merge, run all tests label Aug 7, 2025
@gemini-code-assist (bot) left a comment

Summary of Changes

Hello @lk-chen, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a breaking change in the vLLM library's API related to request logging. I've updated the vLLM model initialization logic within Ray's LLM module to gracefully handle the transition from a disable_log_requests parameter to an enable_log_requests parameter, ensuring compatibility with newer vLLM versions.

Highlights

  • vLLM API Adaptation: I've implemented a compatibility layer for vLLM's logging configuration, specifically addressing the rename of disable_log_requests to enable_log_requests.
  • Conditional Logic: The code now checks for the presence of disable_log_requests on vllm.engine.arg_utils.AsyncEngineArgs.
  • Parameter Conversion: If the attribute is not found on AsyncEngineArgs, the existing disable_log_requests value is popped from engine_kwargs and its boolean inverse is assigned to enable_log_requests (see the sketch after this list).
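
For illustration, a minimal sketch of the compatibility layer described above; engine_kwargs stands for the user-supplied engine arguments dict, and the branch structure mirrors the approach discussed in this PR rather than reproducing its code verbatim:

```python
from vllm.engine.arg_utils import AsyncEngineArgs

engine_kwargs = {"disable_log_requests": True}  # example user config

# hasattr works here because AsyncEngineArgs exposes its defaulted
# fields as class attributes.
if hasattr(AsyncEngineArgs, "enable_log_requests"):
    # Newer vLLM: translate the legacy flag if the caller passed it.
    if "disable_log_requests" in engine_kwargs:
        engine_kwargs["enable_log_requests"] = not engine_kwargs.pop(
            "disable_log_requests"
        )
else:
    # Older vLLM still understands disable_log_requests directly.
    engine_kwargs.setdefault("disable_log_requests", True)
```
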
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., commenting on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in issue comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@nrghosh (Contributor) left a comment

lgtm pending release tests - thanks!

@gemini-code-assist (bot) left a comment

Code Review

This pull request aims to adapt to a breaking change in the vLLM API, where disable_log_requests was renamed to enable_log_requests. The implementation correctly handles backward compatibility. However, I've identified a bug where a user-provided enable_log_requests value would be overwritten by the default logic. I have provided a review comment with a suggested fix to address this issue, ensuring that user configurations are respected.

Comment on lines 106 to 107
disable_log_requests = engine_kwargs.pop("disable_log_requests")
engine_kwargs["enable_log_requests"] = not disable_log_requests
Severity: high

This logic has a flaw that can cause a user's enable_log_requests setting to be overwritten. The issue stems from interaction with the preceding code (lines 96-100) which sets a default for disable_log_requests. This block then unconditionally uses that value to set enable_log_requests, overriding any user-provided value for the latter.

To fix this, you should only set enable_log_requests if it hasn't already been provided by the user. (A minimal repro of the overwrite follows the suggestion below.)

Suggested change
- disable_log_requests = engine_kwargs.pop("disable_log_requests")
- engine_kwargs["enable_log_requests"] = not disable_log_requests
+ disable_log_requests = engine_kwargs.pop("disable_log_requests")
+ if "enable_log_requests" not in engine_kwargs:
+     engine_kwargs["enable_log_requests"] = not disable_log_requests

@nrghosh (Contributor) Aug 7, 2025

@lk-chen +1? We don't want to overwrite what the user has configured, right?

@lk-chen (Author)

Yes, you're right. It seems we need a very careful check here, PTAL.

Contributor:

fix makes sense

Signed-off-by: Linkun <github@lkchen.net>
@kouroshHakha (Contributor) left a comment

@lk-chen I don't think it makes sense to do these hacky fixes due to backward compatibility issues introduced with vLLM. We should only do forward fixes when we upgrade vLLM in the ray-llm image. Every time a vLLM is released and upgraded, we make sure the interfaces work with that release.

@lk-chen (Author) commented Aug 7, 2025

It makes sense to me.

Suppose Ray has a release at t+1 and vLLM has a release at t+2. In case of a non-backward-compatible change, we submit a fix at t+4 and wait for the next release at t+7 (or later). We are facing a case where users have no way to try the new vLLM release between t+2 and t+4 (potentially losing users). And between t+4 and t+7, we have to ask users to try the nightly release, which I assume most users don't want to / know how to (potentially losing users again). After t+7, users start trying and hit an exception because we changed a config name. Some of them try to fix it; others lose patience and leave Ray.

So if I am aware of the breaking change at t+0, I'll just try to be compatible and catch the t+1 release, so that users are happy between t+2 and t+7.

@nrghosh (Contributor) commented Aug 7, 2025

Premerge failures cc @lk-chen
CPU Tests:
All pass for me locally when I check out this PR - maybe rerun?

  • python/ray/llm/tests/batch/cpu/processor/test_processor_base.py::test_empty_processor
  • python/ray/llm/tests/batch/cpu/processor/test_processor_base.py::test_processor_with_no_preprocess_or_postprocess

GPU Tests

  • /python/ray/llm/tests:batch/gpu/processor/test_vllm_engine_proc [all passed locally]
  • linux://doc/source/llm/examples/batch:vllm-with-structural-output [did not reproduce]
(base) ray@ip-10-0-224-43:~/default/work/ray$ pytest -vvs python/ray/llm/tests/batch/cpu/processor/test_processor_base.py::test_empty_processor
========================= test session starts =========================
...
PASSED
========================= 1 passed in 18.41s ==========================

(base) ray@ip-10-0-224-43:~/default/work/ray$ pytest -vvs python/ray/llm/tests/batch/cpu/processor/test_processor_base.py::test_processor_with_no_preprocess_or_postprocess
========================= test session starts =========================
...
PASSED
========================= 1 passed in 13.18s ==========================

(base) ray@ip-10-0-224-43:~/default/work/ray$ git branch
* adapt_vllm_enable_log_requests
...

(base) ray@ip-10-0-224-43:~/default/work/ray$ pytest -vs python/ray/llm/tests/batch/gpu/processor/test_vllm_engine_proc.py
========================= test session starts =========================
platform linux -- Python 3.11.11, pytest-8.4.1, pluggy-1.5.0 -- /home/ray/anaconda3/bin/python
cachedir: .pytest_cache
rootdir: /home/ray/default/work/ray
configfile: pytest.ini
plugins: asyncio-1.1.0, anyio-3.7.1
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 6 items
...
==================== 6 passed in 213.11s (0:03:33) ====================
(base) ray@ip-10-0-224-43:~/default/work/ray$

"Disabling request logging by default. To enable, set to False in engine_kwargs."
)
engine_kwargs["disable_log_requests"] = True
from vllm.engine.arg_utils import AsyncEngineArgs
Contributor:
Add a TODO noting that this is temporary, what should be done to remove it, and when (once 0.10.1 is released and upgraded).
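
Something along these lines (illustrative wording, not the exact comment that landed in the PR):

```python
# TODO: Temporary shim for vLLM's disable_log_requests -> enable_log_requests
# rename. Remove once the ray-llm image is upgraded to vLLM >= 0.10.1 and
# only the new flag needs to be supported.
```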

@lk-chen (Author): done

engine_kwargs["disable_log_requests"] = True
from vllm.engine.arg_utils import AsyncEngineArgs

both_log_flags_set = (
Contributor:

You are overcomplicating the logic here.

Simply do this:

Query the fields of AsyncEngineArgs and see whether it has disable_log_requests or enable_log_requests, then set the corresponding parameter correctly (see the sketch below).

Also run the release tests + test on the gptoss wheel manually to make sure it works.
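
A sketch of that approach, assuming AsyncEngineArgs is a dataclass (as it is in recent vLLM) and using an example engine_kwargs; the defaults shown are illustrative:

```python
from dataclasses import fields

from vllm.engine.arg_utils import AsyncEngineArgs

engine_kwargs = {}  # example: the user passed no logging flags

# Collect the declared argument names once.
arg_names = {f.name for f in fields(AsyncEngineArgs)}

if "enable_log_requests" in arg_names:
    # Newer vLLM: requests are not logged unless the user opts in.
    engine_kwargs.setdefault("enable_log_requests", False)
else:
    # Older vLLM: the same default, expressed via the legacy flag.
    engine_kwargs.setdefault("disable_log_requests", True)
```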

@lk-chen (Author) Aug 11, 2025

for future reference:
[screenshot attachment]

done, simplified the logic

Signed-off-by: Linkun <github@lkchen.net>
Signed-off-by: Linkun <github@lkchen.net>
@lk-chen lk-chen requested a review from kouroshHakha August 11, 2025 20:08
@lk-chen (Author) commented Aug 11, 2025

Release tests passed in https://buildkite.com/ray-project/release/builds/52804.
Manually verified on vllm:0.10.0+gptoss.
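
For reference, a quick way to probe which flag a given vLLM build exposes (a throwaway snippet, assuming vLLM is importable in the current environment):

```python
import vllm
from vllm.engine.arg_utils import AsyncEngineArgs

print(vllm.__version__)
print("enable_log_requests:", hasattr(AsyncEngineArgs, "enable_log_requests"))
print("disable_log_requests:", hasattr(AsyncEngineArgs, "disable_log_requests"))
```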

logger.info(
    "Disabling request logging by default. To enable, set to False in engine_kwargs."
)
engine_kwargs["disable_log_requests"] = True
Contributor:

we want to put this in the else below, right?

if hasattr(AsyncEngineArgs, "enable_log_requests"):
    if "disable_log_requests" in engine_kwargs:
        logger.warning(
            "disable_log_requests is set in engine_kwargs, but vLLM "
            "does not support it. Converting to enable_log_requests."
        )
        engine_kwargs["enable_log_requests"] = not engine_kwargs.pop(
            "disable_log_requests"
        )
    else:
        engine_kwargs["enable_log_requests"] = False
elif "disable_log_requests" not in engine_kwargs:
    logger.info(
        "Disabling request logging by default. To enable, set to False in engine_kwargs."
    )
    engine_kwargs["disable_log_requests"] = True

@kouroshHakha (Contributor) Aug 11, 2025

Otherwise, on vLLM <0.10.1 we won't get disable_log_requests = True at all, so requests will be logged by vLLM, which is a UX change as it will clutter stdout.

@lk-chen (Author): good catch, fixed

@kouroshHakha kouroshHakha changed the title [llm] Adapt vLLM change, use enable_log_requests if disable_log_requests is missing [Serve.llm] Adapt vLLM change, use enable_log_requests if disable_log_requests is missing Aug 11, 2025
Signed-off-by: Linkun <github@lkchen.net>
@kouroshHakha kouroshHakha enabled auto-merge (squash) August 12, 2025 00:20
@kouroshHakha kouroshHakha merged commit de4412a into ray-project:master Aug 12, 2025
6 checks passed
sampan-s-nayak pushed a commit that referenced this pull request Aug 12, 2025
…log_requests` is missing (#55336)

Signed-off-by: Linkun <github@lkchen.net>
Signed-off-by: sampan <sampan@anyscale.com>
dioptre pushed a commit to sourcetable/ray that referenced this pull request Aug 20, 2025
…log_requests` is missing (ray-project#55336)

Signed-off-by: Linkun <github@lkchen.net>
Signed-off-by: Andrew Grosser <dioptre@gmail.com>
jugalshah291 pushed a commit to jugalshah291/ray_fork that referenced this pull request Sep 11, 2025
…log_requests` is missing (ray-project#55336)

Signed-off-by: Linkun <github@lkchen.net>
Signed-off-by: jugalshah291 <shah.jugal291@gmail.com>
dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
…log_requests` is missing (#55336)

Signed-off-by: Linkun <github@lkchen.net>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>

Labels

go add ONLY when ready to merge, run all tests

Development

Successfully merging this pull request may close these issues.

[serve.llm] deploying gptoss throws an error due to missing disable-log-requests from async engine args
