
[serve.llm] deploying gptoss throws an error due to missing disable-log-requests from async engine args #55314

@kouroshHakha

Description


What happened + What you expected to happen

vllm-project/vllm#21739 hard-deprecated the disable_log_requests parameter without keeping backward compatibility at this layer. As a result, Ray Serve LLM does not work with the latest wheel released with gpt-oss (vLLM 0.10.1). We basically need to upgrade Ray Serve LLM to use enable_log_requests, which will be introduced after 0.10.1 is officially released. Then the Ray nightly will work with gpt-oss deployments.
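The fix is essentially a rename with inverted polarity: the old flag disabled per-request logging, the new one enables it. A minimal sketch of the translation Ray Serve LLM would need before handing kwargs to the vLLM engine args (the helper name normalize_logging_kwargs is hypothetical; the actual fix would live where the engine kwargs are assembled in vllm_models.py):

```python
# Hypothetical compatibility shim: translate the removed vLLM flag
# `disable_log_requests` to its replacement `enable_log_requests`
# before the kwargs reach AsyncEngineArgs. Illustrative only; the
# real Ray Serve LLM fix may differ.

def normalize_logging_kwargs(engine_kwargs: dict) -> dict:
    """Map the hard-deprecated `disable_log_requests` to `enable_log_requests`."""
    kwargs = dict(engine_kwargs)  # avoid mutating the caller's dict
    if "disable_log_requests" in kwargs:
        disabled = kwargs.pop("disable_log_requests")
        # Inverted polarity: disable=True means enable=False.
        # An explicitly set `enable_log_requests` wins over the old flag.
        kwargs.setdefault("enable_log_requests", not disabled)
    return kwargs
```

With this in place, configs written against either vLLM version would pass validation, e.g. {"disable_log_requests": True} becomes {"enable_log_requests": False}.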

To get around this issue, you can comment out this part: https://github.com/ray-project/ray/blob/master/python/ray/llm/_internal/serve/deployments/llm/vllm/vllm_models.py#L100 and build Ray from source.

Versions / Dependencies

ray nightly @commit 9fb05f
vllm 0.10.1+gptoss

Reproduction script

from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="openai/gpt-oss-20b",
    ),
    deployment_config=dict(
        autoscaling_config=dict(
            min_replicas=1, max_replicas=2,
        )
    ),
    # You can customize the engine arguments (e.g. vLLM engine kwargs)
    engine_kwargs=dict(
        tensor_parallel_size=2,
    )
)

app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app, blocking=True)

Issue Severity

Low: It annoys or frustrates me.


Labels

P1 (issue that should be fixed within a few weeks), bug (something that is supposed to be working, but isn't), llm
