
Conversation

eicherseiji (Contributor) commented on Oct 8, 2025

Why are these changes needed?

  • Serve deployment handle routers benefit from access to arbitrary headers on the raw HTTP request
  • The vLLM engine expects access to the raw HTTP headers to extract the request ID (see the sketch below)
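Concretely, once a layer holds the raw starlette Request it can read headers directly. A minimal sketch of the idea (the header name and fallback are illustrative assumptions, not necessarily the exact logic vLLM uses):

```python
import uuid

from fastapi import Request


def resolve_request_id(raw_request: Request) -> str:
    # Prefer a client-supplied ID header; fall back to a fresh UUID.
    # "x-request-id" is an assumed header name for illustration.
    return raw_request.headers.get("x-request-id") or uuid.uuid4().hex
```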

Propagation:

FastAPI Endpoint (router.py)
    ↓ request: Request
LLMRouter methods (chat, completions, embeddings, score)
    ↓ raw_request
_process_llm_request / _get_response
    ↓ raw_request
DeploymentHandle.remote(body, raw_request)
    ↓ raw_request
LLMServer methods
    ↓ raw_request
_run_request
    ↓ raw_request
LLMEngine methods (vLLMEngine)
    ↓ raw_request
vLLM OpenAI serving methods
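Sketched in code, the hand-off looks roughly like this (a simplified illustration of the chain above; `ChatBody` and the handle wiring are stand-ins, not the PR's actual classes):

```python
from dataclasses import dataclass, field

from fastapi import Request


@dataclass
class ChatBody:
    # Stand-in for the parsed OpenAI-style request body.
    model: str
    messages: list = field(default_factory=list)


class LLMRouter:
    # Simplified sketch of the Serve router deployment.

    def __init__(self, handle):
        # `handle` stands in for the Ray Serve DeploymentHandle that
        # targets the LLMServer deployment.
        self._handle = handle

    async def chat(self, body: ChatBody, raw_request: Request):
        return await self._get_response(body, raw_request)

    async def _get_response(self, body: ChatBody, raw_request: Request):
        # DeploymentHandle.remote(body, raw_request): the raw Request
        # rides along with the parsed body so LLMServer._run_request and
        # the vLLM engine can read its headers downstream.
        return await self._handle.chat.remote(body, raw_request)
```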

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
eicherseiji added the go (add ONLY when ready to merge, run all tests) label on Oct 8, 2025
model_config = ConfigDict(arbitrary_types_allowed=True)


class ScoreResponse(vLLMScoreResponse):
eicherseiji (Contributor, Author) commented:
@kouroshHakha do you want to leave these here?
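For context on the hunk above (an aside, not part of the diff): `arbitrary_types_allowed=True` is what lets a Pydantic model declare a field whose type has no Pydantic schema, such as a raw starlette Request. A minimal, hypothetical illustration:

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict
from starlette.requests import Request


class RequestEnvelope(BaseModel):
    # Hypothetical model, for illustration only.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    # Without arbitrary_types_allowed=True, declaring this field raises a
    # schema-generation error at class-definition time, because Pydantic
    # has no validator for starlette's Request type.
    raw_request: Optional[Request] = None
```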

eicherseiji marked this pull request as ready for review on October 9, 2025 03:41
eicherseiji requested a review from a team as a code owner on October 9, 2025 03:41
ray-gardener bot added the serve (Ray Serve Related Issue) and llm labels on Oct 9, 2025
eicherseiji changed the title from "[serve][llm] Remove request ID workaround" to "[serve][llm] Deliver raw HTTP requests to LLM engine" on Oct 9, 2025
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
A review comment from cursor[bot] was marked as outdated.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
eicherseiji marked this pull request as draft on October 10, 2025 00:53
Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Labels

go (add ONLY when ready to merge, run all tests) · llm · serve (Ray Serve Related Issue)

Projects

None yet

Development

1 participant