
Conversation

eicherseiji (Contributor) commented on Oct 8, 2025

Why are these changes needed?

  • Serve deployment handle routers benefit from access to arbitrary headers on the raw HTTP request
  • The vLLM engine expects access to the raw HTTP headers to extract the request ID (see the sketch below)
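Concretely, once a layer holds the raw starlette Request it can read headers directly. A minimal sketch of the idea (the header name and fallback are illustrative assumptions, not necessarily the exact logic vLLM uses):

```python
import uuid

from fastapi import Request


def resolve_request_id(raw_request: Request) -> str:
    # Prefer a client-supplied ID header; fall back to a fresh UUID.
    # "x-request-id" is an assumed header name for illustration.
    return raw_request.headers.get("x-request-id") or uuid.uuid4().hex
```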

Propagation:

FastAPI Endpoint (router.py)
    ↓ request: Request
LLMRouter methods (chat, completions, embeddings, score)
    ↓ raw_request
_process_llm_request / _get_response
    ↓ raw_request
DeploymentHandle.remote(body, raw_request)
    ↓ raw_request
LLMServer methods
    ↓ raw_request
_run_request
    ↓ raw_request
LLMEngine methods (vLLMEngine)
    ↓ raw_request
vLLM OpenAI serving methods
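Sketched in code, the hand-off looks roughly like this (a simplified illustration of the chain above; `ChatBody` and the handle wiring are stand-ins, not the PR's actual classes):

```python
from dataclasses import dataclass, field

from fastapi import Request


@dataclass
class ChatBody:
    # Stand-in for the parsed OpenAI-style request body.
    model: str
    messages: list = field(default_factory=list)


class LLMRouter:
    # Simplified sketch of the Serve router deployment.

    def __init__(self, handle):
        # `handle` stands in for the Ray Serve DeploymentHandle that
        # targets the LLMServer deployment.
        self._handle = handle

    async def chat(self, body: ChatBody, raw_request: Request):
        return await self._get_response(body, raw_request)

    async def _get_response(self, body: ChatBody, raw_request: Request):
        # DeploymentHandle.remote(body, raw_request): the raw Request
        # rides along with the parsed body so LLMServer._run_request and
        # the vLLM engine can read its headers downstream.
        return await self._handle.chat.remote(body, raw_request)
```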

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run pre-commit jobs to lint the changes in this PR. (pre-commit setup)
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
eicherseiji added the go (add ONLY when ready to merge, run all tests) label on Oct 8, 2025
model_config = ConfigDict(arbitrary_types_allowed=True)


class ScoreResponse(vLLMScoreResponse):
eicherseiji (Contributor, Author) commented:
@kouroshHakha do you want to leave these here?
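For context on the hunk above (an aside, not part of the diff): `arbitrary_types_allowed=True` is what lets a Pydantic model declare a field whose type has no Pydantic schema, such as a raw starlette Request. A minimal, hypothetical illustration:

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict
from starlette.requests import Request


class RequestEnvelope(BaseModel):
    # Hypothetical model, for illustration only.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    # Without arbitrary_types_allowed=True, declaring this field raises a
    # schema-generation error at class-definition time, because Pydantic
    # has no validator for starlette's Request type.
    raw_request: Optional[Request] = None
```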

eicherseiji marked this pull request as ready for review on October 9, 2025 03:41
eicherseiji requested a review from a team as a code owner on October 9, 2025 03:41
ray-gardener bot added the serve (Ray Serve Related Issue) and llm labels on Oct 9, 2025
eicherseiji changed the title from "[serve][llm] Remove request ID workaround" to "[serve][llm] Deliver raw HTTP requests to LLM engine" on Oct 9, 2025
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
A review comment from cursor[bot] was marked as outdated.

Signed-off-by: Seiji Eicher <seiji@anyscale.com>
eicherseiji marked this pull request as draft on October 10, 2025 00:53
Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Labels

go (add ONLY when ready to merge, run all tests) · llm · serve (Ray Serve Related Issue)

Projects

None yet

Development

1 participant