
set ContentLength when forwarding requests#717

Open
ericcurtin wants to merge 1 commit into main from some-fix

Conversation

@ericcurtin
Contributor

HTTP/2 clients (e.g. Java HttpClient with HTTP_2 version) often omit the Content-Length header since HTTP/2 uses DATA frames for body framing. When DMR's reverse proxy forwards such requests to the backend via HTTP/1.1, it uses Transfer-Encoding: chunked (ContentLength == -1), which vLLM's Python/uvicorn server fails to parse — resulting in an empty body and a 422 Unprocessable Entity response.

Fix by explicitly setting ContentLength = len(body) on the upstream request after replacing the body with the already-buffered bytes. This ensures a Content-Length header is always sent, consistent with how the Ollama and Anthropic handlers already handle this. llama.cpp was unaffected because its C/C++ HTTP server handles chunked encoding gracefully.


Signed-off-by: Eric Curtin <eric.curtin@docker.com>
Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request addresses an issue where HTTP/2 requests forwarded to backends like vLLM would fail due to the absence of a Content-Length header. The change explicitly sets the ContentLength on the forwarded request after buffering the body. This is a correct and direct fix for the problem, ensuring better compatibility with HTTP/1.1 backends that do not handle chunked transfer encoding well.

Contributor

@sourcery-ai bot left a comment


Hey - I've reviewed your changes and they look great!



@dandyfopper

@ericcurtin - Tested on macOS (Apple Silicon) with the patched binary (v1.1.8-2-g771b9b0a) running standalone on port 13434.

The standalone model-runner binary doesn't appear to support h2c, so I wasn't able to exercise the HTTP/2 code path directly:

- curl --http2 sends HTTP/1.1 with an Upgrade: h2c header — the server doesn't upgrade, and the request still hits vLLM as HTTP/1.1 (422)
- curl --http2-prior-knowledge fails immediately — "peer does not support HTTP/2 properly"
- curl --http1.1 works fine (200 OK, as expected)

It seems like the h2c handling only comes into play inside Docker Desktop's networking/proxy layer, not in the standalone binary. Could you advise on how to test this? Specifically:

- Is there a way to swap the patched binary into Docker Desktop's model-runner?
- Or does Docker Desktop's proxy handle the h2c → HTTP/1.1 translation before it reaches model-runner, meaning the fix needs to be tested inside that stack?

Happy to re-test however you recommend.

Is there a plan to support streaming as well?

@ericcurtin
Contributor Author

ericcurtin commented Feb 27, 2026

Streaming in what way? (sometimes that word is used in different ways in this space)

