[Feature]: support x-request-id header #9593

Open · 1 task done
cjackal (Contributor) opened this issue Oct 22, 2024 · 0 comments · May be fixed by #9594
🚀 The feature, motivation and pitch

Related: #9550

It is common for server admins to trace client requests via a "request id" (cf. this SE post): a unique identifier assigned to each HTTP request that makes it easy to grep for a request's details in the server log.

OpenAI supports this in the form of a response header (cf. API Reference - OpenAI API). Every response from the OpenAI API server carries an x-request-id header, and the openai-python package provides a convenient way to retrieve it. (cf. openai/_models.py)
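The SDK-side convenience can be pictured with a minimal, stdlib-only sketch (the class and property names here are illustrative, not openai-python's actual internals):

```python
from typing import Dict, Optional


class APIResponse:
    """Illustrative wrapper that surfaces the x-request-id response header."""

    def __init__(self, headers: Dict[str, str]):
        # HTTP header names are case-insensitive; normalize to lowercase.
        self._headers = {k.lower(): v for k, v in headers.items()}

    @property
    def request_id(self) -> Optional[str]:
        return self._headers.get("x-request-id")


resp = APIResponse({
    "Content-Type": "application/json",
    "X-Request-Id": "26970bda2e124bb8ad293d3d25d8a4cf",
})
assert resp.request_id == "26970bda2e124bb8ad293d3d25d8a4cf"
```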

In the referenced PR, we observed a need for request identification (a typical use case for x-request-id), so it may be a good time to support this optional HTTP header in online serving.

Suggestion:

  • Each API response is sent with an x-request-id header
  • When an x-request-id header is given in the user request, echo it back in the response header ("idempotency")
  • Otherwise, x-request-id is a random UUID hex value ("compatibility with OpenAI API behavior")
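The three rules above boil down to one small piece of logic; a minimal sketch (the function name is hypothetical, not vLLM's actual code):

```python
import uuid
from typing import Optional


def resolve_request_id(incoming: Optional[str]) -> str:
    """Return the client-supplied X-Request-Id unchanged ("idempotency"),
    or generate a random UUID hex value matching OpenAI's format."""
    if incoming:
        return incoming
    return uuid.uuid4().hex


# Echo behavior: a client-provided id is passed back as-is.
assert resolve_request_id("aaaa") == "aaaa"

# Fallback: a fresh 32-character hex string, like the one in the demo below.
generated = resolve_request_id(None)
assert len(generated) == 32
```

In a FastAPI-based server such as vLLM's, this would naturally live in an HTTP middleware that reads the request header and sets the response header on every route.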

Demo:

Case 1. X-Request-Id header is not given - return random hex

$ curl -v -X POST http://localhost:8000/v1/chat/completions \
> -d '{"model":"meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hi"}]}' \
> -H 'Content-Type: application/json'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 88
> 
< HTTP/1.1 200 OK
< date: Tue, 22 Oct 2024 15:58:16 GMT
< server: uvicorn
< content-length: 230
< content-type: application/json
< x-request-id: 26970bda2e124bb8ad293d3d25d8a4cf
...

Case 2. X-Request-Id header is specified - pass it back

$ curl -v -X POST http://localhost:8000/v1/chat/completions \
> -d '{"model":"meta-llama/Llama-3.2-1B-Instruct","messages":[{"role":"user","content":"Hi"}]}' \
> -H 'X-Request-Id: aaaa' \
> -H 'Content-Type: application/json'
*   Trying 127.0.0.1:8000...
* Connected to localhost (127.0.0.1) port 8000 (#0)
> POST /v1/chat/completions HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
> X-Request-Id: aaaa
> Content-Type: application/json
> Content-Length: 88
> 
< HTTP/1.1 200 OK
< date: Tue, 22 Oct 2024 15:58:16 GMT
< server: uvicorn
< content-length: 230
< content-type: application/json
< x-request-id: aaaa
...

Alternatives

#9550 achieves a similar goal, but this proposal has some advantages:

  • it stays within the OpenAI API spec, while [Frontend] Support custom request_id from request #9550 extends the spec (possibly conflicting with future API changes)
  • the OpenAI SDKs support x-request-id in a way that returns the exact value set by a client (e.g. the response._request_id attribute in openai-python), so no post-processing is needed
  • an HTTP header is easier to handle than the HTTP body, e.g. it is easier to grep the request id from curl output

Additional context

Some other language-model APIs also return a request id (Anthropic, HyperCLOVA).

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom-right corner of the documentation page, which can answer many frequently asked questions.
@cjackal cjackal linked a pull request Oct 22, 2024 that will close this issue