Improve inter-process queue fairness #839

Closed
@deliahu

Description

When using the Uvicorn process manager, requests are assigned to Uvicorn workers randomly. This causes problems, including unbalanced queue wait times and a max concurrency limit that does not work as expected, since each process applies the limit independently.
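
To make the second problem concrete, here is a minimal sketch (not a measurement of Uvicorn itself) that models uniform random assignment and a limit enforced separately by each worker. The function names `assign_randomly` and `rejected` are hypothetical, and the model assumes every request is still in flight when the next one arrives.

```python
import random
from collections import Counter


def assign_randomly(num_requests: int, num_workers: int, seed: int = 0) -> list[int]:
    """Return how many requests each worker receives under uniform random assignment."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(num_workers) for _ in range(num_requests))
    return [counts[w] for w in range(num_workers)]


def rejected(per_worker_load: list[int], per_worker_limit: int) -> int:
    """Count requests that exceed a limit each worker enforces on its own."""
    return sum(max(0, load - per_worker_limit) for load in per_worker_load)
```

With 4 workers and a per-worker limit of 2, the replica could in principle hold 8 requests, but a distribution such as `[4, 0, 2, 2]` still rejects 2 of them while one worker sits idle.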

Possible solutions

NGINX

See #1298

Gunicorn + Uvicorn workers

Consider switching to Gunicorn + Uvicorn workers. Gunicorn is a more full-featured process manager than Uvicorn's built-in one, and may balance requests across processes better.
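
For reference, a Gunicorn configuration file for this setup might look like the sketch below. The bind address, worker count, and backlog value are placeholder assumptions, not values from this project, and whether this actually improves balancing would still need to be tested.

```python
# gunicorn.conf.py (illustrative values only; tune for the deployment)
import multiprocessing

# Address the Gunicorn master listens on (assumed for this sketch).
bind = "0.0.0.0:8000"

# One worker process per CPU core (an assumption, not a recommendation).
workers = multiprocessing.cpu_count()

# Run each worker as a Uvicorn ASGI worker.
worker_class = "uvicorn.workers.UvicornWorker"

# Maximum number of pending connections on the listening socket.
backlog = 2048
```

This would be started with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a hypothetical module and ASGI application name.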

Blockers for switching to Gunicorn

Currently there are two features of the Uvicorn process manager that are not supported by Gunicorn:

  • --limit-concurrency is used to respond with 503s when the user-specified concurrency limit is reached. Uvicorn's documentation currently says "Gunicorn provides a different set of configuration options to Uvicorn, so some options such as --limit-concurrency are not yet supported when running with Gunicorn."
  • It is not currently possible to configure how many threads are used by the Uvicorn worker.
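
One possible workaround for the first blocker is to enforce the limit in the application layer. The sketch below is a hypothetical ASGI middleware (`ConcurrencyLimitMiddleware` is not part of Uvicorn or Gunicorn) that replies with a 503 once a given number of HTTP requests are in flight. Note that it counts per process, so it does not by itself solve the independent-limits problem described above.

```python
class ConcurrencyLimitMiddleware:
    """Hypothetical ASGI middleware: reply 503 once `limit` requests are in flight."""

    def __init__(self, app, limit: int):
        self.app = app
        self.limit = limit
        self.in_flight = 0

    async def __call__(self, scope, receive, send):
        # Pass non-HTTP traffic (lifespan, websocket) through unchanged.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Reject immediately when the per-process limit is reached.
        if self.in_flight >= self.limit:
            await send({
                "type": "http.response.start",
                "status": 503,
                "headers": [(b"content-type", b"text/plain")],
            })
            await send({"type": "http.response.body", "body": b"Service Unavailable"})
            return

        self.in_flight += 1
        try:
            await self.app(scope, receive, send)
        finally:
            # Always release the slot, even if the app raises.
            self.in_flight -= 1
```

Because the check and the increment happen without an intervening `await`, the counter stays consistent within a single event loop.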

Here is the Uvicorn Changelog

Add request forwarder sidecar container

A sidecar container could receive all requests, count in-flight requests, manage max_replica_concurrency, and forward the requests to the application container. Within the application container, we'd use FastAPI on Gunicorn with Uvicorn workers (for worker queue fairness), and configure unlimited backlog and limit-concurrency. Knative does something similar to this.
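
The core of such a sidecar can be sketched as a single shared gate in front of the application container. The `Forwarder` class below is hypothetical, and it assumes that requests over the limit wait in one queue rather than being rejected; the issue does not specify which behavior is intended. The `forward` callable stands in for the actual proxying to the application container.

```python
import asyncio
from typing import Awaitable, Callable


class Forwarder:
    """Hypothetical sidecar core: one shared gate in front of all workers."""

    def __init__(
        self,
        forward: Callable[[bytes], Awaitable[bytes]],
        max_replica_concurrency: int,
    ):
        self._forward = forward
        self._slots = asyncio.Semaphore(max_replica_concurrency)
        self.in_flight = 0  # requests currently forwarded to the application
        self.waiting = 0    # requests queued in the sidecar

    async def handle(self, request: bytes) -> bytes:
        # Wait for a free slot in the single, replica-wide queue.
        self.waiting += 1
        try:
            await self._slots.acquire()
        finally:
            self.waiting -= 1

        self.in_flight += 1
        try:
            return await self._forward(request)
        finally:
            # Release the slot even if forwarding fails.
            self.in_flight -= 1
            self._slots.release()
```

Since the limit is counted in one place for the whole replica, it no longer matters how the application container distributes requests among its workers.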

Labels: enhancement (New feature or request)