
[serve] Make latency buckets configurable #38223

Open
@zcin

Description

The bucket boundaries of the latency histogram heavily affect the P50, P90, and other latency percentiles shown in Grafana, because those percentiles are estimated from per-bucket counts. We should make the buckets configurable so that users who know roughly how long their requests take can choose boundaries that match.
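
To illustrate why the boundaries matter, here is a minimal sketch in plain Python, not Ray Serve's actual implementation. It estimates a percentile from cumulative bucket counts using linear interpolation inside the target bucket, which is the approach Prometheus's `histogram_quantile` takes. The function names, the sample latencies, and both sets of boundaries are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def cumulative_counts(samples, boundaries):
    """Count samples per upper bound (<= boundary), plus a final +Inf bucket,
    and return the running (cumulative) totals."""
    counts = [0] * (len(boundaries) + 1)
    for s in samples:
        # Index of the first boundary >= s; len(boundaries) means +Inf.
        counts[bisect_left(boundaries, s)] += 1
    cumulative, total = [], 0
    for c in counts:
        total += c
        cumulative.append(total)
    return cumulative


def estimate_quantile(q, boundaries, cumulative):
    """Estimate quantile q by linear interpolation within the bucket
    that contains the target rank."""
    rank = q * cumulative[-1]
    for i, c in enumerate(cumulative):
        if c >= rank:
            break
    if i == len(boundaries):
        # Target rank falls in the +Inf bucket: report the highest finite bound.
        return float(boundaries[-1])
    lower = boundaries[i - 1] if i > 0 else 0.0
    upper = boundaries[i]
    prev = cumulative[i - 1] if i > 0 else 0
    in_bucket = c - prev
    if in_bucket == 0:
        return float(upper)
    return lower + (upper - lower) * (rank - prev) / in_bucket


# Hypothetical workload: 100 requests taking between 20 and 29 ms.
samples_ms = [20 + (i % 10) for i in range(100)]

# Hypothetical boundaries (ms): one coarse set, one matched to the workload.
coarse = [10, 100, 1000]
fine = [20, 22, 24, 26, 28, 30]

coarse_p50 = estimate_quantile(0.5, coarse, cumulative_counts(samples_ms, coarse))
fine_p50 = estimate_quantile(0.5, fine, cumulative_counts(samples_ms, fine))

print(coarse_p50)  # 55.0
print(fine_p50)    # 24.0
```

With the coarse boundaries every sample lands in the (10, 100] bucket, so the estimated P50 is the midpoint of that bucket (55 ms) even though no request took longer than 29 ms. With boundaries matched to the workload, the estimate (24 ms) is close to the true median of 24.5 ms.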

Use case

No response


Metadata

Assignees

Labels

P1: Issue that should be fixed within a few weeks
enhancement: Request for new feature and/or capability
ray-team-created: Ray Team created
serve: Ray Serve Related Issue

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests