Description
Is your feature request related to a problem? Please describe.
When running Cortex with tenants ingesting at different rates, a single user can sharply increase their ingestion rate and cause distributors to get OOMKilled, which affects ingestion for all other users. Distributors should support per-user throttling on ingestion to prevent noisy neighbors.
Describe the solution you'd like
Add a per-user token bucket in distributors to throttle ingestion for individual users, similar to the token bucket throttling that store-gateway already has.
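To make the request concrete, here is a minimal sketch of what a per-user token bucket could look like. It is not Cortex code and does not mirror the store-gateway implementation; TenantLimiter, NewTenantLimiter, Allow, and AllowAt are hypothetical names, and the single shared rate and burst for all tenants is an assumption (a real version would presumably read per-tenant overrides).

```go
package main

import (
	"sync"
	"time"
)

// bucket holds the token state for one tenant.
type bucket struct {
	tokens float64   // tokens currently available
	last   time.Time // time of the last refill
}

// TenantLimiter is a hypothetical per-tenant request limiter, not Cortex code.
// Each tenant gets its own token bucket, so a tenant that drains its bucket
// does not consume capacity that belongs to another tenant.
type TenantLimiter struct {
	mu      sync.Mutex
	rate    float64 // tokens added per second, per tenant
	burst   float64 // bucket capacity, per tenant
	buckets map[string]*bucket
}

// NewTenantLimiter creates a limiter with the same rate and burst for every tenant.
func NewTenantLimiter(ratePerSecond, burst float64) *TenantLimiter {
	return &TenantLimiter{
		rate:    ratePerSecond,
		burst:   burst,
		buckets: make(map[string]*bucket),
	}
}

// Allow reports whether a push request from tenant may proceed right now.
func (l *TenantLimiter) Allow(tenant string) bool {
	return l.AllowAt(tenant, time.Now())
}

// AllowAt is Allow with an explicit clock reading, which keeps tests deterministic.
func (l *TenantLimiter) AllowAt(tenant string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	b, ok := l.buckets[tenant]
	if !ok {
		// A tenant seen for the first time starts with a full bucket.
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[tenant] = b
	}

	// Refill in proportion to elapsed time, capped at the bucket capacity.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * l.rate
		if b.tokens > l.burst {
			b.tokens = l.burst
		}
		b.last = now
	}

	// Each request costs one token, regardless of how many samples it carries.
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}
```

In this sketch a distributor would call Allow with the tenant ID at the start of each push and reject the request when it returns false. Charging one token per request rather than per sample is deliberate: it targets the small-batch, high-TPS case that a sample-based limit does not catch.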
Describe alternatives you've considered
I've considered using distributor limits like max_inflight_push_requests to keep distributors out of high CPU and memory situations overall, but then distributors may drop requests from smaller users in the cluster when a large user is taking up most of the ingestion room.
The ingestion_rate limit is also per-user, but it limits the overall number of samples rather than the number of requests, so a user with a small batch size but high TPS could still cause problems.