Documentation for ingester instance limits in runtime configuration #4510

Merged
merged 4 commits into from
Oct 4, 2021
38 changes: 37 additions & 1 deletion docs/configuration/arguments.md
@@ -404,7 +404,7 @@ It also talks to a KVStore and has it's own copies of the same flags used by the

Cortex has a concept of a "runtime config" file, which is simply a file that is reloaded while Cortex is running. Some Cortex components use it to let operators change aspects of the Cortex configuration without restarting it. The file is specified with the `-runtime-config.file=<filename>` flag, and the reload period (which defaults to 10 seconds) can be changed with the `-runtime-config.reload-period=<duration>` flag. Previously this mechanism was used only by limits overrides, and the flags were called `-limits.per-user-override-config=<filename>` and `-limits.per-user-override-period=10s` respectively. These are still used if `-runtime-config.file=<filename>` is not specified.

- At the moment, two components use runtime configuration: limits and multi KV store.
+ At the moment runtime configuration may contain per-user limits, multi KV store, and ingester instance limits.

Example runtime configuration file:

@@ -422,6 +422,10 @@ overrides:
multi_kv_config:
  mirror_enabled: false
  primary: memberlist

ingester_limits:
  max_ingestion_rate: 42000
  max_inflight_push_requests: 10000
```

When running Cortex on Kubernetes, store this file in a config map and mount it in each service's containers. When changing the values, there is no need to restart the services unless otherwise specified.
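
As an illustration of the config map approach, the sketch below stores a runtime configuration file in a ConfigMap and mounts it into a container. The names `cortex-runtime-config`, `runtime.yaml`, `ingester`, and the mount path `/etc/cortex` are assumptions chosen for this example, and the second document is only a fragment of a pod spec, not a complete manifest.

```yaml
# Hypothetical ConfigMap holding the runtime configuration file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cortex-runtime-config   # illustrative name
data:
  runtime.yaml: |               # illustrative file name
    ingester_limits:
      max_ingestion_rate: 42000
      max_inflight_push_requests: 10000
---
# Fragment of a pod spec (assumed layout): mounts the ConfigMap as a
# directory and points Cortex at the file inside it.
spec:
  containers:
    - name: ingester            # illustrative container name
      args:
        - -runtime-config.file=/etc/cortex/runtime.yaml
      volumeMounts:
        - name: runtime-config
          mountPath: /etc/cortex
  volumes:
    - name: runtime-config
      configMap:
        name: cortex-runtime-config
```

The ConfigMap is mounted as a whole directory here rather than with `subPath`, because Kubernetes does not propagate ConfigMap updates to `subPath` mounts, and the reload mechanism depends on the file changing on disk.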
@@ -506,6 +510,38 @@ Valid per-tenant limits are (with their corresponding flags for default values):

Requires `-distributor.replication-factor`, `-distributor.shard-by-all-labels`, `-distributor.sharding-strategy` and `-distributor.zone-awareness-enabled` set for the ingesters too.

## Ingester Instance Limits

Cortex ingesters support limits that are applied per-instance, meaning they apply to each ingester process. These can be used to ensure individual ingesters are not overwhelmed regardless of any per-user limits. These limits can be set under the `ingester.instance_limits` block in the global configuration file, with command line flags, or under the `ingester_limits` field in the runtime configuration file.

An example as part of the runtime configuration file:

```yaml
ingester_limits:
  max_ingestion_rate: 20000
  max_series: 1500000
  max_tenants: 1000
  max_inflight_push_requests: 30000
```
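
For comparison, the same limits placed under the `ingester.instance_limits` block of the global configuration file might look like the sketch below. It assumes the field names in that block match the runtime ones; check them against the configuration file reference for your Cortex version.

```yaml
# Global configuration file (sketch; field names assumed to mirror
# the runtime configuration fields shown above).
ingester:
  instance_limits:
    max_ingestion_rate: 20000
    max_series: 1500000
    max_tenants: 1000
    max_inflight_push_requests: 30000
```

Values in the global configuration file are read at startup, whereas the `ingester_limits` field in the runtime configuration file is picked up on the next reload without a restart.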

Valid ingester instance limits are (with their corresponding flags):

- `max_ingestion_rate` \ `-ingester.instance-limits.max-ingestion-rate`

Limit the ingestion rate in samples per second for an ingester. When this limit is reached, new requests will fail with an HTTP 500 error.

- `max_series` \ `-ingester.instance-limits.max-series`

Limit the total number of series that an ingester keeps in memory, across all users. When this limit is reached, requests that create new series will fail with an HTTP 500 error.

- `max_tenants` \ `-ingester.instance-limits.max-tenants`

Limit the maximum number of users an ingester will accept metrics for. When this limit is reached, requests from new users will fail with an HTTP 500 error.

- `max_inflight_push_requests` \ `-ingester.instance-limits.max-inflight-push-requests`

Limit the maximum number of requests being handled by an ingester at once. This setting is critical for preventing ingesters from using an excessive amount of memory during high load or temporary slowdowns. When this limit is reached, new requests will fail with an HTTP 500 error.

## Storage

- `s3.force-path-style`