
Commit 8a55590

pstibrany and pracucci authored
Frontend scaling (#3374)
* Created new version of frontend package, with separate scheduler component.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Fix roundtripper wrapping.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Review feedback.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Review feedback.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Fixed docs.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Scheduler now sends OK after frontend connects. This allows scheduler to also send shutting down error to frontend immediately. Frontend worker expects OK, and exits FrontendLoop otherwise.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Fixed naming.
  Signed-off-by: Peter Štibraný <peter.stibrany@grafana.com>
* Minor tweaks
  Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
1 parent 83ad6df commit 8a55590
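
The handshake mentioned in the commit message (the scheduler sends OK once a frontend connects, and the frontend worker leaves its loop if it receives anything else) can be sketched in Go as below. This is a minimal illustration, not the code from this commit: the `status` type, its constants, `errSchedulerNotReady` and `frontendLoop` are hypothetical names, and a channel stands in for the gRPC connection that the real components use.

```go
package main

import (
	"errors"
	"fmt"
)

// status is a hypothetical stand-in for the first message a scheduler sends.
type status int

const (
	statusOK status = iota
	statusShuttingDown
)

// errSchedulerNotReady is a hypothetical error for a failed handshake.
var errSchedulerNotReady = errors.New("scheduler did not send OK")

// frontendLoop waits for the first message from the scheduler and only
// proceeds when that message is OK. A closed connection or any other
// status makes the worker exit immediately.
func frontendLoop(fromScheduler <-chan status) error {
	first, open := <-fromScheduler
	if !open || first != statusOK {
		return errSchedulerNotReady
	}
	// A real worker would start forwarding queries here.
	return nil
}

func main() {
	healthy := make(chan status, 1)
	healthy <- statusOK
	fmt.Println("healthy scheduler:", frontendLoop(healthy))

	stopping := make(chan status, 1)
	stopping <- statusShuttingDown
	fmt.Println("stopping scheduler:", frontendLoop(stopping))
}
```

Requiring an explicit OK lets a scheduler that is shutting down reject a new frontend straight away, rather than leaving the frontend to discover the problem on its first request.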

33 files changed, +5944 -304 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
 - `cortex_ingester_tsdb_head_truncations_total`
 - `cortex_ingester_tsdb_head_gc_duration_seconds`
 * [ENHANCEMENT] Added `cortex_alertmanager_config_hash` metric to expose hash of Alertmanager Config loaded per user. #3388
+* [ENHANCEMENT] Query-Frontend / Query-Scheduler: New component called "Query-Scheduler" has been introduced. Query-Scheduler is simply a queue of requests, moved outside of Query-Frontend. This allows Query-Frontend to be scaled separately from number of queues. To make Query-Frontend and Querier use Query-Scheduler, they need to be started with `-frontend.scheduler-address` and `-querier.scheduler-address` options respectively. #3374

 ## 1.5.0 in progress
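
The changelog entry describes the Query-Scheduler as a queue of requests, and the config reference in this commit adds a per-tenant limit (`max_outstanding_requests_per_tenant`) above which requests fail with HTTP 429. A minimal Go sketch of that idea follows. It assumes a plain FIFO per tenant; `tenantQueues`, `Enqueue`, `Dequeue` and `ErrTooManyRequests` are hypothetical names, not the types in the `frontend2` package.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrTooManyRequests is a hypothetical error that an HTTP layer could
// translate into status code 429.
var ErrTooManyRequests = errors.New("too many outstanding requests")

// tenantQueues keeps one FIFO queue of pending request IDs per tenant.
// It is an illustration only, not the queue used by Cortex.
type tenantQueues struct {
	mu             sync.Mutex
	maxOutstanding int
	queues         map[string][]string
}

func newTenantQueues(maxOutstanding int) *tenantQueues {
	return &tenantQueues{
		maxOutstanding: maxOutstanding,
		queues:         map[string][]string{},
	}
}

// Enqueue adds a request for the tenant, or rejects it when the tenant
// already has maxOutstanding requests waiting.
func (q *tenantQueues) Enqueue(tenant, request string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.queues[tenant]) >= q.maxOutstanding {
		return ErrTooManyRequests
	}
	q.queues[tenant] = append(q.queues[tenant], request)
	return nil
}

// Dequeue removes the oldest request for the tenant. The second result
// is false when the tenant has nothing queued.
func (q *tenantQueues) Dequeue(tenant string) (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	pending := q.queues[tenant]
	if len(pending) == 0 {
		return "", false
	}
	q.queues[tenant] = pending[1:]
	return pending[0], true
}

func main() {
	q := newTenantQueues(2)
	for _, id := range []string{"q1", "q2", "q3"} {
		if err := q.Enqueue("tenant-a", id); err != nil {
			fmt.Println(id, "rejected:", err)
			continue
		}
		fmt.Println(id, "queued")
	}
}
```

Because the limit is tracked per tenant, one tenant filling its queue does not stop another tenant from enqueuing requests.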

development/tsdb-blocks-storage-s3/config/cortex.yaml

Lines changed: 4 additions & 0 deletions
@@ -120,6 +120,10 @@ store_gateway:

 frontend_worker:
   frontend_address: "query-frontend:9007"
+  match_max_concurrent: true
+
+  # By setting scheduler_address, the querier worker uses the scheduler instead of the frontend.
+  # scheduler_address: "query-scheduler:9011"

 query_range:
   split_queries_by_interval: 24h
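
The config reference in this commit states that `scheduler_address` takes precedence over `frontend_address`, and that with neither set, queries must arrive over the HTTP endpoint. A small Go sketch of that decision is shown below. The `workerConfig` struct, the `querySource` type and `pickSource` are hypothetical names chosen for illustration; only the two option names come from the commit.

```go
package main

import "fmt"

// workerConfig holds the two addresses from the frontend_worker block.
type workerConfig struct {
	FrontendAddress  string
	SchedulerAddress string
}

// querySource names where a querier receives its queries from.
type querySource string

const (
	sourceScheduler querySource = "query-scheduler"
	sourceFrontend  querySource = "query-frontend"
	sourceHTTP      querySource = "http"
)

// pickSource applies the documented precedence: the scheduler address
// wins over the frontend address, and with neither set the querier only
// serves queries that arrive via its HTTP endpoint.
func pickSource(cfg workerConfig) (querySource, string) {
	switch {
	case cfg.SchedulerAddress != "":
		return sourceScheduler, cfg.SchedulerAddress
	case cfg.FrontendAddress != "":
		return sourceFrontend, cfg.FrontendAddress
	default:
		return sourceHTTP, ""
	}
}

func main() {
	cfg := workerConfig{
		FrontendAddress:  "query-frontend:9007",
		SchedulerAddress: "query-scheduler:9011",
	}
	source, addr := pickSource(cfg)
	fmt.Println(source, addr)
}
```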

development/tsdb-blocks-storage-s3/config/grafana-agent.yaml

Lines changed: 8 additions & 2 deletions
@@ -25,7 +25,7 @@ prometheus:
                 namespace: 'tsdb-blocks-storage-s3'
         - job_name: tsdb-blocks-storage-s3/querier
           static_configs:
-            - targets: ['querier:8004']
+            - targets: ['querier:8004', 'querier-with-scheduler:8013']
               labels:
                 cluster: 'docker-compose'
                 namespace: 'tsdb-blocks-storage-s3'
@@ -43,7 +43,7 @@ prometheus:
                 namespace: 'tsdb-blocks-storage-s3'
         - job_name: tsdb-blocks-storage-s3/query-frontend
           static_configs:
-            - targets: ['query-frontend:8007']
+            - targets: ['query-frontend:8007', 'query-frontend-with-scheduler:8012']
               labels:
                 cluster: 'docker-compose'
                 namespace: 'tsdb-blocks-storage-s3'
@@ -53,6 +53,12 @@ prometheus:
               labels:
                 cluster: 'docker-compose'
                 namespace: 'tsdb-blocks-storage-s3'
+        - job_name: tsdb-blocks-storage-s3/query-scheduler
+          static_configs:
+            - targets: ['query-scheduler:8011']
+              labels:
+                cluster: 'docker-compose'
+                namespace: 'tsdb-blocks-storage-s3'

       remote_write:
         - url: http://distributor:8001/api/prom/push

development/tsdb-blocks-storage-s3/config/prometheus.yaml

Lines changed: 8 additions & 2 deletions
@@ -18,7 +18,7 @@ scrape_configs:
           namespace: 'tsdb-blocks-storage-s3'
   - job_name: tsdb-blocks-storage-s3/querier
     static_configs:
-      - targets: ['querier:8004']
+      - targets: ['querier:8004', 'querier-with-scheduler:8013']
         labels:
           cluster: 'docker-compose'
           namespace: 'tsdb-blocks-storage-s3'
@@ -36,7 +36,7 @@ scrape_configs:
           namespace: 'tsdb-blocks-storage-s3'
   - job_name: tsdb-blocks-storage-s3/query-frontend
     static_configs:
-      - targets: ['query-frontend:8007']
+      - targets: ['query-frontend:8007', 'query-frontend-with-scheduler:8012']
         labels:
           cluster: 'docker-compose'
           namespace: 'tsdb-blocks-storage-s3'
@@ -46,6 +46,12 @@ scrape_configs:
         labels:
           cluster: 'docker-compose'
           namespace: 'tsdb-blocks-storage-s3'
+  - job_name: tsdb-blocks-storage-s3/query-scheduler
+    static_configs:
+      - targets: ['query-scheduler:8011']
+        labels:
+          cluster: 'docker-compose'
+          namespace: 'tsdb-blocks-storage-s3'

 remote_write:
   - url: http://distributor:8001/api/prom/push

development/tsdb-blocks-storage-s3/docker-compose.yml

Lines changed: 65 additions & 0 deletions
@@ -271,3 +271,68 @@ services:
       - 18022:18022
     volumes:
       - ./config:/cortex/config
+
+  query-scheduler:
+    build:
+      context: .
+      dockerfile: dev.dockerfile
+    image: cortex
+    command: ["sh", "-c", "sleep 3 && exec ./dlv exec ./cortex --listen=:18011 --headless=true --api-version=2 --accept-multiclient --continue -- -config.file=./config/cortex.yaml -target=query-scheduler -server.http-listen-port=8011 -server.grpc-listen-port=9011 -store.max-query-length=8760h -log.level=debug"]
+    depends_on:
+      - consul
+      - minio
+    environment:
+      - JAEGER_AGENT_HOST=jaeger
+      - JAEGER_AGENT_PORT=6831
+      - JAEGER_TAGS=app=query-scheduler
+      - JAEGER_SAMPLER_TYPE=const
+      - JAEGER_SAMPLER_PARAM=1
+    ports:
+      - 8011:8011
+      - 18011:18011
+    volumes:
+      - ./config:/cortex/config
+
+  # This frontend uses query-scheduler, activated by `-frontend.scheduler-address` option.
+  query-frontend-with-scheduler:
+    build:
+      context: .
+      dockerfile: dev.dockerfile
+    image: cortex
+    command: ["sh", "-c", "sleep 3 && exec ./dlv exec ./cortex --listen=:18012 --headless=true --api-version=2 --accept-multiclient --continue -- -config.file=./config/cortex.yaml -target=query-frontend -server.http-listen-port=8012 -server.grpc-listen-port=9012 -store.max-query-length=8760h -frontend.scheduler-address=query-scheduler:9011 -log.level=debug"]
+    depends_on:
+      - consul
+      - minio
+    environment:
+      - JAEGER_AGENT_HOST=jaeger
+      - JAEGER_AGENT_PORT=6831
+      - JAEGER_TAGS=app=query-frontend2
+      - JAEGER_SAMPLER_TYPE=const
+      - JAEGER_SAMPLER_PARAM=1
+    ports:
+      - 8012:8012
+      - 18012:18012
+    volumes:
+      - ./config:/cortex/config
+
+  # This querier is connecting to query-scheduler, instead of query-frontend. This is achieved by setting -querier.scheduler-address="..."
+  querier-with-scheduler:
+    build:
+      context: .
+      dockerfile: dev.dockerfile
+    image: cortex
+    command: ["sh", "-c", "sleep 3 && exec ./dlv exec ./cortex --listen=:18013 --headless=true --api-version=2 --accept-multiclient --continue -- -config.file=./config/cortex.yaml -target=querier -server.http-listen-port=8013 -server.grpc-listen-port=9013 -querier.scheduler-address=query-scheduler:9011 -log.level=debug"]
+    depends_on:
+      - consul
+      - minio
+    environment:
+      - JAEGER_AGENT_HOST=jaeger
+      - JAEGER_AGENT_PORT=6831
+      - JAEGER_TAGS=app=querier-scheduler
+      - JAEGER_SAMPLER_TYPE=const
+      - JAEGER_SAMPLER_PARAM=1
+    ports:
+      - 8013:8013
+      - 18013:18013
+    volumes:
+      - ./config:/cortex/config

docs/configuration/config-file-reference.md

Lines changed: 118 additions & 14 deletions
@@ -157,6 +157,13 @@ runtime_config:

 # The memberlist_config configures the Gossip memberlist.
 [memberlist: <memberlist_config>]
+
+query_scheduler:
+  # Maximum number of outstanding requests per tenant per query-scheduler.
+  # In-flight requests above this limit will fail with HTTP response status code
+  # 429.
+  # CLI flag: -query-scheduler.max-outstanding-requests-per-tenant
+  [max_outstanding_requests_per_tenant: <int> | default = 100]
 ```

 ### `server_config`
@@ -757,27 +764,109 @@ store_gateway_client:
 The `query_frontend_config` configures the Cortex query-frontend.

 ```yaml
+# Log queries that are slower than the specified duration. Set to 0 to disable.
+# Set to < 0 to enable on all queries.
+# CLI flag: -frontend.log-queries-longer-than
+[log_queries_longer_than: <duration> | default = 0s]
+
+# Max body size for downstream prometheus.
+# CLI flag: -frontend.max-body-size
+[max_body_size: <int> | default = 10485760]
+
 # Maximum number of outstanding requests per tenant per frontend; requests
 # beyond this error with HTTP 429.
 # CLI flag: -querier.max-outstanding-requests-per-tenant
 [max_outstanding_per_tenant: <int> | default = 100]

+# DNS hostname used for finding query-schedulers.
+# CLI flag: -frontend.scheduler-address
+[scheduler_address: <string> | default = ""]
+
+# How often to resolve the scheduler-address, in order to look for new
+# query-scheduler instances.
+# CLI flag: -frontend.scheduler-dns-lookup-period
+[scheduler_dns_lookup_period: <duration> | default = 10s]
+
+# Number of concurrent workers forwarding queries to single query-scheduler.
+# CLI flag: -frontend.scheduler-worker-concurrency
+[scheduler_worker_concurrency: <int> | default = 5]
+
+grpc_client_config:
+  # gRPC client max receive message size (bytes).
+  # CLI flag: -frontend.grpc-client-config.grpc-max-recv-msg-size
+  [max_recv_msg_size: <int> | default = 104857600]
+
+  # gRPC client max send message size (bytes).
+  # CLI flag: -frontend.grpc-client-config.grpc-max-send-msg-size
+  [max_send_msg_size: <int> | default = 16777216]
+
+  # Deprecated: Use gzip compression when sending messages. If true, overrides
+  # grpc-compression flag.
+  # CLI flag: -frontend.grpc-client-config.grpc-use-gzip-compression
+  [use_gzip_compression: <boolean> | default = false]
+
+  # Use compression when sending messages. Supported values are: 'gzip',
+  # 'snappy' and '' (disable compression)
+  # CLI flag: -frontend.grpc-client-config.grpc-compression
+  [grpc_compression: <string> | default = ""]
+
+  # Rate limit for gRPC client; 0 means disabled.
+  # CLI flag: -frontend.grpc-client-config.grpc-client-rate-limit
+  [rate_limit: <float> | default = 0]
+
+  # Rate limit burst for gRPC client.
+  # CLI flag: -frontend.grpc-client-config.grpc-client-rate-limit-burst
+  [rate_limit_burst: <int> | default = 0]
+
+  # Enable backoff and retry when we hit ratelimits.
+  # CLI flag: -frontend.grpc-client-config.backoff-on-ratelimits
+  [backoff_on_ratelimits: <boolean> | default = false]
+
+  backoff_config:
+    # Minimum delay when backing off.
+    # CLI flag: -frontend.grpc-client-config.backoff-min-period
+    [min_period: <duration> | default = 100ms]
+
+    # Maximum delay when backing off.
+    # CLI flag: -frontend.grpc-client-config.backoff-max-period
+    [max_period: <duration> | default = 10s]
+
+    # Number of times to backoff and retry before failing.
+    # CLI flag: -frontend.grpc-client-config.backoff-retries
+    [max_retries: <int> | default = 10]
+
+  # Path to the client certificate file, which will be used for authenticating
+  # with the server. Also requires the key path to be configured.
+  # CLI flag: -frontend.grpc-client-config.tls-cert-path
+  [tls_cert_path: <string> | default = ""]
+
+  # Path to the key file for the client certificate. Also requires the client
+  # certificate to be configured.
+  # CLI flag: -frontend.grpc-client-config.tls-key-path
+  [tls_key_path: <string> | default = ""]
+
+  # Path to the CA certificates file to validate server certificate against. If
+  # not set, the host's root CA certificates are used.
+  # CLI flag: -frontend.grpc-client-config.tls-ca-path
+  [tls_ca_path: <string> | default = ""]
+
+  # Skip validating server certificate.
+  # CLI flag: -frontend.grpc-client-config.tls-insecure-skip-verify
+  [tls_insecure_skip_verify: <boolean> | default = false]
+
+# Name of network interface to read address from. This address is sent to
+# query-scheduler and querier, which uses it to send the query response back to
+# query-frontend.
+# CLI flag: -frontend.instance-interface-names
+[instance_interface_names: <list of string> | default = [eth0 en0]]
+
 # Compress HTTP responses.
 # CLI flag: -querier.compress-http-responses
 [compress_responses: <boolean> | default = false]

 # URL of downstream Prometheus.
 # CLI flag: -frontend.downstream-url
 [downstream_url: <string> | default = ""]
-
-# Max body size for downstream prometheus.
-# CLI flag: -frontend.max-body-size
-[max_body_size: <int> | default = 10485760]
-
-# Log queries that are slower than the specified duration. Set to 0 to disable.
-# Set to < 0 to enable on all queries.
-# CLI flag: -frontend.log-queries-longer-than
-[log_queries_longer_than: <duration> | default = 0s]
 ```

 ### `query_range_config`
@@ -2455,7 +2544,10 @@ grpc_client_config:
 The `frontend_worker_config` configures the worker - running within the Cortex querier - picking up and executing queries enqueued by the query-frontend.

 ```yaml
-# Address of query frontend service, in host:port format.
+# Address of query frontend service, in host:port format. If
+# -querier.scheduler-address is set as well, querier will use scheduler instead.
+# If neither -querier.frontend-address nor -querier.scheduler-address is set,
+# queries must arrive via HTTP endpoint.
 # CLI flag: -querier.frontend-address
 [frontend_address: <string> | default = ""]

@@ -2539,6 +2631,17 @@ grpc_client_config:
   # Skip validating server certificate.
   # CLI flag: -querier.frontend-client.tls-insecure-skip-verify
   [tls_insecure_skip_verify: <boolean> | default = false]
+
+# Hostname (and port) of scheduler that querier will periodically resolve,
+# connect to and receive queries from. If set, takes precedence over
+# -querier.frontend-address.
+# CLI flag: -querier.scheduler-address
+[scheduler_address: <string> | default = ""]
+
+# How often to resolve the scheduler-address, in order to look for new
+# query-scheduler instances.
+# CLI flag: -querier.scheduler-dns-lookup-period
+[scheduler_dns_lookup_period: <duration> | default = 10s]
 ```

 ### `etcd_config`
@@ -2904,10 +3007,11 @@ The `limits_config` configures default and per-tenant limits imposed by Cortex s

 # Maximum number of queriers that can handle requests for a single tenant. If
 # set to 0 or value higher than number of available queriers, *all* queriers
-# will handle requests for the tenant. Each frontend will select the same set of
-# queriers for the same tenant (given that all queriers are connected to all
-# frontends). This option only works with queriers connecting to the
-# query-frontend, not when using downstream URL.
+# will handle requests for the tenant. Each frontend (or query-scheduler, if
+# used) will select the same set of queriers for the same tenant (given that all
+# queriers are connected to all frontends / query-schedulers). This option only
+# works with queriers connecting to the query-frontend / query-scheduler, not
+# when using downstream URL.
 # CLI flag: -frontend.max-queriers-per-tenant
 [max_queriers_per_tenant: <int> | default = 0]


docs/configuration/v1-guarantees.md

Lines changed: 1 addition & 0 deletions
@@ -55,3 +55,4 @@ Currently experimental features are:
 - Blocksconvert tools
 - OpenStack Swift storage support.
 - Metric relabeling in the distributor.
+- Scalable query-frontend (when using query-scheduler)

docs/guides/shuffle-sharding.md

Lines changed: 1 addition & 1 deletion
@@ -80,7 +80,7 @@ _The shard size can be overridden on a per-tenant basis in the limits overrides

 By default all Cortex queriers can execute received queries for given tenant.

-When shuffle sharding is **enabled** by setting `-frontend.max-queriers-per-tenant` (or its respective YAML config option) to a value higher than 0 and lower than the number of available queriers, only specified number of queriers will execute queries for single tenant. Note that this distribution happens in query-frontend. When not using query-frontend, this option is not available.
+When shuffle sharding is **enabled** by setting `-frontend.max-queriers-per-tenant` (or its respective YAML config option) to a value higher than 0 and lower than the number of available queriers, only specified number of queriers will execute queries for single tenant. Note that this distribution happens in query-frontend, or query-scheduler if used. When using query-scheduler, `-frontend.max-queriers-per-tenant` option must be set for query-scheduler component. When not using query-frontend (with or without scheduler), this option is not available.

 _The maximum number of queriers can be overridden on a per-tenant basis in the limits overrides configuration._
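
The guide text above says that every query-frontend (or query-scheduler) picks the same subset of queriers for a given tenant. One way to get that property is to derive the choice only from the tenant ID and the sorted list of querier IDs, as in the Go sketch below. This illustrates the idea and is not the algorithm Cortex uses: `selectQueriers` is a hypothetical function, and the FNV hash and seeded shuffle are assumptions made for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
	"sort"
)

// selectQueriers returns the queriers allowed to run queries for a tenant.
// With maxQueriers set to 0, or not lower than the number of queriers,
// all queriers are returned. Otherwise the result depends only on the
// tenant ID and the set of querier IDs, so every caller that sees the
// same queriers picks the same subset.
func selectQueriers(tenant string, queriers []string, maxQueriers int) []string {
	// Sort a copy so the input order does not influence the result.
	sorted := append([]string(nil), queriers...)
	sort.Strings(sorted)
	if maxQueriers <= 0 || maxQueriers >= len(sorted) {
		return sorted
	}

	// Seed a private random source from the tenant ID.
	h := fnv.New64a()
	h.Write([]byte(tenant))
	rnd := rand.New(rand.NewSource(int64(h.Sum64())))

	rnd.Shuffle(len(sorted), func(i, j int) {
		sorted[i], sorted[j] = sorted[j], sorted[i]
	})
	picked := sorted[:maxQueriers]
	sort.Strings(picked)
	return picked
}

func main() {
	queriers := []string{"querier-1", "querier-2", "querier-3", "querier-4"}
	fmt.Println(selectQueriers("tenant-a", queriers, 2))
	fmt.Println(selectQueriers("tenant-b", queriers, 2))
	fmt.Println(selectQueriers("tenant-a", queriers, 0))
}
```

The selection only agrees across components when they all see the same queriers, which matches the caveat in the config reference that all queriers must be connected to all frontends or query-schedulers.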

pkg/api/api.go

Lines changed: 15 additions & 2 deletions
@@ -22,6 +22,7 @@ import (
 	"github.com/cortexproject/cortex/pkg/ingester/client"
 	"github.com/cortexproject/cortex/pkg/querier"
 	"github.com/cortexproject/cortex/pkg/querier/frontend"
+	"github.com/cortexproject/cortex/pkg/querier/frontend2"
 	"github.com/cortexproject/cortex/pkg/ring"
 	"github.com/cortexproject/cortex/pkg/ruler"
 	"github.com/cortexproject/cortex/pkg/storegateway"
@@ -308,9 +309,21 @@ func (a *API) RegisterQueryAPI(handler http.Handler) {
 // RegisterQueryFrontend registers the Prometheus routes supported by the
 // Cortex querier service. Currently this can not be registered simultaneously
 // with the Querier.
-func (a *API) RegisterQueryFrontend(f *frontend.Frontend) {
+func (a *API) RegisterQueryFrontendHandler(h http.Handler) {
+	a.RegisterQueryAPI(h)
+}
+
+func (a *API) RegisterQueryFrontend1(f *frontend.Frontend) {
 	frontend.RegisterFrontendServer(a.server.GRPC, f)
-	a.RegisterQueryAPI(f.Handler())
+}
+
+func (a *API) RegisterQueryFrontend2(f *frontend2.Frontend2) {
+	frontend2.RegisterFrontendForQuerierServer(a.server.GRPC, f)
+}
+
+func (a *API) RegisterQueryScheduler(f *frontend2.Scheduler) {
+	frontend2.RegisterSchedulerForFrontendServer(a.server.GRPC, f)
+	frontend2.RegisterSchedulerForQuerierServer(a.server.GRPC, f)
 }

 // RegisterServiceMapHandler registers the Cortex structs service handler
