Commit 9cf31fe

gouthamve and pracucci authored
Remove support schema flags, only use config file. (#2221)
* Remove support schema flags, only use config file. Also rename the schema config file flag to something sane. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Update docs to use the schema file. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Deprecate not remove the config flag. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Make integration tests pass? Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Remove support schema flags, only use config file. Also rename the schema config file flag to something sane. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Update docs to use the schema file. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Deprecate not remove the config flag. Signed-off-by: Goutham Veeramachaneni <gouthamve@gmail.com>
* Update docs/configuration/schema-config-reference.md Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Fixed schema config doc Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Fixes after rebase Signed-off-by: Marco Pracucci <marco@pracucci.com>
* Remove duplicated entry from CHANGELOG Signed-off-by: Marco Pracucci <marco@pracucci.com>

Co-authored-by: Marco Pracucci <marco@pracucci.com>
1 parent 04d2abe commit 9cf31fe

14 files changed: 221 additions and 178 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 
 ## master / unreleased
 
+* [CHANGE] Removed support for flags to configure schema. Further, the flag for specifying the config file (`-config-yaml`) has been deprecated. Please use `-schema-config-file` instead. See https://cortexmetrics.io/docs/configuration/schema-configuration/ for more details on how to configure the schema using the YAML file. #2221
 * [CHANGE] The frontend http server will now send 502 in case of deadline exceeded and 499 if the user requested cancellation. #2156
 * [CHANGE] Config file changed to remove top level `config_store` field in favor of a nested `configdb` field. #2125
 * [CHANGE] We now enforce queries to be up to `-querier.max-query-into-future` into the future (defaults to 10m). #1929

docs/configuration/schema-config-reference.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+---
+title: "Schema Configuration"
+linkTitle: "Schema Configuration"
+weight: 1
+slug: schema-configuration
+---
+
+Cortex uses a NoSQL store to store its index and, optionally, an object store to store its chunks. Over time, Cortex has evolved its schema to better fit the use cases and query patterns that arose.
+
+Currently there are 9 schemas in use in production, but we recommend running with the `v9` schema when possible. You can move from one schema to another if a new schema fits your purpose better, but you still need to configure Cortex so that it can read the old data written with the old schemas.
+
+You can configure the schemas using a YAML config file, which you point to with the `-schema-config-file` flag. It has the following YAML spec:
+
+```yaml
+configs: []<period_config>
+```
+
+Where `period_config` is
+```
+# In YYYY-MM-DD format, for example: 2020-03-01.
+from: <string>
+# The index client to use, valid options: aws-dynamo, bigtable, bigtable-hashed, cassandra, boltdb.
+store: <string>
+# The object client to use. If none is specified, `store` is used for storing chunks as well. Valid options: s3, aws-dynamo, bigtable, bigtable-hashed, gcs, cassandra, filesystem.
+object_store: <string>
+# The schema version to use. Valid ones are v1, v2, v3,... v6, v9, v10, v11. Recommended for production: v9.
+schema: <string>
+index: <periodic_table_config>
+chunks: <periodic_table_config>
+```
+
+Where `periodic_table_config` is
+```
+# The prefix to use for the tables.
+prefix: <string>
+# We typically run Cortex with new tables every week to keep the index size low and to make retention easier. This sets the period at which new tables are created and used. Typically 168h (1 week).
+period: <duration>
+# The tags that can be set on the dynamo table.
+tags: <map[string]string>
+```
+
+An example of this file, which is also a recommended starting point, is:
+```
+configs:
+  - from: "2020-03-01" # Or typically a week before the Cortex cluster was created.
+    schema: v9
+    index:
+      period: 168h
+      prefix: cortex_index_
+    # Chunks section is optional and required only if you're not using a
+    # separate object store.
+    chunks:
+      period: 168h
+      prefix: cortex_chunks
+    store: aws-dynamo/bigtable-hashed/cassandra/boltdb
+    object_store: <above options>/s3/gcs/azure/filesystem
+```
+
+An example of an advanced schema file with a lot of changes:
+```
+configs:
+  # Starting from 2018-08-23 Cortex should store chunks and indexes
+  # on Google BigTable using weekly periodic tables. The chunks table
+  # names will be prefixed with "dev_chunks_", while index tables will be
+  # prefixed with "dev_index_".
+  - from: "2018-08-23"
+    schema: v9
+    chunks:
+      period: 168h0m0s
+      prefix: dev_chunks_
+    index:
+      period: 168h0m0s
+      prefix: dev_index_
+    store: gcp-columnkey
+
+  # Starting 2019-02-13 we moved from BigTable to GCS for storing the chunks.
+  - from: "2019-02-13"
+    schema: v9
+    chunks:
+      period: 168h
+      prefix: dev_chunks_
+    index:
+      period: 168h
+      prefix: dev_index_
+    object_store: gcs
+    store: gcp-columnkey
+
+  # Starting 2019-02-24 we moved our index from bigtable-columnkey to bigtable-hashed
+  # which improves the distribution of keys.
+  - from: "2019-02-24"
+    schema: v9
+    chunks:
+      period: 168h
+      prefix: dev_chunks_
+    index:
+      period: 168h
+      prefix: dev_index_
+    object_store: gcs
+    store: bigtable-hashed
+
+  # Starting 2019-03-05 we moved from v9 schema to v10 schema.
+  - from: "2019-03-05"
+    schema: v10
+    chunks:
+      period: 168h
+      prefix: dev_chunks_
+    index:
+      period: 168h
+      prefix: dev_index_
+    object_store: gcs
+    store: bigtable-hashed
+```
+
+Note how we started out with v9 and just Bigtable, later migrated to GCS as the object store, and finally moved to v10. This is a complex schema file showing several changes over time; a typical schema config file has just one or two schema versions.
+
+### Migrating from flags to schema file
+
+Legacy versions of Cortex supported configuring the schema via flags. If you are still using flags, you need to migrate your configuration from flags to the config file.
+
+If you're using:
+
+* `chunk.storage-client`: then set the corresponding `object_store` field correctly in the schema file.
+* `dynamodb.daily-buckets-from`: then set the corresponding `from` date with `v2` schema.
+* `dynamodb.base64-buckets-from`: then set the corresponding `from` date with `v3` schema.
+* `dynamodb.v{4,5,6,9}-schema-from`: then set the corresponding `from` date with schema `v{4,5,6,9}`.
+* `bigtable.column-key-from`: then set the corresponding `from` date and use the `store` as `bigtable-columnkey`.
+* `dynamodb.use-periodic-tables`: then set the right `index` and `chunks` fields with corresponding values from `dynamodb.periodic-table.{prefix, period, tag}` and `dynamodb.chunk-table.{prefix, period, tag}` flags. Note that the default period is 7 days, so please set the `period` as `168h` in the config file if none is set in the flags.

docs/guides/aws-specific.md

Lines changed: 1 addition & 5 deletions
@@ -51,10 +51,8 @@ example set of command-line parameters from a fairly modest install:
    -metrics.url=http://prometheus.monitoring.svc.cluster.local./api/prom/
    -metrics.target-queue-length=100000
    -dynamodb.url=dynamodb://us-east-1/
-   -dynamodb.use-periodic-tables=true
+   -schema-config-file=/etc/schema.yaml
 
-   -dynamodb.periodic-table.prefix=cortex_index_
-   -dynamodb.periodic-table.from=2019-05-02
    -dynamodb.periodic-table.write-throughput=1000
    -dynamodb.periodic-table.write-throughput.scale.enabled=true
    -dynamodb.periodic-table.write-throughput.scale.min-capacity=200
@@ -64,8 +62,6 @@ example set of command-line parameters from a fairly modest install:
    -dynamodb.periodic-table.read-throughput=300
    -dynamodb.periodic-table.tag=product_area=cortex
 
-   -dynamodb.chunk-table.from=2019-05-02
-   -dynamodb.chunk-table.prefix=cortex_data_
    -dynamodb.chunk-table.write-throughput=800
    -dynamodb.chunk-table.write-throughput.scale.enabled=true
    -dynamodb.chunk-table.write-throughput.scale.min-capacity=200

docs/guides/running.md

Lines changed: 1 addition & 16 deletions
@@ -55,22 +55,7 @@ redundancy or less, depending on your risk appetite.
 
 ### Schema
 
-#### Schema periodic table
-
-The periodic table from argument (`-dynamodb.periodic-table.from=<date>` if
-using command line flags, the `from` field for the first schema entry if using
-YAML) should be set to the date the oldest metrics you will be sending to
-Cortex. Generally that means set it to the date you are first deploying this
-instance. If you use an example date from years ago table-manager will create
-hundreds of tables. You can also avoid creating too many tables by setting a
-reasonable retention in the table-manager
-(`-table-manager.retention-period=<duration>`).
-
-#### Schema version
-
-Choose schema version 9 in most cases; version 10 if you expect
-hundreds of thousands of timeseries under a single name. Anything
-older than v9 is much less efficient.
+See [schema config file docs](../configuration/schema-config-reference.md).
 
 ### Chunk encoding
 

integration/backward_compatibility_test.go

Lines changed: 13 additions & 4 deletions
@@ -31,13 +31,18 @@ func TestBackwardCompatibilityWithChunksStorage(t *testing.T) {
 	consul := e2edb.NewConsul()
 	require.NoError(t, s.StartAndWaitReady(dynamo, consul))
 
+	flagsForOldImage := mergeFlags(ChunksStorageFlags, map[string]string{
+		"-schema-config-file": "",
+		"-config-yaml": ChunksStorageFlags["-schema-config-file"],
+	})
+
 	// Start Cortex components (ingester running on previous version).
 	require.NoError(t, writeFileToSharedDir(s, cortexSchemaConfigFile, []byte(cortexSchemaConfigYaml)))
-	tableManager := e2ecortex.NewTableManager("table-manager", ChunksStorageFlags, previousVersionImage)
+	tableManager := e2ecortex.NewTableManager("table-manager", flagsForOldImage, previousVersionImage)
 	// Old table-manager doesn't expose a readiness probe, so we just check if the / returns 404
 	tableManager.SetReadinessProbe(e2e.NewReadinessProbe(tableManager.HTTPPort(), "/", 404))
-	ingester1 := e2ecortex.NewIngester("ingester-1", consul.NetworkHTTPEndpoint(), ChunksStorageFlags, "")
-	distributor := e2ecortex.NewDistributor("distributor", consul.NetworkHTTPEndpoint(), ChunksStorageFlags, "")
+	ingester1 := e2ecortex.NewIngester("ingester-1", consul.NetworkHTTPEndpoint(), flagsForOldImage, "")
+	distributor := e2ecortex.NewDistributor("distributor", consul.NetworkHTTPEndpoint(), flagsForOldImage, "")
 	// Old ring didn't have /ready probe, use /ring instead.
 	distributor.SetReadinessProbe(e2e.NewReadinessProbe(distributor.HTTPPort(), "/ring", 200))
 	require.NoError(t, s.StartAndWaitReady(distributor, ingester1, tableManager))
@@ -72,7 +77,11 @@ func TestBackwardCompatibilityWithChunksStorage(t *testing.T) {
 
 	// Query the new ingester both with the old and the new querier.
 	for _, image := range []string{previousVersionImage, ""} {
-		querier := e2ecortex.NewQuerier("querier", consul.NetworkHTTPEndpoint(), ChunksStorageFlags, image)
+		flags := ChunksStorageFlags
+		if image == previousVersionImage {
+			flags = flagsForOldImage
+		}
+		querier := e2ecortex.NewQuerier("querier", consul.NetworkHTTPEndpoint(), flags, image)
 		require.NoError(t, s.StartAndWaitReady(querier))
 
 		// Wait until the querier has updated the ring.

integration/configs.go

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ tsdb:
 	ChunksStorageFlags = map[string]string{
 		"-dynamodb.url": fmt.Sprintf("dynamodb://u:p@%s-dynamodb.:8000", networkName),
 		"-dynamodb.poll-interval": "1m",
-		"-config-yaml": filepath.Join(e2e.ContainerSharedDir, cortexSchemaConfigFile),
+		"-schema-config-file": filepath.Join(e2e.ContainerSharedDir, cortexSchemaConfigFile),
 		"-table-manager.retention-period": "168h",
 	}
 

integration/e2e/util.go

Lines changed: 6 additions & 0 deletions
@@ -30,6 +30,12 @@ func MergeFlags(inputs ...map[string]string) map[string]string {
 		}
 	}
 
+	for k, v := range output {
+		if v == "" {
+			delete(output, k)
+		}
+	}
+
 	return output
 }
 
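
The added loop means that an empty value acts as a way to unset a flag when merging. The sketch below is a self-contained approximation of that behavior: only the deletion loop appears in this diff, so the merge loop (later maps override earlier ones) is an assumption, and the path `/shared/schema.yaml` is a made-up value for illustration.

```go
package main

import "fmt"

// mergeFlags is a sketch of the helper after this change: later maps override
// earlier ones, and any key whose final value is "" is dropped from the result.
// The merge loop is assumed; only the deletion loop is shown in the diff.
func mergeFlags(inputs ...map[string]string) map[string]string {
	output := map[string]string{}
	for _, input := range inputs {
		for k, v := range input {
			output[k] = v
		}
	}
	// Deleting entries while ranging over a map is safe in Go.
	for k, v := range output {
		if v == "" {
			delete(output, k)
		}
	}
	return output
}

func main() {
	// Hypothetical flag set for the current image.
	current := map[string]string{"-schema-config-file": "/shared/schema.yaml"}

	// Older images only understand -config-yaml, so unset the new flag
	// and pass the same path through the old one.
	old := mergeFlags(current, map[string]string{
		"-schema-config-file": "",
		"-config-yaml":        current["-schema-config-file"],
	})
	fmt.Println(old) // map[-config-yaml:/shared/schema.yaml]
}
```

This mirrors how the backward compatibility test builds its flags for the previous image without touching the shared flag map.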

k8s/ingester-dep.yaml

Lines changed: 8 additions & 9 deletions
@@ -45,16 +45,8 @@ spec:
         - -ingester.claim-on-rollout=true
         - -consul.hostname=consul.default.svc.cluster.local:8500
         - -s3.url=s3://abc:123@s3.default.svc.cluster.local:4569
-        - -dynamodb.original-table-name=cortex
         - -dynamodb.url=dynamodb://user:pass@dynamodb.default.svc.cluster.local:8000
-        - -dynamodb.periodic-table.prefix=cortex_weekly_
-        - -dynamodb.periodic-table.from=2017-01-06
-        - -dynamodb.daily-buckets-from=2017-01-10
-        - -dynamodb.base64-buckets-from=2017-01-17
-        - -dynamodb.v4-schema-from=2017-02-05
-        - -dynamodb.v5-schema-from=2017-02-22
-        - -dynamodb.v6-schema-from=2017-03-19
-        - -dynamodb.chunk-table.from=2017-04-17
+        - -schema-config-file=/etc/cortex/schema.yaml
         - -memcached.hostname=memcached.default.svc.cluster.local
         - -memcached.timeout=100ms
         - -memcached.service=memcached
@@ -66,3 +58,10 @@ spec:
             port: 80
           initialDelaySeconds: 15
           timeoutSeconds: 1
+        volumeMounts:
+          - name: config-volume
+            mountPath: /etc/cortex
+      volumes:
+        - name: config-volume
+          configMap:
+            name: schema-config

k8s/querier-dep.yaml

Lines changed: 9 additions & 10 deletions
@@ -22,20 +22,19 @@ spec:
         - -server.http-listen-port=80
         - -consul.hostname=consul.default.svc.cluster.local:8500
         - -s3.url=s3://abc:123@s3.default.svc.cluster.local:4569
-        - -querier.frontend-address=query-frontend.default.svc.cluster.local:9095
-        - -dynamodb.original-table-name=cortex
         - -dynamodb.url=dynamodb://user:pass@dynamodb.default.svc.cluster.local:8000
-        - -dynamodb.periodic-table.prefix=cortex_weekly_
-        - -dynamodb.periodic-table.from=2017-01-06
-        - -dynamodb.daily-buckets-from=2017-01-10
-        - -dynamodb.base64-buckets-from=2017-01-17
-        - -dynamodb.v4-schema-from=2017-02-05
-        - -dynamodb.v5-schema-from=2017-02-22
-        - -dynamodb.v6-schema-from=2017-03-19
-        - -dynamodb.chunk-table.from=2017-04-17
+        - -schema-config-file=/etc/cortex/schema.yaml
+        - -querier.frontend-address=query-frontend.default.svc.cluster.local:9095
         - -memcached.hostname=memcached.default.svc.cluster.local
         - -memcached.timeout=100ms
         - -memcached.service=memcached
         - -distributor.replication-factor=1
         ports:
         - containerPort: 80
+        volumeMounts:
+          - name: config-volume
+            mountPath: /etc/cortex
+      volumes:
+        - name: config-volume
+          configMap:
+            name: schema-config

k8s/ruler-dep.yaml

Lines changed: 8 additions & 9 deletions
@@ -25,19 +25,18 @@ spec:
         - -ruler.alertmanager-url=http://alertmanager.default.svc.cluster.local/api/prom/alertmanager/
         - -consul.hostname=consul.default.svc.cluster.local:8500
         - -s3.url=s3://abc:123@default.svc.cluster.local:4569/s3
-        - -dynamodb.original-table-name=cortex
         - -dynamodb.url=dynamodb://user:pass@dynamodb.default.svc.cluster.local:8000
-        - -dynamodb.periodic-table.prefix=cortex_weekly_
-        - -dynamodb.periodic-table.from=2017-01-06
-        - -dynamodb.daily-buckets-from=2017-01-10
-        - -dynamodb.base64-buckets-from=2017-01-17
-        - -dynamodb.v4-schema-from=2017-02-05
-        - -dynamodb.v5-schema-from=2017-02-22
-        - -dynamodb.v6-schema-from=2017-03-19
-        - -dynamodb.chunk-table.from=2017-04-17
+        - -schema-config-file=/etc/cortex/schema.yaml
         - -memcached.hostname=memcached.default.svc.cluster.local
         - -memcached.timeout=100ms
         - -memcached.service=memcached
         - -distributor.replication-factor=1
         ports:
         - containerPort: 80
+        volumeMounts:
+          - name: config-volume
+            mountPath: /etc/cortex
+      volumes:
+        - name: config-volume
+          configMap:
+            name: schema-config
