Current partition key schema requires that queries must be run against all nodes #174

Open
@mattbostock

The current partition key schema, combined with the lack of any centralised index, requires queries to be run against all nodes in the cluster.

The current schema can be represented as:

<salt>:<bucket_end_time_as_YYYYMMDD>:<metric_name>:[<label_name>,<label_name>...]

Since the label names are often not known at query time, and PromQL allows querying without the metric name, the partition key for a matching series often cannot be computed from the query alone. All nodes must therefore be queried to ensure that all matching time series are retrieved.
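To make the problem concrete, here is a minimal Python sketch of the fan-out. It is not the project's code: the node list, the hash-based placement in `node_for`, the sorting of label names, and the helper names are all assumptions made for illustration. Only the overall shape of the key comes from the schema above.

```python
import hashlib
from typing import List, Optional, Sequence

# Hypothetical four-node cluster; the names are placeholders.
NODES = ["node-0", "node-1", "node-2", "node-3"]


def partition_key(salt: str, bucket_end: str, metric_name: str,
                  label_names: Sequence[str]) -> str:
    """Build a key shaped like <salt>:<YYYYMMDD>:<metric_name>:<label_names>."""
    # Sorting the label names is an assumption, made so the key is deterministic.
    return f"{salt}:{bucket_end}:{metric_name}:{','.join(sorted(label_names))}"


def node_for(key: str) -> str:
    """Map a key to a node by hashing it (an assumed placement scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def nodes_to_query(salt: str, bucket_end: str,
                   metric_name: Optional[str],
                   label_names: Optional[Sequence[str]]) -> List[str]:
    """Return the nodes a query has to touch.

    The key can only be computed when the metric name and the full set of
    label names are known; otherwise every node has to be asked.
    """
    if metric_name is None or label_names is None:
        return list(NODES)
    return [node_for(partition_key(salt, bucket_end, metric_name, label_names))]
```

A selector such as `{job="api"}` supplies no metric name and only one of the series' label names, so `nodes_to_query` falls through to the full node list.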

The partition key schema should be improved to limit the number of nodes that must be queried. There is an inherent tension between limiting the number of nodes that need to be queried and balancing ingestion across as many nodes in the cluster as possible.
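The tension can be illustrated with a small Python sketch. Everything in it is hypothetical: the cluster size, the hash-based placement, the made-up series, and the coarse key (metric name only), which is shown purely as a contrast and not as a proposed schema. The salt and time bucket are left out for brevity.

```python
import hashlib

NODE_COUNT = 8  # hypothetical cluster size


def node_index(key: str) -> int:
    """Map a key to a node index by hashing it (an assumed placement scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODE_COUNT


# 200 made-up series of one metric, each with a different set of label names.
series = [
    ("http_requests_total", ["job", "instance", f"extra_{i}"])
    for i in range(200)
]

# Fine-grained key: metric name plus label names, as in the current schema.
fine = {node_index(f"{name}:{','.join(sorted(labels))}") for name, labels in series}

# Coarse key: metric name only (illustrative contrast, not a proposal).
coarse = {node_index(name) for name, _ in series}

print(f"fine-grained key: {len(fine)} node(s); coarse key: {len(coarse)} node(s)")
```

With the coarse key, a query that names the metric only needs one node, but every sample for that metric is also written to that one node. The fine-grained key spreads writes across the cluster at the cost of queries not knowing where to look.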
