Remove dimension limit for time series data streams #93564

### Description

Currently, there are several limits around the number of dimensions:
* Dimension keys have a hard limit of 512 bytes. Documents are rejected if this limit is exceeded.
* Dimension values have a hard limit of 1024 bytes. Documents are rejected if this limit is exceeded.
* The `_tsid` consists of all dimension keys and values and has a hard limit of 32 KB. Documents are rejected if this limit is exceeded.
* To avoid rejecting documents at ingest time due to the hard limit on the `_tsid`, only 16 fields can be marked as dimensions in the mapping by default. The limit can be increased with an [index setting](https://www.elastic.co/guide/en/elasticsearch/reference/8.6/tsds-index-settings.html#index-mapping-dimension-fields-limit), but this can lead to document rejections if the hard limit for the `_tsid` is hit.
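
To make the interaction between these limits concrete, here is a rough sketch of the checks in Python. It is not Elasticsearch's actual validation code: the function name `check_dimensions` is made up for illustration, and it assumes the `_tsid` size is simply the sum of the UTF-8 encoded key and value lengths, which ignores whatever framing the real encoding adds.

```python
# Limits as listed above.
MAX_KEY_BYTES = 512
MAX_VALUE_BYTES = 1024
MAX_TSID_BYTES = 32 * 1024


def check_dimensions(dimensions: dict[str, str]) -> int:
    """Return the approximate _tsid size, or raise ValueError on a limit violation.

    Assumption: the _tsid size is approximated as the sum of all encoded
    key and value lengths. The real encoding is not modelled here.
    """
    tsid_size = 0
    for key, value in dimensions.items():
        key_bytes = key.encode("utf-8")
        value_bytes = str(value).encode("utf-8")
        if len(key_bytes) > MAX_KEY_BYTES:
            raise ValueError(f"dimension key too long: {len(key_bytes)} bytes")
        if len(value_bytes) > MAX_VALUE_BYTES:
            raise ValueError(f"dimension value too long for key {key!r}")
        tsid_size += len(key_bytes) + len(value_bytes)
    if tsid_size > MAX_TSID_BYTES:
        raise ValueError(f"_tsid too large: {tsid_size} bytes")
    return tsid_size
```

Under these assumptions, 40 dimensions with 1000-byte values each pass the per-key and per-value checks but still exceed the `_tsid` limit, which is the failure mode that raising the field limit exposes.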

These limits make it difficult to adopt time series data streams for a few reasons:

* Before onboarding a metric, integration developers need to think carefully about whether a field is a dimension or just metadata (a tag).
 This isn't always trivial, as some metadata is only available under certain conditions (for example, when the application is running on k8s or in the cloud). If we over-index and mark too many fields as dimensions, we risk hitting the limit. If we mark too few fields as dimensions, documents are rejected when we try to index multiple documents with the same timestamp that end up having the same `_tsid`. Marking the right set of fields as dimensions is a fairly labor-intensive and error-prone process.
* It prevents the ingestion of ad-hoc metrics whose schema isn't known up front.
 We'll want to provide users of metric libraries like Micrometer or the OpenTelemetry metrics SDK with an easy way to add new metrics without first having to change the schema in ES. Metric libraries usually don't distinguish between dimensions and metadata. There's typically only a way to set the metric name, attributes (aka labels, tags, dimensions), and a value. So we'll need to map all dynamic labels as dimensions. The dimension limit gets in the way of that.
* Other TSDBs don't have such a limit.
 This will make it harder to move from other TSDBs to Elasticsearch.
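
The "too few dimensions" problem from the first point can be illustrated with a small sketch. It only models the rule that two documents with the same timestamp and the same set of dimension values conflict; the helper `find_conflicts` and the field names `host.name` and `k8s.pod.name` are made up for this example.

```python
def find_conflicts(docs, dimension_fields):
    """Return index pairs of documents that share a timestamp and all dimension values.

    Assumption: the time series identity is just the tuple of dimension
    values, which stands in for the real _tsid.
    """
    seen = {}
    conflicts = []
    for index, doc in enumerate(docs):
        series = tuple((field, doc.get(field)) for field in sorted(dimension_fields))
        key = (series, doc["@timestamp"])
        if key in seen:
            conflicts.append((seen[key], index))
        else:
            seen[key] = index
    return conflicts


docs = [
    {"@timestamp": "2023-02-07T10:00:00Z", "host.name": "node-1", "k8s.pod.name": "pod-a", "cpu": 0.4},
    {"@timestamp": "2023-02-07T10:00:00Z", "host.name": "node-1", "k8s.pod.name": "pod-b", "cpu": 0.7},
]

# Only the host is a dimension: both documents land in the same series.
print(find_conflicts(docs, {"host.name"}))
# The pod name is a dimension too: the documents belong to different series.
print(find_conflicts(docs, {"host.name", "k8s.pod.name"}))
```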

I don't want to go too much into implementation details here, but we had discussions about potentially turning the `_tsid` into a hash, which would allow us to completely remove any limits on the number of dimensions.
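
As a rough illustration of why a hash would remove the size concern, here is a minimal sketch. It is not a proposal for the actual implementation: the function name `hashed_tsid`, the choice of SHA-256, and the length-prefixed encoding are all assumptions made for the example.

```python
import hashlib


def hashed_tsid(dimensions: dict[str, str]) -> bytes:
    """Derive a fixed-size identifier from any number of dimensions.

    Assumptions: SHA-256 as the hash, and each key and value fed in with a
    4-byte length prefix so that adjacent fields cannot run together.
    """
    digest = hashlib.sha256()
    # Sort by key so the result doesn't depend on the order of the fields.
    for key, value in sorted(dimensions.items()):
        for part in (key, str(value)):
            data = part.encode("utf-8")
            digest.update(len(data).to_bytes(4, "big"))
            digest.update(data)
    return digest.digest()
```

The output is always 32 bytes, whether there are 2 dimensions or 2,000, so no cap on the number of dimensions is needed to bound the identifier's size. The trade-off is that the original dimension values can no longer be read back from the identifier itself.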
