Ideas for aggregation performance improvements #65019

This is a general meta issue to capture the intent of leveraging index structures more often in aggregations. Today, we have some simple optimizations that "short circuit" agg execution by consulting the BKD tree (the `min` and `max` aggs, for example), and [recently](https://github.com/elastic/elasticsearch/pull/63643) some substantial work to rewrite `date_histogram` aggs into ranges/filters.

In both cases, these optimizations can greatly accelerate the "hot path" by looking up data in the index rather than iterating over each document and reading its doc values. We think there are probably a number of similar cases where we can accelerate certain scenarios or arrangements of aggs by reusing data that is already in the index.
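To make the "look it up in the index" idea concrete, here is a minimal sketch of the `max` case. It is a toy model, not Elasticsearch or Lucene code: `Segment`, `build_segment`, `max_by_collecting` and `max_from_index` are hypothetical names, and the fast path assumes every document matches (for example a `match_all` query with no deleted documents).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Toy stand-in for an index segment (hypothetical, not a Lucene class)."""
    values: List[int]  # per-document values, playing the role of doc values
    min_value: int     # summary that the index structure already stores
    max_value: int


def build_segment(values):
    # Assumes a non-empty list of values.
    return Segment(list(values), min(values), max(values))


def max_by_collecting(segments):
    """Slow path: visit every document and read its value."""
    best = None
    for seg in segments:
        for v in seg.values:
            if best is None or v > best:
                best = v
    return best


def max_from_index(segments):
    """Fast path: one lookup per segment.

    Only valid under the assumption that every document matches the query.
    """
    return max(seg.max_value for seg in segments)


segments = [build_segment([3, 9, 4]), build_segment([12, 1]), build_segment([7])]
assert max_by_collecting(segments) == max_from_index(segments)
```

The slow path does work proportional to the number of documents, while the fast path does work proportional to the number of segments.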


Related: 
- https://github.com/elastic/elasticsearch/pull/64662
- https://github.com/elastic/elasticsearch/pull/63643
- Merge the implementation of `filter` into `filters` so it can share in all the performance improvements on `filters`.
- Merge "filter-by-filter" execution with parent aggregations where possible. This would give a large speedup when `filters` sits under an agg that can itself run in filter-by-filter mode. It would also be fairly helpful whenever two "filter-by-filter" compatible aggs are nested in one another.
- `filters` aggregations on `range` queries without children could use the BKD index to count matches instead of enumerating all matches.
- `cardinality` aggregations on `match_all` queries could build the HLL++ object from the terms dictionary instead of collecting all matches.
- `percentiles` aggregations on `match_all` queries could build the HDR histogram from the BKD tree: for leaf nodes whose min and max values fall in the same bucket, we wouldn't need to collect the individual values one by one.
- https://github.com/elastic/elasticsearch/issues/90261
- https://github.com/elastic/elasticsearch/issues/88185
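
The two BKD-based ideas in the list above (counting `range` matches, and filling histogram buckets from tree nodes) can be sketched with a toy one-dimensional tree. This is only an illustration under stated assumptions: `Node`, `build`, `count_in_range` and `bucket_counts` are hypothetical names and not Lucene's API, fixed-width buckets stand in for HDR histogram buckets, and deleted documents are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy 1-D tree node (hypothetical; not Lucene's BKD API)."""
    lo: int     # smallest value under this node
    hi: int     # largest value under this node
    count: int  # number of values under this node
    values: List[int] = field(default_factory=list)  # filled on leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_values, leaf_size=4):
    """Build the tree from a non-empty, sorted list of values."""
    if len(sorted_values) <= leaf_size:
        return Node(sorted_values[0], sorted_values[-1],
                    len(sorted_values), list(sorted_values))
    mid = len(sorted_values) // 2
    left = build(sorted_values[:mid], leaf_size)
    right = build(sorted_values[mid:], leaf_size)
    return Node(left.lo, right.hi, left.count + right.count, [], left, right)


def count_in_range(node, lo, hi):
    """Count values in [lo, hi] (inclusive) without enumerating every match."""
    if node.hi < lo or node.lo > hi:
        return 0           # node is entirely outside the range
    if lo <= node.lo and node.hi <= hi:
        return node.count  # node is entirely inside: use the stored count
    if node.left is None:
        # Leaf that straddles a range boundary: inspect its values.
        return sum(1 for v in node.values if lo <= v <= hi)
    return count_in_range(node.left, lo, hi) + count_in_range(node.right, lo, hi)


def bucket_counts(node, width, counts=None):
    """Fill fixed-width buckets, adding a whole node at once when possible."""
    if counts is None:
        counts = {}
    if node.lo // width == node.hi // width:
        # Min and max land in the same bucket, so every value under it does.
        bucket = node.lo // width
        counts[bucket] = counts.get(bucket, 0) + node.count
    elif node.left is None:
        for v in node.values:
            counts[v // width] = counts.get(v // width, 0) + 1
    else:
        bucket_counts(node.left, width, counts)
        bucket_counts(node.right, width, counts)
    return counts


values = sorted([1, 2, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144])
root = build(values)
assert count_in_range(root, 3, 34) == sum(1 for v in values if 3 <= v <= 34)
```

In both functions, individual values are only read from leaves that straddle a range or bucket boundary. Everything else is answered from the counts stored on the nodes.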
