
[Feature Request] Speed up percentile aggregation by switching implementation #18122

@peteralfonsi


Is your feature request related to a problem? Please describe

The percentiles aggregation can be very slow. We rely on the t-digest library to get approximate percentiles. While poking around in the code I noticed we use their AVLTreeDigest implementation, but the recommended one is now MergingDigest. It looks like OpenSearch's TDigestState was last meaningfully modified in March 2017, but this new implementation was introduced after that in April 2017, which explains why we aren't already using it.

The comments claim this implementation is both faster and uses "much less than half" of the memory of AVLTreeDigest. I couldn't find any actual speed numbers posted online, but I did run some benchmarks with OpenSearch, and the results look good.

Describe the solution you'd like

We should switch to the new implementation. Since both implementations extend the same abstract class, it would be a drop-in change.

I benchmarked this change on http_logs which has 247M docs. I did it for the "@timestamp" field (high cardinality) and the "status" field (low cardinality since it's an HTTP status code). The speedup was especially large for status:

| Field | Baseline latency (ms) | Modified latency (ms) |
|---|---|---|
| timestamp | 13,085 | 6,293 |
| status | 196,794 | 6,212 |
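
To put those latencies in relative terms, here is a small sketch that computes the speedup factor (baseline latency divided by modified latency) for each field. The class and method names are made up for illustration and are not part of OpenSearch or t-digest; the only inputs are the numbers from the benchmark above.

```java
import java.util.Locale;

// Illustrative helper only: names are hypothetical, inputs are the
// latencies reported in the benchmark above.
public class SpeedupTable {

    // Speedup factor = baseline latency / modified latency.
    static double speedup(double baselineMs, double modifiedMs) {
        if (modifiedMs <= 0) {
            throw new IllegalArgumentException("modified latency must be positive");
        }
        return baselineMs / modifiedMs;
    }

    // Formats one row as "<field>: <factor>x faster" with one decimal place.
    static String describe(String field, double baselineMs, double modifiedMs) {
        return String.format(Locale.ROOT, "%s: %.1fx faster",
                field, speedup(baselineMs, modifiedMs));
    }

    public static void main(String[] args) {
        System.out.println(describe("timestamp", 13_085, 6_293));
        System.out.println(describe("status", 196_794, 6_212));
    }
}
```

That works out to roughly 2.1x for timestamp and roughly 31.7x for status, and the modified latency ends up nearly the same for both fields.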

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response
