Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip SortingDigest when merging a large digest in HybridDigest #97099

Merged
merged 2 commits into from
Jun 26, 2023

Conversation

kkrik-es
Copy link
Contributor

@kkrik-es kkrik-es commented Jun 26, 2023

This is a small performance optimization that avoids creating an intermediate SortingDigest when merging a digest with many samples. The current behavior is to keep adding values to SortingDigest until we cross the threshold for switching to MergingDigest, at which point we copy all values from SortingDigest to MergingDigest and release the former.

As a side cleanup, remove the methods for merging a list of digests. They are not used anywhere and can be tricky to get right - the current implementation for HybridDigest is buggy.

This is a small performance optimization that avoids creating an
intermediate SortingDigest when merging a digest tracking many samples.
The current behavior is to keep adding values to SortingDigest until we
cross the threshold for switching to MergingDigest, at which point we
copy all values from SortingDigest to MergingDigest and release the
former.

As a side cleanup, remove the methods for adding a list of digests. It's
not used anywhere and it can be tricky to get right - the current
implementation for HybridDigest is buggy.
@kkrik-es kkrik-es added >enhancement :Analytics/Aggregations Aggregations Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jun 26, 2023
@kkrik-es kkrik-es self-assigned this Jun 26, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @kkrik-es, I've created a changelog YAML for you.

@kkrik-es kkrik-es marked this pull request as ready for review June 26, 2023 11:09
@kkrik-es kkrik-es requested a review from martijnvg June 26, 2023 11:09
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This feels more like a hybrid tdigest bug fix to me? If so I think we should back port this to 8.9 branch.

@kkrik-es
Copy link
Contributor Author

LGTM

This feels more like a hybrid tdigest bug fix to me? If so I think we should back port this to 8.9 branch.

Agreed.

@kkrik-es kkrik-es added v8.9.0 auto-backport-and-merge v8.9.1 auto-backport Automatically create backport pull requests when merged and removed v8.9.0 auto-backport-and-merge labels Jun 26, 2023
@kkrik-es kkrik-es merged commit d984a14 into elastic:main Jun 26, 2023
kkrik-es added a commit to kkrik-es/elasticsearch that referenced this pull request Jun 26, 2023
…lastic#97099)

* Skip SortingDigest when merging a large digest in HybridDigest.

This is a small performance optimization that avoids creating an
intermediate SortingDigest when merging a digest tracking many samples.
The current behavior is to keep adding values to SortingDigest until we
cross the threshold for switching to MergingDigest, at which point we
copy all values from SortingDigest to MergingDigest and release the
former.

As a side cleanup, remove the methods for adding a list of digests. It's
not used anywhere and it can be tricky to get right - the current
implementation for HybridDigest is buggy.

* Update docs/changelog/97099.yaml
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.9

kkrik-es added a commit that referenced this pull request Jun 26, 2023
…97099) (#97106)

* Skip SortingDigest when merging a large digest in HybridDigest.

This is a small performance optimization that avoids creating an
intermediate SortingDigest when merging a digest tracking many samples.
The current behavior is to keep adding values to SortingDigest until we
cross the threshold for switching to MergingDigest, at which point we
copy all values from SortingDigest to MergingDigest and release the
former.

As a side cleanup, remove the methods for adding a list of digests. It's
not used anywhere and it can be tricky to get right - the current
implementation for HybridDigest is buggy.

* Update docs/changelog/97099.yaml
@kkrik-es kkrik-es deleted the fix/mergingdigest branch June 27, 2023 13:34
@rjernst rjernst added v8.9.0 and removed v8.9.1 labels Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations auto-backport Automatically create backport pull requests when merged >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.9.0 v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants