Skip to content

[7.x Backport] Add Cumulative Cardinality agg (and Data Science plugin) #45990

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 26, 2019

Conversation

polyfractal
Copy link
Contributor

Backport of #43661. The de-Guicing of xpack feature usage made the backport non-negligible, so I'm putting this up for some rounds of CI just to be safe :)

No need for review.

This adds a pipeline aggregation that calculates the cumulative
cardinality of a field.  It does this by iteratively merging in the
HLL sketch from consecutive buckets and emitting the cardinality up
to that point.

This is useful for things like finding the total "new" users that have
visited a website (as opposed to "repeat" visitors).

This is a Basic+ aggregation and adds a new Data Science plugin
to house it and future advanced analytics/data science aggregations.
@polyfractal polyfractal merged commit 943a016 into elastic:7.x Aug 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant