Can date_histograms better take advantage of data locality? #90261

Open
@jpountz

Description

Date histograms are one of Elasticsearch's most-used aggregations. Given the way we index into data streams, documents with similar @timestamps are quite likely to be clustered together.

Could we take advantage of this to speed up date histograms (or, effectively, ranges, since date histograms often rewrite to ranges) by first checking whether the current doc falls within the same bucket as the previous doc before doing more expensive operations? For instance, when a date histogram rewrites to a range, this could save the binary search on bucket boundaries.
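
To make the idea concrete, here is a minimal sketch of what such a check might look like. It is not Elasticsearch's actual collector code: the class `LastBucketCache`, its fields, and the `binarySearches` counter are hypothetical names used only for illustration, and the sketch assumes bucket boundaries are a sorted array of distinct longs where bucket i covers [bounds[i], bounds[i+1]).

```java
import java.util.Arrays;

// Hypothetical illustration only; not an Elasticsearch class.
final class LastBucketCache {
    // Sorted, distinct boundaries: bucket i covers [bounds[i], bounds[i + 1]).
    private final long[] bounds;
    // Bucket of the most recent in-range doc, or -1 if there is none yet.
    private int lastBucket = -1;
    // Counts how often the slow path runs, to show what the fast path saves.
    int binarySearches = 0;

    LastBucketCache(long[] bounds) {
        this.bounds = bounds;
    }

    // Returns the bucket index for the timestamp, or -1 if it is out of range.
    int bucketFor(long timestamp) {
        // Fast path: two comparisons against the previous doc's bucket.
        if (lastBucket >= 0
            && timestamp >= bounds[lastBucket]
            && timestamp < bounds[lastBucket + 1]) {
            return lastBucket;
        }
        // Slow path: binary search on the bucket boundaries.
        binarySearches++;
        int idx = Arrays.binarySearch(bounds, timestamp);
        // If not found, idx == -(insertionPoint) - 1; the bucket is insertionPoint - 1.
        int bucket = idx >= 0 ? idx : -idx - 2;
        if (bucket < 0 || bucket >= bounds.length - 1) {
            return -1; // before the first boundary or at/after the last one
        }
        lastBucket = bucket;
        return bucket;
    }
}
```

With boundaries {0, 10, 20, 30} and timestamps arriving as 1, 2, 3, 11, 12, 25, this sketch runs the binary search only three times, once per bucket change. Whether the extra comparisons pay off in practice would depend on how clustered the timestamps really are, which is the open question here.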
