
Explore option of supporting more flexible search types #12316

Closed as not planned
@clintongormley


Today we have only two search types: query_then_fetch and query_and_fetch. This limits the kinds of search functionality we can support. For instance, if you want to auto-adjust the bucket interval so that your documents fit neatly into 10 buckets, you first need to determine the min and max values in order to calculate the correct interval (e.g. see #9572 and #9531).

This requires two round trips:

  • first determine the min/max values
  • calculate the required interval
  • do a second trip to bucket documents per interval
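
The steps above can be sketched as a small in-memory simulation. This is only an illustration, not Elasticsearch code: each shard is assumed to be a plain list of numeric values, and the function name `two_phase_histogram` and its parameters are hypothetical.

```python
def two_phase_histogram(shards, target_buckets=10):
    """Simulate the two round trips over in-memory 'shards' (lists of numbers)."""
    # Trip 1: every non-empty shard reports its local min and max.
    local_ranges = [(min(s), max(s)) for s in shards if s]
    lo = min(low for low, _ in local_ranges)
    hi = max(high for _, high in local_ranges)

    # Coordinator: derive the interval that splits [lo, hi] into target_buckets.
    # Fall back to 1.0 when all values are identical (hi == lo).
    interval = (hi - lo) / target_buckets or 1.0

    # Trip 2: bucket every document with the agreed interval and merge counts.
    counts = [0] * target_buckets
    for shard in shards:
        for value in shard:
            # Clamp so the global max lands in the last bucket.
            idx = min(int((value - lo) / interval), target_buckets - 1)
            counts[idx] += 1
    return lo, interval, counts


shards = [[0, 5, 10], [50, 99], [100]]
lo, interval, counts = two_phase_histogram(shards)
```

The interval cannot be known before the first trip completes, which is why a single query phase is not enough here.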

Or to improve term count accuracy in a terms agg, you could:

  • retrieve e.g. the top 20 terms from each shard
  • choose the top 10 overall
  • do a second trip (if needed) to get accurate counts for all terms
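
A minimal sketch of this count-refinement idea, again as an in-memory simulation rather than actual Elasticsearch behaviour. Shards are assumed to be lists of term strings; `refined_top_terms`, `size` and `shard_size` are names chosen for illustration.

```python
from collections import Counter


def refined_top_terms(shards, size=10, shard_size=20):
    """Pick candidates from per-shard top lists, then fetch their exact counts."""
    # Trip 1: each shard returns only its local top `shard_size` terms.
    merged = Counter()
    for shard in shards:
        merged.update(dict(Counter(shard).most_common(shard_size)))

    # Coordinator: choose the overall top `size` from the partial counts.
    candidates = [term for term, _ in merged.most_common(size)]

    # Trip 2: ask every shard for the exact count of each candidate.
    exact = Counter()
    for shard in shards:
        local = Counter(shard)
        for term in candidates:
            exact[term] += local[term]
    return exact.most_common(size)


shards = [
    ["a"] * 7 + ["b"] * 5 + ["c"] * 3,
    ["c"] * 6 + ["b"] * 5 + ["a"] * 1,
]
result = refined_top_terms(shards, size=2, shard_size=2)
```

In this toy data the first trip sees "a" only 7 times, and the second trip corrects that to 8. Note that the second trip fixes the counts of the chosen terms but not the choice itself: "c" occurs 9 times overall yet is not selected.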

Or to guarantee that you get the top 10 terms overall:

  • first trip retrieves the top 20 terms per shard
  • calculate the overall top 10
  • take the doc count of the 10th term -> 10th_count
  • second trip retrieves all terms that have at least 10th_count / num_shards
  • third trip calculates accurate counts for all the terms returned by the second trip
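
The three trips can be sketched as follows, under the same assumptions as any in-memory toy: shards are non-empty lists of term strings, and `guaranteed_top_terms`, `size` and `shard_size` are hypothetical names. The threshold works because a term with at least 10th_count occurrences overall must have at least 10th_count / num_shards occurrences on at least one shard.

```python
from collections import Counter


def guaranteed_top_terms(shards, size=10, shard_size=20):
    """Three-trip simulation that cannot miss a true top-`size` term."""
    locals_ = [Counter(shard) for shard in shards]

    # Trip 1: merge each shard's local top `shard_size` terms.
    merged = Counter()
    for local in locals_:
        merged.update(dict(local.most_common(shard_size)))
    top = merged.most_common(size)

    # Count of the size-th term; 0 if fewer than `size` terms were seen,
    # which makes every term a candidate.
    nth_count = top[-1][1] if len(top) == size else 0
    threshold = nth_count / len(shards)

    # Trip 2: every shard returns all terms with at least `threshold` docs.
    candidates = set()
    for local in locals_:
        candidates.update(t for t, c in local.items() if c >= threshold)

    # Trip 3: exact counts for every candidate across all shards.
    exact = Counter()
    for local in locals_:
        for term in candidates:
            exact[term] += local[term]
    return exact.most_common(size)


shards = [
    ["a"] * 7 + ["b"] * 5 + ["c"] * 3,
    ["c"] * 6 + ["b"] * 5 + ["a"] * 1,
]
result = guaranteed_top_terms(shards, size=2, shard_size=2)
```

With this toy data the first trip ranks "a" second with a partial count of 7, giving a threshold of 3.5. The second trip then surfaces "c" from the second shard, and the third trip shows that "c" (9) outranks "a" (8).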

Multiple search phases would also help with clustering algorithms.
