[Intra Segment] Measure impact of segment partitioning on OSB workloads #19694

@expani

Description

Is your feature request related to a problem? Please describe

We have identified several areas that need changes to enable intra-segment concurrent search.

Enabling intra-segment concurrent search can cause some queries, such as PointRangeQuery, to perform worse (see #18854 for more details).

Ensuring those queries continue to perform optimally requires changes in Lucene, which are being tracked at apache/lucene#13745.

Until we can consume those changes from Lucene, we can enable intra-segment concurrency only for queries and aggregations that would neither perform duplicate work nor produce incorrect results.

#18879 is a PR that attempted this but was abandoned after the discussion moved on to the optimal slicing mechanism.

Describe the solution you'd like

  • Change the slicing mechanism in MaxTargetSliceSupplier to create partitions of segments.
  • Introduce cluster and index settings similar to those for Concurrent Segment Search.
  • Run the Big5 benchmark with the changes. Force merging into fewer segments may be needed to see a significant impact. Also, use simple aggregations (such as NumericTerms) that don't perform duplicate per-segment work (such as creating global ordinals).
  • Enable it only for queries and aggregations that don't require additional changes to work with intra-segment concurrent search.
  • Ensure collectors such as TotalHitCount are disabled unless they are changed to work correctly with intra-segment search.
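
To make the first item concrete, here is a minimal sketch of what partitioning segments across slices could look like. It is not the existing MaxTargetSliceSupplier logic and does not use Lucene types: `SegmentPartitionSketch`, `Partition`, and `slice` are hypothetical names, segments are represented only by their `maxDoc` counts, and the balancing strategy (equal doc count per slice, splitting a segment wherever a slice boundary falls) is just one possible choice that the discussion on optimal slicing may well replace.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentPartitionSketch {

    /** Hypothetical doc ID range [minDocId, maxDocId) inside the segment with ordinal segmentOrd. */
    record Partition(int segmentOrd, int minDocId, int maxDocId) {
        int docCount() {
            return maxDocId - minDocId;
        }
    }

    /**
     * Assigns doc ID ranges to at most targetSliceCount slices so that every slice
     * except possibly the last one holds the same number of docs. A segment is split
     * whenever it crosses a slice boundary, so a slice never contains two partitions
     * of the same segment.
     */
    static List<List<Partition>> slice(int[] segmentMaxDocs, int targetSliceCount) {
        if (targetSliceCount <= 0) {
            throw new IllegalArgumentException("targetSliceCount must be positive");
        }
        long totalDocs = 0;
        for (int maxDoc : segmentMaxDocs) {
            totalDocs += maxDoc;
        }
        List<List<Partition>> slices = new ArrayList<>();
        if (totalDocs == 0) {
            return slices;
        }
        // Ceiling division: the doc budget of each slice.
        long docsPerSlice = (totalDocs + targetSliceCount - 1) / targetSliceCount;

        List<Partition> current = new ArrayList<>();
        long remainingInSlice = docsPerSlice;
        for (int ord = 0; ord < segmentMaxDocs.length; ord++) {
            int start = 0;
            while (start < segmentMaxDocs[ord]) {
                // Take as much of this segment as the current slice can still hold.
                int take = (int) Math.min(remainingInSlice, segmentMaxDocs[ord] - start);
                current.add(new Partition(ord, start, start + take));
                start += take;
                remainingInSlice -= take;
                if (remainingInSlice == 0) {
                    // Slice is full: close it and start a new one.
                    slices.add(current);
                    current = new ArrayList<>();
                    remainingInSlice = docsPerSlice;
                }
            }
        }
        if (!current.isEmpty()) {
            slices.add(current);
        }
        return slices;
    }

    public static void main(String[] args) {
        // One large force-merged segment plus a small one, spread over 4 slices.
        List<List<Partition>> slices = slice(new int[] {1_000_000, 200_000}, 4);
        for (int i = 0; i < slices.size(); i++) {
            System.out.println("slice " + i + ": " + slices.get(i));
        }
    }
}
```

With segment-level slicing, the 1,000,000-doc segment in this example would occupy a single slice on its own; with partitioning it is spread over four slices, which is the effect the Big5 runs on force-merged indices are meant to measure.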

Related component

Search:Performance

Additional context

Meta Issue #18852
