[RFC] New search streaming API #18725

@rishabhmaurya

Is your feature request related to a problem? Please describe

With the ongoing work on node-to-node streaming transport (#18425, #18722, and #18424), which will soon be available as an experimental feature, the first possibility it opens up is searching in a streaming manner.

Describe the solution you'd like

There are two lines of thought:

  1. Skip scoring
    The idea is to get the first batch of results as soon as possible. Users often don't care about scoring or aggregations, both of which require the coordinator node to wait for results from the shards before it can generate a response for the client. A new streaming search API could therefore avoid any reduce/topN collection and stream results as soon as the first set of hits is available. This could be an extremely fast way to search and get early results.
  • New search API implementation -
    • Skip any reduce/scoring logic on both the coordinator and data nodes
    • Limit the search request params to the ones that make sense for this new API
  2. With scoring
    This may not give us early results, but the coordinator can still stream results to the client. We are looking into ideas for computing topN efficiently when hits are streamed with scores from the data nodes to the coordinator node. Should we stream all hits to the coordinator and let it compute topN? Can we prune non-competitive hits? Can we combine concurrent segment search with streaming at the data nodes?
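
To make the two options concrete, here is a rough Python sketch of the coordinator-side logic. It is only an illustration under assumptions: the `Hit` shape, the function names `stream_unscored` and `stream_top_n`, and the idea of shard responses arriving as an iterable of batches are all hypothetical and do not correspond to any existing OpenSearch API. Option 1 forwards each batch untouched; option 2 keeps a bounded min-heap and drops hits that cannot make it into the top N.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical hit shape for illustration: (doc_id, score).
Hit = Tuple[str, float]


def stream_unscored(shard_batches: Iterable[List[Hit]]) -> Iterator[List[Hit]]:
    """Option 1 (skip scoring): forward every non-empty batch as soon as it
    arrives, with no reduce or top-N collection on the coordinator."""
    for batch in shard_batches:
        if batch:
            yield batch


def stream_top_n(shard_batches: Iterable[List[Hit]], n: int) -> List[Hit]:
    """Option 2 (with scoring): consume streamed batches while keeping only
    the n best hits in a bounded min-heap. A hit whose score does not beat
    the current n-th best score is non-competitive and is pruned."""
    heap: List[Tuple[float, str]] = []  # (score, doc_id), smallest score on top
    for batch in shard_batches:
        for doc_id, score in batch:
            if len(heap) < n:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                # Evict the weakest retained hit in favor of the new one.
                heapq.heapreplace(heap, (score, doc_id))
            # Otherwise the hit is non-competitive and is dropped here.
    # Return hits ordered from highest to lowest score.
    return sorted(((doc_id, score) for score, doc_id in heap),
                  key=lambda hit: hit[1], reverse=True)
```

In this sketch the client sees the first batch from `stream_unscored` immediately, while `stream_top_n` can only answer after the last batch has been consumed. The current heap minimum (`heap[0][0]`) is the kind of threshold that could, in principle, be shared with data nodes so they prune before sending, though whether that is practical is exactly the open question above.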

Please share your thoughts.

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

Labels

Search:Performance, discuss, enhancement
