Add streaming search with configurable scoring modes#19176
Conversation
|
❌ Gradle check result for 5554606: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 5554606: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 5554606: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
9f6e129 to
5061f09
Compare
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
Signed-off-by: Atri Sharma <atrisharma@Atris-Mac-Studio.local>
|
Persistent review updated to latest commit 6f10649 |
|
❌ Gradle check result for 6f10649: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
Persistent review updated to latest commit 6d030e3 |
|
❌ Gradle check result for 6d030e3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
@rishabhmaurya ready for next order of review please |
|
Persistent review updated to latest commit 6d030e3 |
|
❌ Gradle check result for 6d030e3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Summary
Implement streaming search on the coordinator, emitting early partial results from the query phase with optional scoring. The change introduces request flags and a mode selector, integrates streaming into the existing SearchAction
path, and adds a reproducible TTFB benchmark. When streaming is not used, behavior is unchanged.
Motivation
• Reduce time-to-first-byte (TTFB) at the coordinator by not waiting for all shards to complete query phase before starting fetch-eligible work.
• Provide mode-specific controls for batching and scoring, with safe defaults.
• Keep backward compatibility on the transport wire and preserve REST semantics.
Design and Scope
• Request flags and mode
• SearchRequest gains version-gated fields (V_3_3_0): streamingScoring (boolean) and streamingSearchMode (string).
• REST: stream=true enables streaming; optional stream_scoring_mode and streaming_mode select behavior.
• No change to default behavior; streaming is opt‑in.
• Coordinator streaming
• Streaming is integrated into TransportSearchAction (SearchAction). No separate transport action is required.
• SearchPhaseController.newSearchPhaseResults(...) returns either the existing QueryPhaseResultConsumer or a StreamQueryPhaseResultConsumer based on the request mode.
• StreamQueryPhaseResultConsumer controls partial reduce cadence via mode-specific multipliers and emits TopDocs-aware partials to the progress listener.
• Partial reduce notifications
• SearchProgressListener gains a TopDocs-aware hook with a compatibility fallback:
• onPartialReduceWithTopDocs(…) → defaults to onPartialReduce(…).
• notifyPartialReduceWithTopDocs(…) invokes the hook safely.
• Existing listeners are unaffected.
• Query execution
• For streaming queries, the QueryPhase routes to streaming collector contexts based on StreamingSearchMode:
• NO_SCORING: unsorted documents, fastest emission.
• SCORED_UNSORTED: scored documents without sort.
• SCORED_SORTED: scored, sorted via Lucene’s top-N collectors.
• CONFIDENCE_BASED: early emission guided by simple Hoeffding-style bounds.
• Collector batch size is bounded and read via SearchContext.getStreamingBatchSize(); partial batches are emitted to the stream channel when available.
• Transport integration
• Both the classic and stream transport handlers are registered:
• Classic: SearchTransportService.registerRequestHandler(…).
• Stream (if available): StreamSearchTransportService.registerStreamRequestHandler(…).
• The streaming transport path is selected only for streaming requests and used thread pools are chosen accordingly.
Settings and Controls
• Dynamic cluster settings for streaming are added (StreamingSearchSettings, node-scoped, dynamic). Examples:
• search.streaming.batch_size
• Mode-specific reduce multipliers, emission interval, and minimal doc thresholds
• Circuit breaker and limits for buffering in streaming code paths
• Defaults are conservative. The feature remains opt-in via request flags; settings do not change behavior unless the request is streaming.
Wire Compatibility and API
• Transport wire BWC
• New SearchRequest and ShardSearchRequest fields are gated by Version.V_3_3_0 on read/write. Older peers neither write nor read these fields.
• Public API
• No breaking changes to REST endpoints.
• SearchProgressListener adds new methods with safe defaults; existing code continues to compile and run.
Tests and Benchmark
• Unit tests:
• Stream consumer batch sizing and dynamic settings effects.
• Hoeffding bounds behavior.
• Integration tests:
• Basic streaming search workflows.
• Streaming aggregations with and without sub-aggregations.
• Mode coverage (NO_SCORING, SCORED_UNSORTED, SCORED_SORTED, CONFIDENCE_BASED).
• Benchmark:
• StreamingPerformanceBenchmarkTests: measures coordinator-side TTFB (time to first partial reduce) vs. classic full reduce for a large query.
• Logger-only reporting; no REST streaming is introduced.
Non-Goals / Limitations
• This change does not implement HTTP/REST streaming of partial responses.
• The SearchResponse partial/sequence metadata used internally by the streaming listener is not serialized on the wire and does not alter REST payloads.
• Confidence-based mode uses a conservative and simple bound; it is adequate for early gating but not a full ranking stability analysis.
Backward Compatibility and Risk
• Default behavior unchanged unless streaming flags are provided.
• Wire BWC ensured via version gating; JApiCmp passes.
• Aggregation partial reductions are unaffected; for TopDocs partials we call the new TopDocs-aware hook, otherwise we continue to notify via the existing method.
Operational Notes
• Streaming is disabled by default and must be explicitly requested with stream=true (REST) or by setting SearchRequest flags programmatically.
• Mode selection allows tuning for latency vs. coordination cost.
• Dynamic settings enable safe runtime tuning if necessary.
If reviewers prefer, I can split the settings and the confidence-based collector into a follow-up to further reduce the initial surface.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.