Reintroduce QueryPhaseListener with performance optimizations #18994

atris · 2025-08-08T18:39:27Z

This commit reintroduces the QueryPhaseListener functionality that was reverted
in #18913 due to performance regressions, now with optimizations that eliminate
the overhead.

Changes:

Add QueryPhaseListener interface for plugins to hook into query phase
Add AbstractQueryPhaseSearcher implementing template pattern for listeners
Modify QueryPhaseSearcher interface to support listener registration
Add comprehensive tests for listener functionality

Performance optimizations:

Fast-path in DefaultQueryPhaseSearcher bypasses template pattern when no listeners
Direct hasQueryPhaseListeners() check avoids virtual method calls
Single listener optimization avoids iterator creation
Zero overhead for the common case (no listeners registered)

The original implementation caused 10-15% regression in neural-search because it
forced all searches through the template pattern even without listeners. This
optimized version maintains the extension functionality while eliminating overhead
for plugins that extend DefaultQueryPhaseSearcher without using listeners.

Fixes #17593
Resolves performance regression from #18913

atris · 2025-08-08T18:39:44Z

@owaiskazi19 @andrross Please review

github-actions · 2025-08-08T18:53:25Z

❌ Gradle check result for ad942ca: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

This commit reintroduces the QueryPhaseListener functionality that was reverted in opensearch-project#18913 due to performance regressions, now with optimizations that eliminate the overhead. Changes: - Add QueryPhaseListener interface for plugins to hook into query phase - Add AbstractQueryPhaseSearcher implementing template pattern for listeners - Modify QueryPhaseSearcher interface to support listener registration - Add comprehensive tests for listener functionality Performance optimizations: - Fast-path in DefaultQueryPhaseSearcher bypasses template pattern when no listeners - Direct hasQueryPhaseListeners() check avoids virtual method calls - Single listener optimization avoids iterator creation - Zero overhead for the common case (no listeners registered) The original implementation caused 10-15% regression in neural-search because it forced all searches through the template pattern even without listeners. This optimized version maintains the extension functionality while eliminating overhead for plugins that extend DefaultQueryPhaseSearcher without using listeners. Fixes opensearch-project#17593 Resolves performance regression from opensearch-project#18913 Signed-off-by: Atri Sharma <atri.jiit@gmail.com>

github-actions · 2025-08-08T20:27:32Z

❌ Gradle check result for 526897c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

atris · 2025-08-09T20:10:48Z

Flaky tests #16576

github-actions · 2025-08-09T21:35:40Z

✅ Gradle check result for 526897c: SUCCESS

codecov · 2025-08-09T21:35:59Z

Codecov Report

❌ Patch coverage is 30.30303% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.83%. Comparing base (9f13e37) to head (526897c).
⚠️ Report is 259 commits behind head on main.

Files with missing lines	Patch %	Lines
...earch/search/query/AbstractQueryPhaseSearcher.java	25.00%	19 Missing and 2 partials ⚠️
...n/java/org/opensearch/search/query/QueryPhase.java	33.33%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #18994      +/-   ##
============================================
- Coverage     72.88%   72.83%   -0.06%     
+ Complexity    69327    69326       -1     
============================================
  Files          5643     5646       +3     
  Lines        318720   318819      +99     
  Branches      46113    46129      +16     
============================================
- Hits         232294   232198      -96     
- Misses        67595    67828     +233     
+ Partials      18831    18793      -38

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

vibrantvarun · 2025-08-11T15:50:52Z

The QueryCollectorContextSpec was introduced earlier to inject QueryCollectorContextSpecs from the plugins PR #18637 . Currently, the DefaultQueryPhaseSearcher is been extended by ConcurrentQueryPhaseSearcher. There is already a PR which is in draft to remove this HybridCollectorManager dependency creation via HybridAggregationProcessor. This will also remove HybridQueryPhaseSearcher need because we can simply reuse defaultQP and ConcurrentQP for our searches. We have seen 13% performance improvement by doing that in POC and in the final draft of the PR we expect more. opensearch-project/neural-search#1424.

If we introduce QueryPhaseListeners, then I think we might loose the integrity of the QueryPhase. Because currently opensearch supports two type of searches: Concurrent and nonconcurrent. But the concurrent search relies for topDocsCollector on DefaultQP. If we introduce before and after QP listeners then user can change results by writing a custom logic in after QPlistener. This can break logic in HybridQuery.

cc: @martin-gaievski @owaiskazi19

atris · 2025-08-11T16:42:24Z

The QueryCollectorContextSpec was introduced earlier to inject QueryCollectorContextSpecs from the plugins PR #18637 . Currently, the DefaultQueryPhaseSearcher is been extended by ConcurrentQueryPhaseSearcher. There is already a PR which is in draft to remove this HybridCollectorManager dependency creation via HybridAggregationProcessor. This will also remove HybridQueryPhaseSearcher need because we can simply reuse defaultQP and ConcurrentQP for our searches. We have seen 13% performance improvement by doing that in POC and in the final draft of the PR we expect more. opensearch-project/neural-search#1424.

If we introduce QueryPhaseListeners, then I think we might loose the integrity of the QueryPhase. Because currently opensearch supports two type of searches: Concurrent and nonconcurrent. But the concurrent search relies for topDocsCollector on DefaultQP. If we introduce before and after QP listeners then user can change results by writing a custom logic in after QPlistener. This can break logic in HybridQuery.

cc: @martin-gaievski @owaiskazi19

Thank you for raising these concerns. Let me address them:

The QueryPhaseListener interface passes SearchContext but doesn't provide methods to modify search results. The afterCollection method is called after results are already collected and processed. While a
malicious plugin could theoretically use reflection or other means to modify state, this is true for any plugin extension point.
The implementation includes fast-path optimization that completely bypasses the listener logic when no listeners are registered. This means the 13% performance improvement won't be affected unless a plugin explicitly registers listeners.
The listener hooks are placed at the outermost level of query execution, not interfering with the internal collector chain or concurrent search logic. They act more as observation points and post facto operations rather than modification points.
The primary use case is for plugins that need to track query execution metrics, implement custom caching strategies, or perform reranking or provide reranking signals - not to modify search results.

Would it help if we:

Add explicit documentation that listeners should not modify search state?
Make the interface name more explicit like PostQueryPhase to clarify intent?
Add safeguards to prevent result modification?

The goal is to provide extension points without compromising the integrity of search execution.

vibrantvarun · 2025-08-11T17:27:46Z

I will evaluate it more. But one Quick QQ: Why do we need this listeners? I mean is there any use case of it currently in implementing any functionality?

martin-gaievski · 2025-08-11T17:45:38Z

server/src/main/java/org/opensearch/search/query/QueryPhaseSearcher.java

+     * This method allows for fast-path optimization to skip listener logic entirely.
+     * @return true if there are listeners, false otherwise
+     */
+    default boolean hasQueryPhaseListeners() {


I'm not sure why we need this method, we can conclude on presence of listeners based on the size of the list of QueryPhaseListener objects. You can change

if (!hasQueryPhaseListeners()) {

to

if (Objects.notNull(queryPhaseListeners()) == false && queryPhaseListeners().isEmpty() == false)

martin-gaievski · 2025-08-11T17:50:33Z

The original implementation caused 10-15% regression in neural-search because it
forced all searches through the template pattern even without listeners. This
optimized version maintains the extension functionality while eliminating overhead
for plugins that extend DefaultQueryPhaseSearcher without using listeners.

Are you sure about this being a main factor, is there data the support this? The workload we run while checking for regressions - it has vast majority of hybrid queries.
It's good optimization overall, but for that scenario I suspect some other factors played key role.

atris · 2025-08-13T14:33:57Z

The original implementation caused 10-15% regression in neural-search because it
forced all searches through the template pattern even without listeners. This
optimized version maintains the extension functionality while eliminating overhead
for plugins that extend DefaultQueryPhaseSearcher without using listeners.

Are you sure about this being a main factor, is there data the support this? The workload we run while checking for regressions - it has vast majority of hybrid queries. It's good optimization overall, but for that scenario I suspect some other factors played key role.

Sorry for the delay.

The template pattern overhead is real but limited in scope. QueryPhaseBeforeAfterTests.java:35-60 shows measurable overhead in tight loops and fast-path eliminates virtual calls when hasQueryPhaseListeners() returns false. Since Neural-search extends DefaultQueryPhaseSearcher without listeners, hence It would benefits from this

However, for hybrid queries specifically, I agree that this could not be the singular source of regression. The objective of this PR is to minimise the reproducible lag and then revisit it once the numbers for neural-search come in with this change.

But yes, this alone won't fix all hybrid query performance issues. Would need profiling on actual hybrid workloads to identify other bottlenecks.

opensearch-trigger-bot · 2025-09-12T15:22:34Z

This PR is stalled because it has been open for 30 days with no activity.

opensearch-trigger-bot · 2025-10-14T15:23:05Z

This PR is stalled because it has been open for 30 days with no activity.

atris requested review from a team, Bukhtawar, CEHENKLE, Rishikesh1159, VachaShah, anasalkouz, andrross, ashking94, cwperks, dbwiddis, gbbafna, jed326, kotwanikunal, mch2, msfroh, owaiskazi19, reta, sachinpkale, saratvemulapalli, shwetathareja and sohami as code owners August 8, 2025 18:39

github-actions bot added enhancement Enhancement or improvement to existing feature or request Search Search query, autocomplete ...etc labels Aug 8, 2025

atris force-pushed the absolutely_new_runtime_extensions branch from ad942ca to bd4c107 Compare August 8, 2025 18:57

atris force-pushed the absolutely_new_runtime_extensions branch from bd4c107 to 526897c Compare August 8, 2025 19:10

opensearch-ci-bot mentioned this pull request Aug 8, 2025

[AUTOCUT] Gradle Check Flaky Test Report for IndexActionIT #16576

Open

atris closed this Aug 9, 2025

atris reopened this Aug 9, 2025

martin-gaievski reviewed Aug 11, 2025

View reviewed changes

martin-gaievski mentioned this pull request Sep 3, 2025

Introducing HybridQueryCollectorContextSpec to improve the hybrid search performance opensearch-project/neural-search#1534

Merged

5 tasks

opensearch-trigger-bot bot added stalled Issues that have stalled and removed stalled Issues that have stalled labels Sep 12, 2025

opensearch-trigger-bot bot added stalled Issues that have stalled and removed stalled Issues that have stalled labels Oct 14, 2025

Reintroduce QueryPhaseListener with performance optimizations #18994

Are you sure you want to change the base?

Reintroduce QueryPhaseListener with performance optimizations #18994

Conversation

atris commented Aug 8, 2025

Uh oh!

atris commented Aug 8, 2025

Uh oh!

github-actions bot commented Aug 8, 2025

Uh oh!

github-actions bot commented Aug 8, 2025

Uh oh!

atris commented Aug 9, 2025

Uh oh!

github-actions bot commented Aug 9, 2025

Uh oh!

codecov bot commented Aug 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

vibrantvarun commented Aug 11, 2025

Uh oh!

atris commented Aug 11, 2025

Uh oh!

vibrantvarun commented Aug 11, 2025

Uh oh!

martin-gaievski Aug 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

martin-gaievski commented Aug 11, 2025

Uh oh!

atris commented Aug 13, 2025

Uh oh!

opensearch-trigger-bot bot commented Sep 12, 2025

Uh oh!

opensearch-trigger-bot bot commented Oct 14, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov bot commented Aug 9, 2025 •

edited

Loading

martin-gaievski Aug 11, 2025 •

edited

Loading