-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable shard/segment level search_after short cutting if track_total_hits != false #9683
Conversation
@dblock @backslasht @reta, when you get chance. I dont understand why Anyway, lets make behaviour same as it used to be before, as we dont take any latency hit after lucene fix as mentioned |
Gradle Check (Jenkins) Run Completed with:
|
Compatibility status:Checks if related components are compatible with change dd6967b Incompatible componentsIncompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git] Skipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git] |
Gradle Check (Jenkins) Run Completed with:
|
Compatibility status:Checks if related components are compatible with change c127758 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git] |
Compatibility status:Checks if related components are compatible with change 1f16245 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git] |
Gradle Check (Jenkins) Run Completed with:
|
…_hits != false Signed-off-by: gashutos <gashutos@amazon.com>
Compatibility status:Checks if related components are compatible with change 21a3679 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/reporting.git] |
Gradle Check (Jenkins) Run Completed with:
|
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
Compatibility status:Checks if related components are compatible with change eb0a9d5 Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git] |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could try to optimize for track_total_hits
set to 1000 (default value) as well. But, that is a topic for separate PR. This is good to go.
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-9683-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e98ded62d4b672a03acb4beb02bd047d7d4e1c69
# Push it to GitHub
git push --set-upstream origin backport/backport-9683-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x Then, create a pull request where the |
@gashutos could you please do manual backport to |
…_hits != false (opensearch-project#9683) Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
…_hits != false (opensearch-project#9683) Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
Thanks for reviews ! |
…_hits != false (opensearch-project#9683) Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com> Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
…_hits != false (opensearch-project#9683) Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com> Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
&& minMax != null | ||
&& primarySortField != null | ||
&& primarySortField.missing() == null | ||
&& Objects.equals(trackTotalHitsUpto, SearchContext.TRACK_TOTAL_HITS_DISABLED)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gashutos
Hello, I have a question. In these two scenarios:trackTotalHitsUpto=null and trackTotalHitsUpto < DEFAULT_TRACK_TOTAL_HITS_UP_TO,Is it possible to go through the optimization of shard/segment level shortcutting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trackTotalHitsUpto=null
This should trigger optimization.
trackTotalHitsUpto < DEFAULT_TRACK_TOTAL_HITS_UP_TO
This should not.
However, We have made a fix in Lucene itself for this optimizations.
apache/lucene#12334
With this lucene optimization, even if we have this OpenSearch shortcutting disabled, it wont spend much time in actual lucene logic and it would be up par with shortcutting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gashutos Tks,there may be another problem for query with "search_after": [null, 1000000]
; searchAfterPrimary equal null, minMax.compareMin will throw NPE, this behaviour will not same as before
example:
curl -XGET "127.0.0.1:9200/test/_search?pretty&track_total_hits=false" -H'Content-Type:application/json' -d'
{
"query": {
"match_all": {}
},
"sort":[{"name":{"order":"asc"}}, {"Id":{"order":"desc"}}],
"search_after": [null, 1000000]
}'
{
"error" : {
"root_cause" : [
{
"type" : "null_pointer_exception",
"reason" : null
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "can_match",
"grouped" : true,
"failed_shards" : [
{
"shard" : 112820878,
"index" : "search_qingzhi",
"node" : "127.0.0.1,51651,1711676797164",
"reason" : {
"type" : "null_pointer_exception",
"reason" : null
}
}
],
"caused_by" : {
"type" : "null_pointer_exception",
"reason" : null,
"caused_by" : {
"type" : "null_pointer_exception",
"reason" : null
}
}
},
"status" : 500
}
…_hits != false (opensearch-project#9683) Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
Context
Any
_search
query returnhits.total.value
in response like mentioned in below example. Which specifies how many hits are matching actual query (sort/search_after are not part of query).i.e, check response for below query
Response [partial]
This
hits.total.value
is capped at10000
max as lucene can match at max 10000 documents in results.If you want exact match in
hits.total.value
, we need to addtrack_total_hits=true
in request parameter.i.e.
Response for above query [partial]
If you check, it has exact matching value
247249096
inhits.total.value
.Above calculation results on additional computational costs on
_search
query to track total matching hits. So for performance sensitive queries, it is recommended to providetrack_total_hits=falsle
.i.e
Response from above search query [partial]
Default behaviour for _search queries if
track_total_hits
is not provided as request input, is track matching hits up to 10000 documents as mentioned above.Issue
Now with introduction of shard/segment level shortcutting [#6596] for
search_after
queries, it will skip issuing _search request for those shards/segments. So we will end up NOT tracking those documents hits which are matching query for that skipped shards/segmetns because ofsearch_after
based skipping.This was caught in this issue. #9013.
Related Issues
Resolves #9013
Resolution in this PR
Use
search_after
shortcutting only iftrack_total_hits=false
provided.Additional note
There is no latency impact on _search queries even if we dont use shard/segment level skipping logic, now that we have this optimization available in Lucene itself with apache/lucene#12334.
It will end up issuing network call for search request for that shard but lucene query will return immediately since no matching documents and there wont be any fetch phase.
yeah I agree it will end up issuing additional network calls from coordinator to shard replica hosted data nodes.
I tested with 3 nodes cluster
http_logs
with 5 indices with 5 shards each. Latencies are almost similar. So no side -effect on performance with disabling this shortcutting if track_total_hits is not false.This is P50, units are in ms.
No difference for P90, p99 as well.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.