Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable shard/segment level search_after short cutting if track_total_hits != false #9683

Merged
merged 2 commits into from
Sep 2, 2023

Conversation

gashutos
Copy link
Contributor

@gashutos gashutos commented Sep 1, 2023

Description

Context

Any _search query return hits.total.value in response like mentioned in below example. Which specifies how many hits are matching actual query (sort/search_after are not part of query).

i.e, check response for below query

GET logs-*/_search
{
        "query": {
          "match_all": {}
        },
        "sort" : [
          {"@timestamp" : "asc"}
        ],
        "search_after": ["1998-06-01"]
}

Response [partial]

{
  "took": 25,
  "timed_out": false,
  "_shards": {
    "total": 35,
    "successful": 35,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": [
      {
        "_ind

This hits.total.value is capped at 10000 max as lucene can match at max 10000 documents in results.

If you want exact match in hits.total.value, we need to add track_total_hits=true in request parameter.
i.e.

GET logs-*/_search?track_total_hits=true
{
        "query": {
          "match_all": {}
        },
        "sort" : [
          {"@timestamp" : "asc"}
        ],
        "search_after": ["1998-06-01"]
}

Response for above query [partial]

{
  "took": 32,
  "timed_out": false,
  "_shards": {
    "total": 35,
    "successful": 35,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 247249096,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [

If you check, it has exact matching value 247249096 in hits.total.value.

Above calculation results on additional computational costs on _search query to track total matching hits. So for performance sensitive queries, it is recommended to provide track_total_hits=falsle.
i.e

GET logs-*/_search?track_total_hits=false
{
        "query": {
          "match_all": {}
        },
        "sort" : [
          {"@timestamp" : "asc"}
        ],
        "search_after": ["1998-06-01"]
}

Response from above search query [partial]

{
  "took": 14,
  "timed_out": false,
  "_shards": {
    "total": 35,
    "successful": 35,
    "skipped": 25,
    "failed": 0
  },
  "hits": {
    "max_score": null,

Default behaviour for _search queries if track_total_hits is not provided as request input, is track matching hits up to 10000 documents as mentioned above.

Issue

Now with introduction of shard/segment level shortcutting [#6596] for search_after queries, it will skip issuing _search request for those shards/segments. So we will end up NOT tracking those documents hits which are matching query for that skipped shards/segmetns because of search_after based skipping.
This was caught in this issue. #9013.

Related Issues

Resolves #9013

Resolution in this PR

Use search_after shortcutting only if track_total_hits=false provided.

Additional note

There is no latency impact on _search queries even if we dont use shard/segment level skipping logic, now that we have this optimization available in Lucene itself with apache/lucene#12334.
It will end up issuing network call for search request for that shard but lucene query will return immediately since no matching documents and there wont be any fetch phase.
yeah I agree it will end up issuing additional network calls from coordinator to shard replica hosted data nodes.

I tested with 3 nodes cluster http_logs with 5 indices with 5 shards each. Latencies are almost similar. So no side -effect on performance with disabling this shortcutting if track_total_hits is not false.
This is P50, units are in ms.
No difference for P90, p99 as well.

Data Node Type. Sort Mode With track_total_hits != false track_total_hits = false
r6g.2xlarge asc with after 41 39
r6g.2xlarge desc with after 103 101

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@gashutos
Copy link
Contributor Author

gashutos commented Sep 1, 2023

@dblock @backslasht @reta, when you get chance.
Hopefuly last bug of search_after.

I dont understand why hits.total.value is not considering search_after. I raised the PR to keep behaviour same, but it does not make sense to return matching hits same in all paginated responses.
Also it is capped @10000.

Anyway, lets make behaviour same as it used to be before, as we dont take any latency hit after lucene fix as mentioned

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Compatibility status:

Checks if related components are compatible with change dd6967b

Incompatible components

Incompatible components: [https://github.com/opensearch-project/cross-cluster-replication.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Compatibility status:

Checks if related components are compatible with change c127758

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Compatibility status:

Checks if related components are compatible with change 1f16245

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

…_hits != false

Signed-off-by: gashutos <gashutos@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Compatibility status:

Checks if related components are compatible with change 21a3679

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/reporting.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Compatibility status:

Checks if related components are compatible with change eb0a9d5

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationRelocationIT.testPrimaryRelocation

Copy link
Contributor

@backslasht backslasht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could try to optimize for track_total_hits set to 1000 (default value) as well. But, that is a topic for separate PR. This is good to go.

@reta reta merged commit e98ded6 into opensearch-project:main Sep 2, 2023
@reta reta added the backport 2.x Backport to 2.x branch label Sep 2, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-9683-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e98ded62d4b672a03acb4beb02bd047d7d4e1c69
# Push it to GitHub
git push --set-upstream origin backport/backport-9683-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-9683-to-2.x.

@reta
Copy link
Collaborator

reta commented Sep 2, 2023

@gashutos could you please do manual backport to 2.x? than you

gashutos added a commit to gashutos/OpenSearch that referenced this pull request Sep 3, 2023
…_hits != false (opensearch-project#9683)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
gashutos added a commit to gashutos/OpenSearch that referenced this pull request Sep 3, 2023
…_hits != false (opensearch-project#9683)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
@gashutos
Copy link
Contributor Author

gashutos commented Sep 3, 2023

Thanks for reviews !
@backslasht I opened feature requst #9717
If we can honor search_after while tracking total hits, it will be usuful for long paginated searches.

sachinpkale pushed a commit that referenced this pull request Sep 5, 2023
…_hits != false (#9683) (#9716)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
…_hits != false (opensearch-project#9683)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
Signed-off-by: Kaushal Kumar <ravi.kaushal97@gmail.com>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…_hits != false (opensearch-project#9683)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
&& minMax != null
&& primarySortField != null
&& primarySortField.missing() == null
&& Objects.equals(trackTotalHitsUpto, SearchContext.TRACK_TOTAL_HITS_DISABLED)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gashutos
Hello, I have a question. In these two scenarios:trackTotalHitsUpto=null and trackTotalHitsUpto < DEFAULT_TRACK_TOTAL_HITS_UP_TO,Is it possible to go through the optimization of shard/segment level shortcutting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trackTotalHitsUpto=null
This should trigger optimization.
trackTotalHitsUpto < DEFAULT_TRACK_TOTAL_HITS_UP_TO
This should not.

However, We have made a fix in Lucene itself for this optimizations.
apache/lucene#12334

With this lucene optimization, even if we have this OpenSearch shortcutting disabled, it wont spend much time in actual lucene logic and it would be up par with shortcutting.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gashutos Tks,there may be another problem for query with "search_after": [null, 1000000]; searchAfterPrimary equal null, minMax.compareMin will throw NPE, this behaviour will not same as before

example:

curl -XGET "127.0.0.1:9200/test/_search?pretty&track_total_hits=false" -H'Content-Type:application/json' -d'
{              
   "query": {                                
     "match_all": {}
   },                                        
   "sort":[{"name":{"order":"asc"}}, {"Id":{"order":"desc"}}],
   "search_after": [null, 1000000]
 }'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "null_pointer_exception",
        "reason" : null
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "can_match",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 112820878,
        "index" : "search_qingzhi",
        "node" : "127.0.0.1,51651,1711676797164",
        "reason" : {
          "type" : "null_pointer_exception",
          "reason" : null
        }
      }
    ],
    "caused_by" : {
      "type" : "null_pointer_exception",
      "reason" : null,
      "caused_by" : {
        "type" : "null_pointer_exception",
        "reason" : null
      }
    }
  },
  "status" : 500
}

shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…_hits != false (opensearch-project#9683)

Signed-off-by: gashutos <gashutos@amazon.com>
Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working Indexing & Search v2.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Total hits value behavior change in the response when using search_after in paginated search requests
4 participants