Add sort optimization with after from Lucene #64292

Merged
merged 7 commits into from
Sep 13, 2021

Conversation

mayya-sharipova
Contributor

@mayya-sharipova mayya-sharipova commented Oct 28, 2020

Lucene 8.7 introduces the numeric sort optimization directly in its comparators,
so we no longer need our own implementation in ES.
This PR removes the sort optimization code from ES;
the only thing still needed is to call setCanUsePoints on the sortField.

This will also introduce sort optimization with search_after,
as Lucene directly supports sort optimization with search_after.

As before, we enable sort optimization only for a
Long sort on Long or Date fields.

There could be a regression on desc sort, for example on a
descending sort on the @timestamp field. In this case, we suggest
converting these indices to data stream indices, as
data stream indices have their desc sort on the
@timestamp field optimized.
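To make the search_after case concrete, here is a minimal sketch of the kind of request body this optimization targets: a sort on a single long field that resumes after a given sort value. The field name and after value are taken from the geonames benchmark reported in this PR; the helper name `build_search_after_request`, the `size` default, and the choice to disable `track_total_hits` are assumptions made for illustration, not something this PR prescribes.

```python
import json


def build_search_after_request(sort_field, order, after_value, size=10):
    """Build a search body that sorts on one field and resumes after a sort value.

    Hypothetical helper for illustration only; it is not part of Elasticsearch.
    """
    return {
        "size": size,
        # Assumption: exact hit counts are not needed, so they are not tracked.
        "track_total_hits": False,
        "sort": [{sort_field: {"order": order}}],
        "search_after": [after_value],
    }


# Values mirror the asc_sort_with_after_population benchmark task.
body = build_search_after_request("population", "asc", 1000000)
print(json.dumps(body, indent=2))
```

In real pagination the sort usually also includes a tiebreaker field so that documents with equal sort values are not skipped; it is left out here to keep the sketch focused on the single-field numeric sort.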

@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Oct 28, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Oct 28, 2020
@mayya-sharipova mayya-sharipova changed the title Remove sort optimization to Lucene Add sort optimization with after from Lucene Oct 29, 2020
@mayya-sharipova
Contributor Author

mayya-sharipova commented Oct 30, 2020

Rally benchmarking results for geonames (number of docs: 11M, many segments):

baseline: master branch; contender: this PR

Overall very good results:

sort:

  • slight improvement to the already fast desc_sort_geonameid
  • slight improvement to the already fast asc_sort_geonameid
  • 10x speedups for desc_sort_population
  • 10x speedups for asc_sort_population
    Previously we did not apply the sort optimization to duplicate data, such as the population field in this index, where most docs contain a value of 0. The sort optimization in this PR also applies to duplicate data.

sort with after:
the after value for the population field is 1000000
the after value for the geonameid field is a mid-range value of 5000000

  • 10x speedups for asc_sort_with_after_population
  • 3x speedups for desc_sort_with_after_geonameid
  • 7x speedups for asc_sort_with_after_geonameid
|                                                        Metric |                           Task |    Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------------------------:|------------:|------------:|---------:|-------:|
|                                  50th percentile service time |           desc_sort_population |     52.4295 |     4.67878 | -47.7508 |     ms |
|                                  90th percentile service time |           desc_sort_population |     53.9994 |     4.79027 | -49.2092 |     ms |
|                                  99th percentile service time |           desc_sort_population |     54.9973 |     4.98468 | -50.0127 |     ms |
|                                 100th percentile service time |           desc_sort_population |     55.0597 |      5.2847 |  -49.775 |     ms |

|                                  50th percentile service time |            asc_sort_population |     53.3332 |     3.91246 | -49.4207 |     ms |
|                                  90th percentile service time |            asc_sort_population |     55.0845 |     4.00362 | -51.0809 |     ms |
|                                  99th percentile service time |            asc_sort_population |     74.9326 |     5.11153 |  -69.821 |     ms |
|                                 100th percentile service time |            asc_sort_population |     78.6209 |     9.93631 | -68.6846 |     ms |

|                                  50th percentile service time | asc_sort_with_after_population |     76.1371 |     7.88222 | -68.2549 |     ms |
|                                  90th percentile service time | asc_sort_with_after_population |     80.8901 |     8.05058 | -72.8395 |     ms |
|                                  99th percentile service time | asc_sort_with_after_population |     97.8003 |     8.35941 | -89.4409 |     ms |
|                                 100th percentile service time | asc_sort_with_after_population |     104.533 |     8.39089 | -96.1425 |     ms |

|                                  50th percentile service time |            desc_sort_geonameid |     8.13857 |     5.62399 | -2.51459 |     ms |
|                                  90th percentile service time |            desc_sort_geonameid |     10.3497 |     5.72067 | -4.62899 |     ms |
|                                  99th percentile service time |            desc_sort_geonameid |     13.5153 |     7.49825 | -6.01707 |     ms |
|                                 100th percentile service time |            desc_sort_geonameid |     19.6222 |     7.61355 | -12.0086 |     ms |

|                                  50th percentile service time | desc_sort_with_after_geonameid |     67.1275 |     20.2335 |  -46.894 |     ms |
|                                  90th percentile service time | desc_sort_with_after_geonameid |     68.6301 |     20.8645 | -47.7656 |     ms |
|                                  99th percentile service time | desc_sort_with_after_geonameid |      69.694 |     23.5649 | -46.1291 |     ms |
|                                 100th percentile service time | desc_sort_with_after_geonameid |     69.9783 |      25.656 | -44.3223 |     ms |
                                            
|                                  50th percentile service time |             asc_sort_geonameid |     6.21184 |      4.4267 | -1.78514 |     ms |
|                                  90th percentile service time |             asc_sort_geonameid |     6.35723 |     4.49574 | -1.86149 |     ms |
|                                  99th percentile service time |             asc_sort_geonameid |     6.62406 |     5.30552 | -1.31854 |     ms |
|                                 100th percentile service time |             asc_sort_geonameid |     6.85049 |     10.6745 |  3.82397 |     ms |

|                                  50th percentile service time |  asc_sort_with_after_geonameid |     62.5055 |     8.91216 | -53.5934 |     ms |
|                                  90th percentile service time |  asc_sort_with_after_geonameid |     64.3278 |     9.06383 | -55.2639 |     ms |
|                                  99th percentile service time |  asc_sort_with_after_geonameid |     67.5103 |     9.42481 | -58.0855 |     ms |
|                                 100th percentile service time |  asc_sort_with_after_geonameid |     67.6121 |     10.0498 | -57.5624 |     ms |
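As a quick cross-check of the rounded speedup factors in the summary, the ratio of baseline to contender can be computed from the 50th percentile rows of the table above. This is only a sketch: the numbers are copied from the table, and the names `P50_MS` and `speedup` are introduced here for illustration.

```python
# 50th percentile service times in ms, copied from the geonames table above,
# stored as (baseline, contender).
P50_MS = {
    "desc_sort_population": (52.4295, 4.67878),
    "asc_sort_population": (53.3332, 3.91246),
    "asc_sort_with_after_population": (76.1371, 7.88222),
    "desc_sort_with_after_geonameid": (67.1275, 20.2335),
    "asc_sort_with_after_geonameid": (62.5055, 8.91216),
}


def speedup(baseline_ms, contender_ms):
    """Return how many times faster the contender is than the baseline."""
    return baseline_ms / contender_ms


speedups = {task: speedup(b, c) for task, (b, c) in P50_MS.items()}
for task, factor in speedups.items():
    print(f"{task}: {factor:.1f}x")
```

The summary bullets round these ratios to whole factors, so small differences from the printed values are expected.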

@mayya-sharipova
Contributor Author

mayya-sharipova commented Oct 30, 2020

http_logs results with bulk_indexing_clients:1


Multiple segments:

sort:

  • 10x regression for desc_sort_timestamp with multiple segments. This needs more investigation, but the main reason could be the following: previously we would first sort the leaves and then search with a shared CollectorManager, whose collectors could exchange the current shared min score. Currently, collectors cannot share the current min FieldDoc, so we don't sort the leaves and instead start from the first leaf (the one with the smallest timestamps).
  • slight improvement to the already fast asc_sort_timestamp
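The suspected cause of the desc_sort_timestamp regression (the order in which leaves are visited) can be illustrated with a toy model. The sketch below is not Lucene or Elasticsearch code: `top_k_desc` is a hypothetical function, segments are plain lists of integers standing in for timestamps, and a segment is skipped when its maximum cannot beat the current k-th best value.

```python
import heapq


def top_k_desc(segments, k):
    """Return (top-k values in descending order, number of docs visited)."""
    heap = []  # min-heap holding the current top-k values
    visited = 0
    for seg in segments:
        # Skip the whole segment if none of its values can enter the top k.
        # A real index would read the segment maximum from metadata instead
        # of scanning the values as max() does here.
        if len(heap) == k and max(seg) <= heap[0]:
            continue
        for value in seg:
            visited += 1
            if len(heap) < k:
                heapq.heappush(heap, value)
            elif value > heap[0]:
                heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True), visited


# Ten time-ordered segments of 1000 docs each, oldest first.
segments = [list(range(i * 1000, (i + 1) * 1000)) for i in range(10)]

oldest_first = top_k_desc(segments, k=10)
newest_first = top_k_desc(sorted(segments, key=max, reverse=True), k=10)
print("oldest first visits:", oldest_first[1])
print("newest first visits:", newest_first[1])
```

Both orders return the same top 10, but when the oldest segment comes first every later segment still contains competitive values, so nothing can be skipped. Visiting the newest segment first fills the top 10 immediately and lets the remaining segments be skipped in this model.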

sort with after:
after value is set to "1998-06-10", which is a mid-range value

  • 2x speedups for desc_sort_with_after_timestamp
  • 2x speedups for asc_sort_with_after_timestamp

Single segment:
sort:

  • 1.3x speedups for desc-sort-timestamp-after-force-merge-1-seg
  • slight improvement to the already fast asc-sort-timestamp-after-force-merge-1-seg

sort with after:

  • 2x speedups for desc-sort-with-after-timestamp-after-force-merge-1-seg
  • 1.5x speedups for asc-sort-with-after-timestamp-after-force-merge-1-seg
|                                                        Metric |                                                   Task |    Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|-------------------------------------------------------:|------------:|------------:|---------:|--------:|
|                                  50th percentile service time |                                    desc_sort_timestamp |     79.2801 |     728.161 |   648.88 |      ms |
|                                  90th percentile service time |                                    desc_sort_timestamp |     80.9543 |     736.429 |  655.475 |      ms |
|                                  99th percentile service time |                                    desc_sort_timestamp |     82.3793 |     746.271 |  663.892 |      ms |
|                                 100th percentile service time |                                    desc_sort_timestamp |      82.682 |     750.077 |  667.395 |      ms |

|                                  50th percentile service time |                                     asc_sort_timestamp |     4.03299 |     3.17632 | -0.85667 |      ms |
|                                  90th percentile service time |                                     asc_sort_timestamp |     4.10074 |     3.23543 | -0.86532 |      ms |
|                                  99th percentile service time |                                     asc_sort_timestamp |     4.17642 |     3.41485 | -0.76157 |      ms |
|                                 100th percentile service time |                                     asc_sort_timestamp |     4.20957 |     3.49393 | -0.71565 |      ms |

|                                  50th percentile service time |                         desc_sort_with_after_timestamp |     876.889 |     463.023 | -413.866 |      ms |
|                                  90th percentile service time |                         desc_sort_with_after_timestamp |     889.537 |     486.986 | -402.552 |      ms |
|                                  99th percentile service time |                         desc_sort_with_after_timestamp |     898.211 |     500.279 | -397.932 |      ms |
|                                 100th percentile service time |                         desc_sort_with_after_timestamp |     932.299 |     512.597 | -419.703 |      ms |

|                                  50th percentile service time |                          asc_sort_with_after_timestamp |     895.285 |     479.242 | -416.043 |      ms |
|                                  90th percentile service time |                          asc_sort_with_after_timestamp |     905.272 |     484.133 | -421.139 |      ms |
|                                  99th percentile service time |                          asc_sort_with_after_timestamp |     926.605 |      495.05 | -431.554 |      ms |
|                                 100th percentile service time |                          asc_sort_with_after_timestamp |     959.878 |     497.233 | -462.645 |      ms |

|                                  50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1489.91 |     1146.95 | -342.962 |      ms |
|                                  90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1515.51 |     1188.47 | -327.038 |      ms |
|                                  99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1548.66 |     1413.39 |  -135.27 |      ms |
|                                 100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1560.97 |     1520.52 |  -40.449 |      ms |

|                                  50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.77088 |     2.78592 | -0.98496 |      ms |
|                                  90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.87459 |     2.84127 | -1.03332 |      ms |
|                                  99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.99381 |     3.06374 | -0.93007 |      ms |
|                                 100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |        4.02 |      3.1414 |  -0.8786 |      ms |

|                                  50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     917.005 |     549.546 | -367.459 |      ms |
|                                  90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     957.561 |     563.502 | -394.059 |      ms |
|                                  99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     997.819 |     590.614 | -407.205 |      ms |
|                                 100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     998.564 |     664.574 |  -333.99 |      ms |

|                                  50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     929.596 |     637.683 | -291.913 |      ms |
|                                  90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      962.94 |     655.426 | -307.513 |      ms |
|                                  99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     981.428 |     675.017 | -306.411 |      ms |
|                                 100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     1019.66 |     676.265 | -343.397 |      ms |

@mayya-sharipova
Contributor Author

mayya-sharipova commented Dec 1, 2020

Here are the benchmarks for sorted-reader:

baseline: master branch; contender: sorted-reader branch

geonames:

Overall very good results:

sort:

  • slight improvement to the already fast asc_sort_geonameid
  • slight improvement to the already fast desc_sort_geonameid
  • 18x speedups for asc_sort_population
  • 10x speedups for desc_sort_population
    Previously we did not apply the sort optimization to duplicate data, such as the population field in this index, where most docs contain a value of 0. The sort optimization in this PR also applies to duplicate data.

sort with after:
the after value for the population field is 1000000
the after value for the geonameid field is a mid-range value of 5000000

  • 2.5x speedups for desc_sort_with_after_geonameid
  • 7x speedups for asc_sort_with_after_geonameid
  • 10x speedups for asc_sort_with_after_population
|                                                        Metric |                           Task |    Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------------------------:|------------:|------------:|---------:|-------:|
|                                  50th percentile service time |             asc_sort_geonameid |     6.21184 |     6.35076 |  0.13892 |     ms |
|                                  90th percentile service time |             asc_sort_geonameid |     6.35723 |     6.71775 |  0.36052 |     ms |
|                                  99th percentile service time |             asc_sort_geonameid |     6.62406 |     6.85819 |  0.23413 |     ms |
|                                 100th percentile service time |             asc_sort_geonameid |     6.85049 |     6.94722 |  0.09673 |     ms |

|                                  50th percentile service time |            desc_sort_geonameid |     8.13857 |     7.19367 | -0.94491 |     ms |
|                                  90th percentile service time |            desc_sort_geonameid |     10.3497 |     7.46943 | -2.88024 |     ms |
|                                  99th percentile service time |            desc_sort_geonameid |     13.5153 |     7.68365 | -5.83167 |     ms |
|                                 100th percentile service time |            desc_sort_geonameid |     19.6222 |     7.71474 | -11.9074 |     ms |

|                                  50th percentile service time |            asc_sort_population |     53.3332 |     3.22252 | -50.1106 |     ms |
|                                  90th percentile service time |            asc_sort_population |     55.0845 |     3.30176 | -51.7828 |     ms |
|                                  99th percentile service time |            asc_sort_population |     74.9326 |     3.41648 | -71.5161 |     ms |
|                                 100th percentile service time |            asc_sort_population |     78.6209 |     3.54624 | -75.0747 |     ms |

|                                  50th percentile service time |           desc_sort_population |     52.4295 |     4.69856 |  -47.731 |     ms |
|                                  90th percentile service time |           desc_sort_population |     53.9994 |     4.86827 | -49.1312 |     ms |
|                                  99th percentile service time |           desc_sort_population |     54.9973 |     5.20856 | -49.7888 |     ms |
|                                 100th percentile service time |           desc_sort_population |     55.0597 |      5.7289 | -49.3308 |     ms |
                                                 
|                                  50th percentile service time | desc_sort_with_after_geonameid |     67.1275 |     25.3947 | -41.7328 |     ms |
|                                  90th percentile service time | desc_sort_with_after_geonameid |     68.6301 |     26.9443 | -41.6858 |     ms |
|                                  99th percentile service time | desc_sort_with_after_geonameid |      69.694 |     28.2522 | -41.4417 |     ms |
|                                 100th percentile service time | desc_sort_with_after_geonameid |     69.9783 |     30.6242 | -39.3541 |     ms |

|                                  50th percentile service time |  asc_sort_with_after_geonameid |     62.5055 |     8.86973 | -53.6358 |     ms |
|                                  90th percentile service time |  asc_sort_with_after_geonameid |     64.3278 |     9.52665 | -54.8011 |     ms |
|                                  99th percentile service time |  asc_sort_with_after_geonameid |     67.5103 |     10.8724 | -56.6379 |     ms |
|                                 100th percentile service time |  asc_sort_with_after_geonameid |     67.6121 |     17.2646 | -50.3475 |     ms |

|                                  50th percentile service time | asc_sort_with_after_population |     76.1371 |     6.98233 | -69.1548 |     ms |
|                                  90th percentile service time | asc_sort_with_after_population |     80.8901 |     7.23916 | -73.6509 |     ms |
|                                  99th percentile service time | asc_sort_with_after_population |     97.8003 |     7.34614 | -90.4541 |     ms |
|                                 100th percentile service time | asc_sort_with_after_population |     104.533 |     7.34691 | -97.1865 |     ms |

http_logs:

Multiple segments:

sort:

  • slight degradation of scroll performance (around 6%)
  • 1.4x speedups for desc_sort_timestamp
  • slight improvement to the already fast asc_sort_timestamp

sort with after:
after value is set to "1998-06-10", which is a mid-range value

  • 1.5x speedups for desc_sort_with_after_timestamp
  • 1.7x speedups for asc_sort_with_after_timestamp

Single segment:

sort:

  • 1.4x speedups for desc-sort-timestamp-after-force-merge-1-seg
  • slight improvement to the already fast asc-sort-timestamp-after-force-merge-1-seg

sort with after:

  • 1.4x speedups for desc-sort-with-after-timestamp-after-force-merge-1-seg
  • 1.13x speedups for asc-sort-with-after-timestamp-after-force-merge-1-seg
|                                                        Metric |                                                   Task |    Baseline |   Contender |     Diff |    Unit |
|--------------------------------------------------------------:|-------------------------------------------------------:|------------:|------------:|---------:|--------:|
|                                  50th percentile service time |                                                 scroll |     179.309 |     191.205 |  11.8963 |      ms |
|                                  90th percentile service time |                                                 scroll |     182.411 |     194.576 |   12.165 |      ms |
|                                  99th percentile service time |                                                 scroll |     183.547 |     198.669 |   15.122 |      ms |
|                                 100th percentile service time |                                                 scroll |     183.907 |     221.174 |  37.2673 |      ms |

|                                  50th percentile service time |                                    desc_sort_timestamp |     79.2801 |     57.9189 | -21.3612 |      ms |
|                                  90th percentile service time |                                    desc_sort_timestamp |     80.9543 |     59.9991 | -20.9552 |      ms |
|                                  99th percentile service time |                                    desc_sort_timestamp |     82.3793 |     61.7205 | -20.6588 |      ms |
|                                 100th percentile service time |                                    desc_sort_timestamp |      82.682 |     64.8586 | -17.8234 |      ms |

|                                  50th percentile service time |                                     asc_sort_timestamp |     4.03299 |     3.10877 | -0.92422 |      ms |
|                                  90th percentile service time |                                     asc_sort_timestamp |     4.10074 |     3.27901 | -0.82173 |      ms |
|                                  99th percentile service time |                                     asc_sort_timestamp |     4.17642 |     3.52329 | -0.65313 |      ms |
|                                 100th percentile service time |                                     asc_sort_timestamp |     4.20957 |     3.57917 |  -0.6304 |      ms |

|                                  50th percentile service time |                         desc_sort_with_after_timestamp |     876.889 |     553.521 | -323.367 |      ms |
|                                  90th percentile service time |                         desc_sort_with_after_timestamp |     889.537 |     585.901 | -303.636 |      ms |
|                                  99th percentile service time |                         desc_sort_with_after_timestamp |     898.211 |     624.584 | -273.627 |      ms |
|                                 100th percentile service time |                         desc_sort_with_after_timestamp |     932.299 |     632.524 | -299.775 |      ms |

|                                  50th percentile service time |                          asc_sort_with_after_timestamp |     895.285 |     465.738 | -429.547 |      ms |
|                                  90th percentile service time |                          asc_sort_with_after_timestamp |     905.272 |      485.32 | -419.952 |      ms |
|                                  99th percentile service time |                          asc_sort_with_after_timestamp |     926.605 |     558.897 | -367.707 |      ms |
|                                 100th percentile service time |                          asc_sort_with_after_timestamp |     959.878 |     561.226 | -398.652 |      ms |

|                                  50th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1489.91 |     1031.37 |  -458.54 |      ms |
|                                  90th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1515.51 |     1057.32 | -458.187 |      ms |
|                                  99th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1548.66 |      1114.3 | -434.355 |      ms |
|                                 100th percentile service time |            desc-sort-timestamp-after-force-merge-1-seg |     1560.97 |     1223.74 | -337.225 |      ms |

|                                  50th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.77088 |      2.7473 | -1.02358 |      ms |
|                                  90th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.87459 |     2.82991 | -1.04468 |      ms |
|                                  99th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |     3.99381 |     2.89647 | -1.09734 |      ms |
|                                 100th percentile service time |             asc-sort-timestamp-after-force-merge-1-seg |        4.02 |     2.90383 | -1.11617 |      ms |

|                                  50th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     917.005 |     619.416 | -297.589 |      ms |
|                                  90th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     957.561 |     660.589 | -296.972 |      ms |
|                                  99th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     997.819 |     696.835 | -300.984 |      ms |
|                                 100th percentile service time | desc-sort-with-after-timestamp-after-force-merge-1-seg |     998.564 |     709.186 | -289.378 |      ms |

|                                  50th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     929.596 |       765.3 | -164.296 |      ms |
|                                  90th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |      962.94 |     802.005 | -160.935 |      ms |
|                                  99th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     981.428 |     870.958 |  -110.47 |      ms |
|                                 100th percentile service time |  asc-sort-with-after-timestamp-after-force-merge-1-seg |     1019.66 |     909.475 | -110.187 |      ms |

@jimczi It looks like your approach with a sorted MultiReader worked well, and we have speedups for both desc and asc sort. We can go ahead with the sorted-reader approach then.

@jimczi
Contributor

jimczi commented Apr 23, 2021

We took another approach for this, hence closing.

@jimczi jimczi closed this Apr 23, 2021
@YohanSciubukgian

@jimczi Is there another issue or PR where we can follow improvements related to this topic?

@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
@mayya-sharipova
Contributor Author

mayya-sharipova commented Sep 8, 2021

@jimczi I am reviving this PR by merging the latest master into it. Could you please continue the review when you have time?

Here are the latest benchmarks for the http_logs dataset, set up as a data stream.
For non-data-stream indices, we expect a regression on desc_sort_timestamp with multiple segments.

Here are the results for the 99th percentile service time.
Baseline: master branch
Contender: this PR

Surprisingly, this time I also see a 17% regression on desc_sort_with_after_timestamp. I will investigate what could be the reason for it.

+|                                                   Task |    Baseline |   Contender |     Diff |   Unit |
+|                                    desc_sort_timestamp |     79.6757 |     66.9085 | -12.7672 |     ms |
+|                                     asc_sort_timestamp |     4.51976 |     3.58597 | -0.93378 |     ms |
-|                         desc_sort_with_after_timestamp |     748.949 |     878.057 |  129.108 |     ms |
+|                          asc_sort_with_after_timestamp |     754.128 |     459.698 |  -294.43 |     ms |
+|            desc-sort-timestamp-after-force-merge-1-seg |     1485.43 |     985.803 | -499.632 |     ms |
+|             asc-sort-timestamp-after-force-merge-1-seg |     4.17577 |     3.23124 | -0.94453 |     ms |
+| desc-sort-with-after-timestamp-after-force-merge-1-seg |     866.316 |     519.599 | -346.717 |     ms |
+|  asc-sort-with-after-timestamp-after-force-merge-1-seg |     790.108 |     604.906 | -185.202 |     ms |
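The 17% figure can be reproduced from the table above as a relative change of the contender against the baseline. This is a small sketch using three of the rows; `percent_change` and `P99_MS` are names introduced here for illustration.

```python
def percent_change(baseline_ms, contender_ms):
    """Relative change of the contender versus the baseline; positive means slower."""
    return (contender_ms - baseline_ms) / baseline_ms * 100.0


# 99th percentile service times in ms, copied from the table above,
# stored as (baseline, contender).
P99_MS = {
    "desc_sort_timestamp": (79.6757, 66.9085),
    "desc_sort_with_after_timestamp": (748.949, 878.057),
    "asc_sort_with_after_timestamp": (754.128, 459.698),
}

changes = {task: percent_change(b, c) for task, (b, c) in P99_MS.items()}
regressions = [task for task, pct in changes.items() if pct > 0]
for task, pct in changes.items():
    print(f"{task}: {pct:+.1f}%")
```

Only the row marked with `-` in the table shows a positive change, which matches the regression called out in the comment.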

Contributor

@jimczi jimczi left a comment

LGTM !

@mayya-sharipova mayya-sharipova merged commit 1b56e8b into elastic:master Sep 13, 2021
@mayya-sharipova mayya-sharipova deleted the sort-optim-after branch September 13, 2021 10:56
mayya-sharipova added a commit to mayya-sharipova/elasticsearch that referenced this pull request Sep 13, 2021
mayya-sharipova added a commit that referenced this pull request Sep 13, 2021
Backport for #64292
Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v7.16.0 v8.0.0-alpha1
7 participants