
Vectorize bitset to array #14910


Open: wants to merge 12 commits into main

Contributor

@gf2121 gf2121 commented Jul 7, 2025

This is a minimal proof of concept for vectorizing the conversion of a bitset into an array, which can be a hot path when postings are encoded as a bitset. This version currently only runs on AVX512, but it can be adapted to other instruction sets in the future.

Benchmark                             (bitSetSize)   Mode  Cnt      Score      Error   Units
BitsetToArrayBenchmark.baseline                128  thrpt    5   5477.202 ±   36.920  ops/ms
BitsetToArrayBenchmark.baseline                256  thrpt    5   6197.595 ±   92.064  ops/ms
BitsetToArrayBenchmark.baseline                512  thrpt    5   7121.446 ±  113.840  ops/ms
BitsetToArrayBenchmark.baseline                768  thrpt    5   7361.335 ±  286.118  ops/ms
BitsetToArrayBenchmark.vectorized512           128  thrpt    5  85321.831 ± 1539.445  ops/ms
BitsetToArrayBenchmark.vectorized512           256  thrpt    5  58632.773 ± 1130.691  ops/ms
BitsetToArrayBenchmark.vectorized512           512  thrpt    5  48780.092 ±  958.403  ops/ms
BitsetToArrayBenchmark.vectorized512           768  thrpt    5  29373.799 ±  392.238  ops/ms
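
For context, here is a minimal scalar sketch of the operation being discussed: turning the set bits of a `long[]` bitset into an array of integers. It is not taken from the PR and is not necessarily what the `baseline` benchmark measures; the class name `BitsetWordScan`, the method names, and the `base` parameter (the value that bit 0 maps to) are assumptions made for illustration.

```java
final class BitsetWordScan {

  /** Appends base + i to out for every set bit i of word, starting at offset; returns the new offset. */
  static int wordToArray(long word, int base, int[] out, int offset) {
    while (word != 0) {
      int ntz = Long.numberOfTrailingZeros(word); // index of the lowest set bit
      out[offset++] = base + ntz;
      word &= word - 1; // clear the lowest set bit
    }
    return offset;
  }

  /** Converts a whole bitset, one 64-bit word at a time; returns the number of values written. */
  static int bitsetToArray(long[] bits, int base, int[] out) {
    int size = 0;
    for (int i = 0; i < bits.length; i++) {
      size = wordToArray(bits[i], base + (i << 6), out, size);
    }
    return size;
  }

  public static void main(String[] args) {
    long[] bits = {0b1001L, 1L << 63};
    int[] out = new int[128];
    int size = bitsetToArray(bits, 1000, out);
    // prints [1000, 1003, 1127]
    System.out.println(java.util.Arrays.toString(java.util.Arrays.copyOf(out, size)));
  }
}
```

The loop runs once per set bit, so its cost grows with density, which is the regime where a bitset encoding of postings tends to be used.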


github-actions bot commented Jul 7, 2025

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

@uschindler
Contributor

This cannot be merged without adding this to the java24 part and removing the requires of the incubator module for JMH.

I assume this is only meant for quick checks and stays draft?

@gf2121
Contributor Author

gf2121 commented Jul 7, 2025

Thanks for reminding!

> I assume this is only meant for quick checks and stays draft?

Yes. Once the code is integrated into VectorUtil, the benchmark will call VectorUtil directly, which removes the requirement for the incubator module, just like the other benchmarks.

@gf2121
Contributor Author

gf2121 commented Jul 9, 2025

I managed to get some luceneutil data on AVX512

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       76.86      (3.2%)       76.20      (5.5%)   -0.9% (  -9% -    8%) 0.552
                         Prefix3       81.74      (3.1%)       81.13      (4.9%)   -0.7% (  -8% -    7%) 0.567
                AndMedOrHighHigh       22.83      (1.8%)       22.78      (2.0%)   -0.3% (  -3% -    3%) 0.670
                   TermMonthSort     1184.75      (2.7%)     1182.73      (8.4%)   -0.2% ( -10% -   11%) 0.931
              CombinedOrHighHigh        7.95      (2.8%)        7.94      (2.7%)   -0.2% (  -5% -    5%) 0.844
             And2Terms2StopWords       66.71      (2.5%)       66.61      (4.6%)   -0.1% (  -7% -    7%) 0.903
                          Fuzzy1       33.29      (2.5%)       33.26      (3.4%)   -0.1% (  -5% -    5%) 0.929
             FilteredOrStopWords       11.72      (2.0%)       11.71      (3.5%)   -0.1% (  -5% -    5%) 0.929
                    FilteredTerm       68.60      (1.6%)       68.56      (4.1%)   -0.0% (  -5% -    5%) 0.962
             CountFilteredIntNRQ       24.33      (1.1%)       24.33      (2.4%)   -0.0% (  -3% -    3%) 0.991
                          IntNRQ       55.49      (1.2%)       55.51      (3.1%)    0.0% (  -4% -    4%) 0.959
      FilteredOr2Terms2StopWords       59.21      (2.1%)       59.27      (4.5%)    0.1% (  -6% -    6%) 0.929
                 CountOrHighHigh       63.85      (1.0%)       63.93      (2.4%)    0.1% (  -3% -    3%) 0.827
             CountFilteredPhrase       12.29      (1.6%)       12.31      (1.9%)    0.1% (  -3% -    3%) 0.800
                      AndHighMed       72.10      (2.0%)       72.23      (4.2%)    0.2% (  -5% -    6%) 0.860
                          Phrase        9.76      (2.2%)        9.78      (3.3%)    0.2% (  -5% -    5%) 0.823
                  CountOrHighMed       93.67      (1.3%)       93.87      (2.7%)    0.2% (  -3% -    4%) 0.746
                 CountAndHighMed       90.17      (1.0%)       90.37      (2.2%)    0.2% (  -2% -    3%) 0.690
                      DismaxTerm      331.70      (3.2%)      332.55      (6.0%)    0.3% (  -8% -    9%) 0.867
               FilteredAnd3Terms      105.66      (2.0%)      105.96      (2.9%)    0.3% (  -4% -    5%) 0.717
          CountFilteredOrHighMed       32.69      (1.4%)       32.78      (2.0%)    0.3% (  -3% -    3%) 0.588
                          Fuzzy2       30.47      (2.4%)       30.57      (3.4%)    0.3% (  -5% -    6%) 0.724
              Or2Terms2StopWords       68.51      (2.3%)       68.74      (4.7%)    0.3% (  -6% -    7%) 0.765
             CombinedAndHighHigh        8.07      (2.1%)        8.10      (2.2%)    0.4% (  -3% -    4%) 0.596
                   TermTitleSort       59.34      (2.8%)       59.55      (4.0%)    0.4% (  -6% -    7%) 0.739
                        Wildcard       48.97      (3.6%)       49.16      (4.6%)    0.4% (  -7% -    8%) 0.766
                CountAndHighHigh       63.48      (1.3%)       63.74      (2.2%)    0.4% (  -3% -    3%) 0.472
                      TermDTSort      191.53      (2.0%)      192.32      (5.5%)    0.4% (  -7% -    8%) 0.752
             FilteredAndHighHigh       16.87      (1.3%)       16.94      (2.3%)    0.4% (  -3% -    4%) 0.457
            FilteredAndStopWords       13.70      (1.6%)       13.76      (2.2%)    0.4% (  -3% -    4%) 0.468
         CountFilteredOrHighHigh       27.44      (0.9%)       27.56      (1.7%)    0.4% (  -2% -    3%) 0.305
                         Respell       27.54      (2.1%)       27.66      (2.4%)    0.4% (  -3% -    4%) 0.534
                    CombinedTerm       16.58      (2.9%)       16.66      (3.0%)    0.5% (  -5% -    6%) 0.621
                       OrHighMed       87.71      (2.3%)       88.12      (4.9%)    0.5% (  -6% -    7%) 0.702
                            Term      421.54      (3.5%)      423.51      (5.9%)    0.5% (  -8% -   10%) 0.761
                  FilteredIntNRQ       54.79      (1.9%)       55.09      (2.7%)    0.5% (  -4% -    5%) 0.468
              FilteredOrHighHigh       18.14      (1.8%)       18.24      (3.5%)    0.5% (  -4% -    5%) 0.539
                 DismaxOrHighMed       57.31      (1.9%)       57.65      (5.4%)    0.6% (  -6% -    8%) 0.647
     FilteredAnd2Terms2StopWords       69.42      (1.9%)       69.84      (3.1%)    0.6% (  -4% -    5%) 0.450
               TermDayOfYearSort      317.00      (2.3%)      319.07      (3.9%)    0.7% (  -5% -    6%) 0.515
              FilteredAndHighMed       46.77      (1.5%)       47.11      (2.6%)    0.7% (  -3% -    4%) 0.270
                      OrHighRare      116.92      (4.6%)      117.89      (5.9%)    0.8% (  -9% -   11%) 0.620
                 AndHighOrMedMed       21.55      (2.1%)       21.74      (1.9%)    0.9% (  -3% -    5%) 0.172
                FilteredOr3Terms       52.71      (1.9%)       53.18      (4.2%)    0.9% (  -5% -    7%) 0.386
                  FilteredPhrase       12.77      (1.8%)       12.89      (3.2%)    0.9% (  -4% -    6%) 0.262
               FilteredOrHighMed       50.52      (2.4%)       50.99      (4.5%)    0.9% (  -5% -    8%) 0.416
               CombinedOrHighMed       28.48      (2.3%)       28.76      (4.4%)    1.0% (  -5% -    7%) 0.392
                DismaxOrHighHigh       39.77      (1.9%)       40.16      (3.4%)    1.0% (  -4% -    6%) 0.256
                       And3Terms       84.76      (2.1%)       85.63      (3.6%)    1.0% (  -4% -    6%) 0.272
                        Or3Terms       76.27      (1.3%)       77.08      (3.7%)    1.1% (  -3% -    6%) 0.226
              CombinedAndHighMed       29.04      (2.3%)       29.39      (4.1%)    1.2% (  -5% -    7%) 0.252
                        PKLookup       75.51      (1.3%)       76.44      (3.2%)    1.2% (  -3% -    5%) 0.112
                       CountTerm     2847.75      (5.6%)     2897.40      (8.8%)    1.7% ( -11% -   17%) 0.454
                      OrHighHigh       29.50      (2.0%)       30.17      (2.9%)    2.3% (  -2% -    7%) 0.004
                          IntSet      150.21      (4.2%)      154.28      (4.6%)    2.7% (  -5% -   11%) 0.051
                     AndHighHigh       30.04      (1.9%)       31.63      (3.1%)    5.3% (   0% -   10%) 0.000
                    AndStopWords       10.60      (1.9%)       11.83      (2.0%)   11.6% (   7% -   15%) 0.000
                     OrStopWords       11.41      (2.8%)       13.26      (3.0%)   16.2% (  10% -   22%) 0.000

@gf2121
Contributor Author

gf2121 commented Jul 10, 2025

Some more data:

Mac M2

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                       CountTerm    12276.27     (12.1%)    11998.30      (7.3%)   -2.3% ( -19% -   19%) 0.563
                   TermMonthSort     4162.59      (8.8%)     4111.81      (3.4%)   -1.2% ( -12% -   12%) 0.641
                CountAndHighHigh       84.34      (2.6%)       83.54      (2.5%)   -0.9% (  -5% -    4%) 0.342
          CountFilteredOrHighMed       48.75      (4.9%)       48.31      (3.8%)   -0.9% (  -9% -    8%) 0.591
         CountFilteredOrHighHigh       39.65      (4.2%)       39.30      (3.2%)   -0.9% (  -7% -    6%) 0.543

                                                    ...

                      OrHighHigh       48.67     (12.7%)       52.67      (2.7%)    8.2% (  -6% -   27%) 0.023
                    AndStopWords       16.25      (9.7%)       17.63      (4.3%)    8.5% (  -5% -   24%) 0.004
                     AndHighHigh       50.29     (13.5%)       55.32      (2.5%)   10.0% (  -5% -   30%) 0.009
                     OrStopWords       18.18     (10.6%)       20.61      (3.1%)   13.4% (   0% -   30%) 0.000

AVX512 (mentioned above)

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       76.86      (3.2%)       76.20      (5.5%)   -0.9% (  -9% -    8%) 0.552
                         Prefix3       81.74      (3.1%)       81.13      (4.9%)   -0.7% (  -8% -    7%) 0.567
                AndMedOrHighHigh       22.83      (1.8%)       22.78      (2.0%)   -0.3% (  -3% -    3%) 0.670
                   TermMonthSort     1184.75      (2.7%)     1182.73      (8.4%)   -0.2% ( -10% -   11%) 0.931
              CombinedOrHighHigh        7.95      (2.8%)        7.94      (2.7%)   -0.2% (  -5% -    5%) 0.844
             And2Terms2StopWords       66.71      (2.5%)       66.61      (4.6%)   -0.1% (  -7% -    7%) 0.903

                                                    ...
 
                       CountTerm     2847.75      (5.6%)     2897.40      (8.8%)    1.7% ( -11% -   17%) 0.454
                      OrHighHigh       29.50      (2.0%)       30.17      (2.9%)    2.3% (  -2% -    7%) 0.004
                          IntSet      150.21      (4.2%)      154.28      (4.6%)    2.7% (  -5% -   11%) 0.051
                     AndHighHigh       30.04      (1.9%)       31.63      (3.1%)    5.3% (   0% -   10%) 0.000
                    AndStopWords       10.60      (1.9%)       11.83      (2.0%)   11.6% (   7% -   15%) 0.000
                     OrStopWords       11.41      (2.8%)       13.26      (3.0%)   16.2% (  10% -   22%) 0.000

Same AVX512 machine, without --add-modules=jdk.incubator.vector

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       74.47      (3.7%)       73.32      (4.0%)   -1.5% (  -8% -    6%) 0.210
                         Prefix3       79.35      (3.9%)       78.23      (3.5%)   -1.4% (  -8% -    6%) 0.232
                       CountTerm     2921.10      (6.2%)     2897.44      (7.5%)   -0.8% ( -13% -   13%) 0.708
             And2Terms2StopWords       62.09      (1.7%)       61.80      (2.6%)   -0.5% (  -4% -    3%) 0.482

                                                    ...
 
                      OrHighHigh       27.33      (2.4%)       27.66      (2.0%)    1.2% (  -3% -    5%) 0.092
                      OrHighRare      116.98      (3.3%)      118.89      (2.7%)    1.6% (  -4% -    7%) 0.088
                     AndHighHigh       27.52      (2.0%)       28.02      (1.5%)    1.8% (  -1% -    5%) 0.001
                    AndStopWords       10.60      (3.0%)       11.01      (1.7%)    3.8% (   0% -    8%) 0.000
                     OrStopWords       11.26      (3.6%)       11.97      (2.2%)    6.3% (   0% -   12%) 0.000

@gf2121 gf2121 marked this pull request as ready for review July 10, 2025 16:36

@jpountz
Contributor

jpountz commented Jul 10, 2025

This is very cool and the speedup makes sense to me. When dynamic pruning is enabled, only queries whose leading clauses are dense benefit significantly from this speedup (OrStopWords and AndStopWords). But if you benchmarked exhaustive evaluation, I'm sure we'd see a bigger speedup on all disjunctive queries that have at least one dense postings list.

Like for #14896, I'd like to split this PR in two: one where we merge your scalar improvements, and then this one where we add support for vectorization. By the way, we may want to look into other approaches for the scalar case. Since we only use bit sets in postings when many bits would be set, a linear scan should perform quite efficiently? (foreach (bit in 0..n) { if bitSet.get(bit) out.append(bit); }) I imagine that you used a micro benchmark to come up with your manual unrolling. Let's include this micro benchmark in the PR?
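
The linear scan suggested above could look roughly like this in Java. This is only a sketch: it works on a raw `long[]` rather than on Lucene's bit set classes, and the names `LinearScanSketch`, `linearScan`, `numBits` and `base` are made up for illustration.

```java
final class LinearScanSketch {

  /**
   * Tests every bit in [0, numBits) and appends base + bit for each one that is set.
   * Returns the number of values written to out.
   */
  static int linearScan(long[] bits, int numBits, int base, int[] out) {
    int size = 0;
    for (int bit = 0; bit < numBits; bit++) {
      long word = bits[bit >> 6];            // 64 bits per long
      if ((word & (1L << (bit & 63))) != 0) { // is this bit set?
        out[size++] = base + bit;
      }
    }
    return size;
  }

  public static void main(String[] args) {
    long[] bits = {0b1001L, 1L << 63};
    int[] out = new int[128];
    int size = linearScan(bits, 128, 1000, out);
    // prints [1000, 1003, 1127]
    System.out.println(java.util.Arrays.toString(java.util.Arrays.copyOf(out, size)));
  }
}
```

The work done here is proportional to `numBits` rather than to the number of set bits, so it only pays off when the bitset is dense, and the `if` still introduces a data-dependent branch.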


@gf2121
Contributor Author

gf2121 commented Jul 13, 2025

JMH results with the vectorized implementations:

Benchmark                                                (bitCount)   Mode  Cnt   Score   Error   Units
BitsetToArrayBenchmark.dense                                      5  thrpt    5   9.583 ± 0.238  ops/us
BitsetToArrayBenchmark.dense                                     10  thrpt    5   6.926 ± 0.151  ops/us
BitsetToArrayBenchmark.dense                                     20  thrpt    5   4.597 ± 0.042  ops/us
BitsetToArrayBenchmark.dense                                     30  thrpt    5   3.420 ± 0.033  ops/us
BitsetToArrayBenchmark.dense                                     40  thrpt    5   3.766 ± 0.013  ops/us
BitsetToArrayBenchmark.dense                                     50  thrpt    5   5.299 ± 0.126  ops/us
BitsetToArrayBenchmark.dense                                     60  thrpt    5   8.991 ± 0.223  ops/us
BitsetToArrayBenchmark.denseBranchLess                            5  thrpt    5  13.520 ± 0.132  ops/us
BitsetToArrayBenchmark.denseBranchLess                           10  thrpt    5  13.440 ± 0.575  ops/us
BitsetToArrayBenchmark.denseBranchLess                           20  thrpt    5  13.521 ± 0.289  ops/us
BitsetToArrayBenchmark.denseBranchLess                           30  thrpt    5  13.488 ± 0.641  ops/us
BitsetToArrayBenchmark.denseBranchLess                           40  thrpt    5  13.501 ± 0.375  ops/us
BitsetToArrayBenchmark.denseBranchLess                           50  thrpt    5  13.555 ± 0.384  ops/us
BitsetToArrayBenchmark.denseBranchLess                           60  thrpt    5  13.524 ± 0.498  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                        5  thrpt    5   8.521 ± 0.120  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       10  thrpt    5   6.315 ± 0.164  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       20  thrpt    5  11.531 ± 0.176  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       30  thrpt    5  11.493 ± 0.255  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       40  thrpt    5  11.535 ± 0.018  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       50  thrpt    5  11.539 ± 0.084  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       60  thrpt    5   9.100 ± 0.017  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                    5  thrpt    5  15.428 ± 0.155  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   10  thrpt    5  15.424 ± 0.282  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   20  thrpt    5  15.375 ± 0.341  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   30  thrpt    5  15.395 ± 0.121  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   40  thrpt    5  15.308 ± 0.407  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   50  thrpt    5  15.322 ± 0.174  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   60  thrpt    5  15.439 ± 0.064  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                   5  thrpt    5  15.795 ± 0.380  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  10  thrpt    5  15.827 ± 0.228  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  20  thrpt    5  15.672 ± 0.991  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  30  thrpt    5  15.789 ± 0.327  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  40  thrpt    5  15.764 ± 0.350  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  50  thrpt    5  15.725 ± 0.393  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  60  thrpt    5  15.868 ± 0.028  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                  5  thrpt    5  25.889 ± 0.471  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 10  thrpt    5  25.975 ± 0.129  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 20  thrpt    5  25.852 ± 0.299  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 30  thrpt    5  25.888 ± 0.371  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 40  thrpt    5  25.708 ± 1.028  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 50  thrpt    5  25.856 ± 0.612  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 60  thrpt    5  25.931 ± 0.144  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512               5  thrpt    5  28.221 ± 0.545  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              10  thrpt    5  28.306 ± 0.209  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              20  thrpt    5  26.827 ± 1.704  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              30  thrpt    5  27.027 ± 0.214  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              40  thrpt    5  26.504 ± 0.909  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              50  thrpt    5  25.725 ± 0.084  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              60  thrpt    5  25.495 ± 1.521  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2           5  thrpt    5   1.137 ± 0.473  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          10  thrpt    5   0.856 ± 0.312  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          20  thrpt    5   0.171 ± 0.091  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          30  thrpt    5   0.159 ± 0.072  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          40  thrpt    5   0.097 ± 0.042  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          50  thrpt    5   0.069 ± 0.021  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          60  thrpt    5   0.068 ± 0.041  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2              5  thrpt    5  20.310 ± 0.139  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             10  thrpt    5  20.125 ± 0.352  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             20  thrpt    5  19.961 ± 0.653  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             30  thrpt    5  20.025 ± 1.040  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             40  thrpt    5  20.051 ± 0.556  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             50  thrpt    5  20.128 ± 0.131  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             60  thrpt    5  19.769 ± 2.266  ops/us
BitsetToArrayBenchmark.denseInvert                                5  thrpt    5  19.958 ± 0.355  ops/us
BitsetToArrayBenchmark.denseInvert                               10  thrpt    5  13.497 ± 0.826  ops/us
BitsetToArrayBenchmark.denseInvert                               20  thrpt    5   6.995 ± 0.093  ops/us
BitsetToArrayBenchmark.denseInvert                               30  thrpt    5   4.579 ± 0.035  ops/us
BitsetToArrayBenchmark.denseInvert                               40  thrpt    5   4.447 ± 0.028  ops/us
BitsetToArrayBenchmark.denseInvert                               50  thrpt    5   4.082 ± 0.051  ops/us
BitsetToArrayBenchmark.denseInvert                               60  thrpt    5   6.732 ± 0.145  ops/us
BitsetToArrayBenchmark.forLoop                                    5  thrpt    5  26.332 ± 0.080  ops/us
BitsetToArrayBenchmark.forLoop                                   10  thrpt    5  21.765 ± 0.029  ops/us
BitsetToArrayBenchmark.forLoop                                   20  thrpt    5  15.878 ± 0.247  ops/us
BitsetToArrayBenchmark.forLoop                                   30  thrpt    5  12.606 ± 0.251  ops/us
BitsetToArrayBenchmark.forLoop                                   40  thrpt    5  10.440 ± 0.036  ops/us
BitsetToArrayBenchmark.forLoop                                   50  thrpt    5   8.875 ± 0.164  ops/us
BitsetToArrayBenchmark.forLoop                                   60  thrpt    5   7.735 ± 0.171  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                     5  thrpt    5  26.018 ± 0.586  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    10  thrpt    5  21.031 ± 0.364  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    20  thrpt    5  15.683 ± 0.266  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    30  thrpt    5  12.502 ± 0.056  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    40  thrpt    5  10.330 ± 0.212  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    50  thrpt    5   8.842 ± 0.020  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    60  thrpt    5   7.705 ± 0.172  ops/us
BitsetToArrayBenchmark.hybrid                                     5  thrpt    5  25.588 ± 0.491  ops/us
BitsetToArrayBenchmark.hybrid                                    10  thrpt    5  21.151 ± 0.403  ops/us
BitsetToArrayBenchmark.hybrid                                    20  thrpt    5  15.653 ± 0.263  ops/us
BitsetToArrayBenchmark.hybrid                                    30  thrpt    5  12.431 ± 0.027  ops/us
BitsetToArrayBenchmark.hybrid                                    40  thrpt    5  15.414 ± 0.032  ops/us
BitsetToArrayBenchmark.hybrid                                    50  thrpt    5  15.415 ± 0.065  ops/us
BitsetToArrayBenchmark.hybrid                                    60  thrpt    5  15.188 ± 0.806  ops/us
BitsetToArrayBenchmark.whileLoop                                  5  thrpt    5  29.224 ± 0.503  ops/us
BitsetToArrayBenchmark.whileLoop                                 10  thrpt    5  23.237 ± 0.697  ops/us
BitsetToArrayBenchmark.whileLoop                                 20  thrpt    5  16.777 ± 0.278  ops/us
BitsetToArrayBenchmark.whileLoop                                 30  thrpt    5  13.019 ± 0.213  ops/us
BitsetToArrayBenchmark.whileLoop                                 40  thrpt    5  10.700 ± 0.095  ops/us
BitsetToArrayBenchmark.whileLoop                                 50  thrpt    5   9.047 ± 0.015  ops/us
BitsetToArrayBenchmark.whileLoop                                 60  thrpt    5   7.786 ± 0.224  ops/us
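
For readers unfamiliar with the term, a branchless conversion of a single 64-bit word can be sketched as below. This illustrates the general technique only and is not the code behind the `denseBranchLess*` benchmarks; the class and method names are hypothetical, and the sketch assumes `out` has at least 64 free slots starting at `offset`.

```java
final class BranchlessSketch {

  /**
   * Writes base + i for every set bit i of word, without branching on the bit value.
   * Each iteration stores unconditionally and advances the offset by 0 or 1.
   * Requires out.length >= offset + 64. Returns the new offset.
   */
  static int wordToArrayBranchless(long word, int base, int[] out, int offset) {
    for (int i = 0; i < Long.SIZE; i++) {
      out[offset] = base + i;              // always store; overwritten if bit i is clear
      offset += (int) ((word >>> i) & 1L); // keep the store only if bit i is set
    }
    return offset;
  }

  public static void main(String[] args) {
    int[] out = new int[64];
    int size = wordToArrayBranchless(0b1001L | (1L << 63), 1000, out, 0);
    // prints [1000, 1003, 1063]
    System.out.println(java.util.Arrays.toString(java.util.Arrays.copyOf(out, size)));
  }
}
```

Because the loop always performs 64 iterations with no data-dependent branch, its cost should not depend on how many bits are set. That would be consistent with the flat `denseBranchLess` scores above, assuming `bitCount` is the number of set bits in a 64-bit word.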

@jpountz
Contributor

jpountz commented Jul 13, 2025

Thank you for updating the benchmark. I suggest we first figure out how to handle compress() on #14896 before coming back to this PR.


@gf2121
Contributor Author

gf2121 commented Jul 13, 2025

> I suggest we first figure out how to handle compress() on #14896 before coming back to this PR.

+1, I'm tracking this PR as well.

