Vectorize bitset to array #14910

gf2121 · 2025-07-07T12:22:40Z

This is a minimal proof of concept for vectorizing the conversion of a bitset into an array, which can be a hot path when postings are encoded as a bitset. This version currently only runs on AVX512, but it can be adapted to more platforms in the future.

Benchmark                             (bitSetSize)   Mode  Cnt      Score      Error   Units
BitsetToArrayBenchmark.baseline                128  thrpt    5   5477.202 ±   36.920  ops/ms
BitsetToArrayBenchmark.baseline                256  thrpt    5   6197.595 ±   92.064  ops/ms
BitsetToArrayBenchmark.baseline                512  thrpt    5   7121.446 ±  113.840  ops/ms
BitsetToArrayBenchmark.baseline                768  thrpt    5   7361.335 ±  286.118  ops/ms
BitsetToArrayBenchmark.vectorized512           128  thrpt    5  85321.831 ± 1539.445  ops/ms
BitsetToArrayBenchmark.vectorized512           256  thrpt    5  58632.773 ± 1130.691  ops/ms
BitsetToArrayBenchmark.vectorized512           512  thrpt    5  48780.092 ±  958.403  ops/ms
BitsetToArrayBenchmark.vectorized512           768  thrpt    5  29373.799 ±  392.238  ops/ms
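
For readers unfamiliar with the operation being benchmarked, here is a minimal scalar sketch of a bitset-to-array conversion. It is not the code from this PR: the class name `BitsetToArrayScalarSketch`, the method `toArray`, and the `base` offset parameter are hypothetical, and the actual `baseline` benchmark may be implemented differently.

```java
/** Hypothetical scalar reference for illustration; not the PR's implementation. */
final class BitsetToArrayScalarSketch {

  /**
   * Writes {@code base + i} into {@code out} for every set bit {@code i} of the bitset
   * stored in {@code words} (64 bits per long, least significant bit first).
   *
   * @return the number of values written
   */
  static int toArray(long[] words, int base, int[] out) {
    int size = 0;
    for (int w = 0; w < words.length; w++) {
      long word = words[w];
      int wordBase = base + (w << 6); // first bit index covered by this word
      while (word != 0L) {
        // Position of the lowest set bit in the current word.
        out[size++] = wordBase + Long.numberOfTrailingZeros(word);
        word &= word - 1; // clear the lowest set bit
      }
    }
    return size;
  }
}
```

The caller is assumed to provide an `out` array with room for every set bit. The cost of this loop grows with the number of set bits, which is the case a vectorized version aims to improve.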

github-actions · 2025-07-07T12:23:35Z

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

uschindler · 2025-07-07T15:32:47Z

This cannot be merged without adding this to the java24 part and removing the requirement on the incubator module for JMH.

I assume this is only meant for quick checks and stays draft?

gf2121 · 2025-07-07T15:48:35Z

Thanks for the reminder!

> I assume this is only meant for quick checks and stays draft?

Yes. Once the code is integrated into VectorUtil, the benchmark will call VectorUtil directly and will no longer require the incubator module, just like the other benchmarks.

gf2121 · 2025-07-09T11:20:20Z

I managed to get some luceneutil data on AVX512:

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       76.86      (3.2%)       76.20      (5.5%)   -0.9% (  -9% -    8%) 0.552
                         Prefix3       81.74      (3.1%)       81.13      (4.9%)   -0.7% (  -8% -    7%) 0.567
                AndMedOrHighHigh       22.83      (1.8%)       22.78      (2.0%)   -0.3% (  -3% -    3%) 0.670
                   TermMonthSort     1184.75      (2.7%)     1182.73      (8.4%)   -0.2% ( -10% -   11%) 0.931
              CombinedOrHighHigh        7.95      (2.8%)        7.94      (2.7%)   -0.2% (  -5% -    5%) 0.844
             And2Terms2StopWords       66.71      (2.5%)       66.61      (4.6%)   -0.1% (  -7% -    7%) 0.903
                          Fuzzy1       33.29      (2.5%)       33.26      (3.4%)   -0.1% (  -5% -    5%) 0.929
             FilteredOrStopWords       11.72      (2.0%)       11.71      (3.5%)   -0.1% (  -5% -    5%) 0.929
                    FilteredTerm       68.60      (1.6%)       68.56      (4.1%)   -0.0% (  -5% -    5%) 0.962
             CountFilteredIntNRQ       24.33      (1.1%)       24.33      (2.4%)   -0.0% (  -3% -    3%) 0.991
                          IntNRQ       55.49      (1.2%)       55.51      (3.1%)    0.0% (  -4% -    4%) 0.959
      FilteredOr2Terms2StopWords       59.21      (2.1%)       59.27      (4.5%)    0.1% (  -6% -    6%) 0.929
                 CountOrHighHigh       63.85      (1.0%)       63.93      (2.4%)    0.1% (  -3% -    3%) 0.827
             CountFilteredPhrase       12.29      (1.6%)       12.31      (1.9%)    0.1% (  -3% -    3%) 0.800
                      AndHighMed       72.10      (2.0%)       72.23      (4.2%)    0.2% (  -5% -    6%) 0.860
                          Phrase        9.76      (2.2%)        9.78      (3.3%)    0.2% (  -5% -    5%) 0.823
                  CountOrHighMed       93.67      (1.3%)       93.87      (2.7%)    0.2% (  -3% -    4%) 0.746
                 CountAndHighMed       90.17      (1.0%)       90.37      (2.2%)    0.2% (  -2% -    3%) 0.690
                      DismaxTerm      331.70      (3.2%)      332.55      (6.0%)    0.3% (  -8% -    9%) 0.867
               FilteredAnd3Terms      105.66      (2.0%)      105.96      (2.9%)    0.3% (  -4% -    5%) 0.717
          CountFilteredOrHighMed       32.69      (1.4%)       32.78      (2.0%)    0.3% (  -3% -    3%) 0.588
                          Fuzzy2       30.47      (2.4%)       30.57      (3.4%)    0.3% (  -5% -    6%) 0.724
              Or2Terms2StopWords       68.51      (2.3%)       68.74      (4.7%)    0.3% (  -6% -    7%) 0.765
             CombinedAndHighHigh        8.07      (2.1%)        8.10      (2.2%)    0.4% (  -3% -    4%) 0.596
                   TermTitleSort       59.34      (2.8%)       59.55      (4.0%)    0.4% (  -6% -    7%) 0.739
                        Wildcard       48.97      (3.6%)       49.16      (4.6%)    0.4% (  -7% -    8%) 0.766
                CountAndHighHigh       63.48      (1.3%)       63.74      (2.2%)    0.4% (  -3% -    3%) 0.472
                      TermDTSort      191.53      (2.0%)      192.32      (5.5%)    0.4% (  -7% -    8%) 0.752
             FilteredAndHighHigh       16.87      (1.3%)       16.94      (2.3%)    0.4% (  -3% -    4%) 0.457
            FilteredAndStopWords       13.70      (1.6%)       13.76      (2.2%)    0.4% (  -3% -    4%) 0.468
         CountFilteredOrHighHigh       27.44      (0.9%)       27.56      (1.7%)    0.4% (  -2% -    3%) 0.305
                         Respell       27.54      (2.1%)       27.66      (2.4%)    0.4% (  -3% -    4%) 0.534
                    CombinedTerm       16.58      (2.9%)       16.66      (3.0%)    0.5% (  -5% -    6%) 0.621
                       OrHighMed       87.71      (2.3%)       88.12      (4.9%)    0.5% (  -6% -    7%) 0.702
                            Term      421.54      (3.5%)      423.51      (5.9%)    0.5% (  -8% -   10%) 0.761
                  FilteredIntNRQ       54.79      (1.9%)       55.09      (2.7%)    0.5% (  -4% -    5%) 0.468
              FilteredOrHighHigh       18.14      (1.8%)       18.24      (3.5%)    0.5% (  -4% -    5%) 0.539
                 DismaxOrHighMed       57.31      (1.9%)       57.65      (5.4%)    0.6% (  -6% -    8%) 0.647
     FilteredAnd2Terms2StopWords       69.42      (1.9%)       69.84      (3.1%)    0.6% (  -4% -    5%) 0.450
               TermDayOfYearSort      317.00      (2.3%)      319.07      (3.9%)    0.7% (  -5% -    6%) 0.515
              FilteredAndHighMed       46.77      (1.5%)       47.11      (2.6%)    0.7% (  -3% -    4%) 0.270
                      OrHighRare      116.92      (4.6%)      117.89      (5.9%)    0.8% (  -9% -   11%) 0.620
                 AndHighOrMedMed       21.55      (2.1%)       21.74      (1.9%)    0.9% (  -3% -    5%) 0.172
                FilteredOr3Terms       52.71      (1.9%)       53.18      (4.2%)    0.9% (  -5% -    7%) 0.386
                  FilteredPhrase       12.77      (1.8%)       12.89      (3.2%)    0.9% (  -4% -    6%) 0.262
               FilteredOrHighMed       50.52      (2.4%)       50.99      (4.5%)    0.9% (  -5% -    8%) 0.416
               CombinedOrHighMed       28.48      (2.3%)       28.76      (4.4%)    1.0% (  -5% -    7%) 0.392
                DismaxOrHighHigh       39.77      (1.9%)       40.16      (3.4%)    1.0% (  -4% -    6%) 0.256
                       And3Terms       84.76      (2.1%)       85.63      (3.6%)    1.0% (  -4% -    6%) 0.272
                        Or3Terms       76.27      (1.3%)       77.08      (3.7%)    1.1% (  -3% -    6%) 0.226
              CombinedAndHighMed       29.04      (2.3%)       29.39      (4.1%)    1.2% (  -5% -    7%) 0.252
                        PKLookup       75.51      (1.3%)       76.44      (3.2%)    1.2% (  -3% -    5%) 0.112
                       CountTerm     2847.75      (5.6%)     2897.40      (8.8%)    1.7% ( -11% -   17%) 0.454
                      OrHighHigh       29.50      (2.0%)       30.17      (2.9%)    2.3% (  -2% -    7%) 0.004
                          IntSet      150.21      (4.2%)      154.28      (4.6%)    2.7% (  -5% -   11%) 0.051
                     AndHighHigh       30.04      (1.9%)       31.63      (3.1%)    5.3% (   0% -   10%) 0.000
                    AndStopWords       10.60      (1.9%)       11.83      (2.0%)   11.6% (   7% -   15%) 0.000
                     OrStopWords       11.41      (2.8%)       13.26      (3.0%)   16.2% (  10% -   22%) 0.000

gf2121 · 2025-07-10T16:36:06Z

Some more data:

Mac M2

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                       CountTerm    12276.27     (12.1%)    11998.30      (7.3%)   -2.3% ( -19% -   19%) 0.563
                   TermMonthSort     4162.59      (8.8%)     4111.81      (3.4%)   -1.2% ( -12% -   12%) 0.641
                CountAndHighHigh       84.34      (2.6%)       83.54      (2.5%)   -0.9% (  -5% -    4%) 0.342
          CountFilteredOrHighMed       48.75      (4.9%)       48.31      (3.8%)   -0.9% (  -9% -    8%) 0.591
         CountFilteredOrHighHigh       39.65      (4.2%)       39.30      (3.2%)   -0.9% (  -7% -    6%) 0.543

                                                    ...

                      OrHighHigh       48.67     (12.7%)       52.67      (2.7%)    8.2% (  -6% -   27%) 0.023
                    AndStopWords       16.25      (9.7%)       17.63      (4.3%)    8.5% (  -5% -   24%) 0.004
                     AndHighHigh       50.29     (13.5%)       55.32      (2.5%)   10.0% (  -5% -   30%) 0.009
                     OrStopWords       18.18     (10.6%)       20.61      (3.1%)   13.4% (   0% -   30%) 0.000

AVX512 (mentioned above)

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       76.86      (3.2%)       76.20      (5.5%)   -0.9% (  -9% -    8%) 0.552
                         Prefix3       81.74      (3.1%)       81.13      (4.9%)   -0.7% (  -8% -    7%) 0.567
                AndMedOrHighHigh       22.83      (1.8%)       22.78      (2.0%)   -0.3% (  -3% -    3%) 0.670
                   TermMonthSort     1184.75      (2.7%)     1182.73      (8.4%)   -0.2% ( -10% -   11%) 0.931
              CombinedOrHighHigh        7.95      (2.8%)        7.94      (2.7%)   -0.2% (  -5% -    5%) 0.844
             And2Terms2StopWords       66.71      (2.5%)       66.61      (4.6%)   -0.1% (  -7% -    7%) 0.903

                                                    ...
 
                       CountTerm     2847.75      (5.6%)     2897.40      (8.8%)    1.7% ( -11% -   17%) 0.454
                      OrHighHigh       29.50      (2.0%)       30.17      (2.9%)    2.3% (  -2% -    7%) 0.004
                          IntSet      150.21      (4.2%)      154.28      (4.6%)    2.7% (  -5% -   11%) 0.051
                     AndHighHigh       30.04      (1.9%)       31.63      (3.1%)    5.3% (   0% -   10%) 0.000
                    AndStopWords       10.60      (1.9%)       11.83      (2.0%)   11.6% (   7% -   15%) 0.000
                     OrStopWords       11.41      (2.8%)       13.26      (3.0%)   16.2% (  10% -   22%) 0.000

Same AVX512 machine, without --add-modules=jdk.incubator.vector

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                 FilteredPrefix3       74.47      (3.7%)       73.32      (4.0%)   -1.5% (  -8% -    6%) 0.210
                         Prefix3       79.35      (3.9%)       78.23      (3.5%)   -1.4% (  -8% -    6%) 0.232
                       CountTerm     2921.10      (6.2%)     2897.44      (7.5%)   -0.8% ( -13% -   13%) 0.708
             And2Terms2StopWords       62.09      (1.7%)       61.80      (2.6%)   -0.5% (  -4% -    3%) 0.482

                                                    ...
 
                      OrHighHigh       27.33      (2.4%)       27.66      (2.0%)    1.2% (  -3% -    5%) 0.092
                      OrHighRare      116.98      (3.3%)      118.89      (2.7%)    1.6% (  -4% -    7%) 0.088
                     AndHighHigh       27.52      (2.0%)       28.02      (1.5%)    1.8% (  -1% -    5%) 0.001
                    AndStopWords       10.60      (3.0%)       11.01      (1.7%)    3.8% (   0% -    8%) 0.000
                     OrStopWords       11.26      (3.6%)       11.97      (2.2%)    6.3% (   0% -   12%) 0.000

jpountz · 2025-07-10T21:32:42Z

This is very cool and the speedup makes sense to me. When dynamic pruning is enabled, only queries whose leading clauses are dense benefit significantly from this speedup (OrStopWords and AndStopWords). But if you benchmarked exhaustive evaluation, I'm sure we'd see a bigger speedup on all disjunctive queries that have at least one dense postings list.

Like for #14896, I'd like to split this PR in two: one where we merge your scalar improvements, and then this one where we add support for vectorization. By the way, we may want to look into other approaches for the scalar case. Since we only use bit sets in postings when many bits would be set, a linear scan should perform quite efficiently? (foreach (bit in 0..n) { if bitSet.get(bit) out.append(bit); }) I imagine that you used a micro benchmark to come up with your manual unrolling. Could we include this micro benchmark in the PR?
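
The linear scan suggested in the comment above could look roughly like the following sketch. It uses a plain `long[]` in place of Lucene's bit set classes so that it is self-contained; the names `LinearScanSketch`, `get`, and `linearScan`, as well as the `base` offset, are hypothetical and not taken from the PR.

```java
/** Hypothetical linear scan over a bitset; an illustration, not Lucene code. */
final class LinearScanSketch {

  /** Returns true if the given bit is set in a bitset stored as 64-bit words. */
  static boolean get(long[] words, int bit) {
    // A long shift in Java only uses the low 6 bits of the shift count,
    // so (1L << bit) selects bit (bit % 64) within the word.
    return (words[bit >> 6] & (1L << bit)) != 0;
  }

  /**
   * Tests every bit in [0, numBits) and appends {@code base + bit} for each set bit.
   *
   * @return the number of values written to {@code out}
   */
  static int linearScan(long[] words, int numBits, int base, int[] out) {
    int size = 0;
    for (int bit = 0; bit < numBits; bit++) {
      if (get(words, bit)) {
        out[size++] = base + bit;
      }
    }
    return size;
  }
}
```

This variant does work proportional to the number of bits rather than the number of set bits, which is why it is only attractive when the bitset is known to be dense.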

gf2121 · 2025-07-13T07:13:58Z

JMH results with the vectorized implementations:

Benchmark                                                (bitCount)   Mode  Cnt   Score   Error   Units
BitsetToArrayBenchmark.dense                                      5  thrpt    5   9.583 ± 0.238  ops/us
BitsetToArrayBenchmark.dense                                     10  thrpt    5   6.926 ± 0.151  ops/us
BitsetToArrayBenchmark.dense                                     20  thrpt    5   4.597 ± 0.042  ops/us
BitsetToArrayBenchmark.dense                                     30  thrpt    5   3.420 ± 0.033  ops/us
BitsetToArrayBenchmark.dense                                     40  thrpt    5   3.766 ± 0.013  ops/us
BitsetToArrayBenchmark.dense                                     50  thrpt    5   5.299 ± 0.126  ops/us
BitsetToArrayBenchmark.dense                                     60  thrpt    5   8.991 ± 0.223  ops/us
BitsetToArrayBenchmark.denseBranchLess                            5  thrpt    5  13.520 ± 0.132  ops/us
BitsetToArrayBenchmark.denseBranchLess                           10  thrpt    5  13.440 ± 0.575  ops/us
BitsetToArrayBenchmark.denseBranchLess                           20  thrpt    5  13.521 ± 0.289  ops/us
BitsetToArrayBenchmark.denseBranchLess                           30  thrpt    5  13.488 ± 0.641  ops/us
BitsetToArrayBenchmark.denseBranchLess                           40  thrpt    5  13.501 ± 0.375  ops/us
BitsetToArrayBenchmark.denseBranchLess                           50  thrpt    5  13.555 ± 0.384  ops/us
BitsetToArrayBenchmark.denseBranchLess                           60  thrpt    5  13.524 ± 0.498  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                        5  thrpt    5   8.521 ± 0.120  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       10  thrpt    5   6.315 ± 0.164  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       20  thrpt    5  11.531 ± 0.176  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       30  thrpt    5  11.493 ± 0.255  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       40  thrpt    5  11.535 ± 0.018  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       50  thrpt    5  11.539 ± 0.084  ops/us
BitsetToArrayBenchmark.denseBranchLessCmov                       60  thrpt    5   9.100 ± 0.017  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                    5  thrpt    5  15.428 ± 0.155  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   10  thrpt    5  15.424 ± 0.282  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   20  thrpt    5  15.375 ± 0.341  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   30  thrpt    5  15.395 ± 0.121  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   40  thrpt    5  15.308 ± 0.407  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   50  thrpt    5  15.322 ± 0.174  ops/us
BitsetToArrayBenchmark.denseBranchLessParallel                   60  thrpt    5  15.439 ± 0.064  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                   5  thrpt    5  15.795 ± 0.380  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  10  thrpt    5  15.827 ± 0.228  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  20  thrpt    5  15.672 ± 0.991  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  30  thrpt    5  15.789 ± 0.327  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  40  thrpt    5  15.764 ± 0.350  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  50  thrpt    5  15.725 ± 0.393  ops/us
BitsetToArrayBenchmark.denseBranchLessUnrolling                  60  thrpt    5  15.868 ± 0.028  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                  5  thrpt    5  25.889 ± 0.471  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 10  thrpt    5  25.975 ± 0.129  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 20  thrpt    5  25.852 ± 0.299  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 30  thrpt    5  25.888 ± 0.371  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 40  thrpt    5  25.708 ± 1.028  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 50  thrpt    5  25.856 ± 0.612  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized                 60  thrpt    5  25.931 ± 0.144  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512               5  thrpt    5  28.221 ± 0.545  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              10  thrpt    5  28.306 ± 0.209  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              20  thrpt    5  26.827 ± 1.704  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              30  thrpt    5  27.027 ± 0.214  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              40  thrpt    5  26.504 ± 0.909  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              50  thrpt    5  25.725 ± 0.084  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512              60  thrpt    5  25.495 ± 1.521  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2           5  thrpt    5   1.137 ± 0.473  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          10  thrpt    5   0.856 ± 0.312  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          20  thrpt    5   0.171 ± 0.091  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          30  thrpt    5   0.159 ± 0.072  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          40  thrpt    5   0.097 ± 0.042  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          50  thrpt    5   0.069 ± 0.021  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorized512AVX2          60  thrpt    5   0.068 ± 0.041  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2              5  thrpt    5  20.310 ± 0.139  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             10  thrpt    5  20.125 ± 0.352  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             20  thrpt    5  19.961 ± 0.653  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             30  thrpt    5  20.025 ± 1.040  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             40  thrpt    5  20.051 ± 0.556  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             50  thrpt    5  20.128 ± 0.131  ops/us
BitsetToArrayBenchmark.denseBranchLessVectorizedAVX2             60  thrpt    5  19.769 ± 2.266  ops/us
BitsetToArrayBenchmark.denseInvert                                5  thrpt    5  19.958 ± 0.355  ops/us
BitsetToArrayBenchmark.denseInvert                               10  thrpt    5  13.497 ± 0.826  ops/us
BitsetToArrayBenchmark.denseInvert                               20  thrpt    5   6.995 ± 0.093  ops/us
BitsetToArrayBenchmark.denseInvert                               30  thrpt    5   4.579 ± 0.035  ops/us
BitsetToArrayBenchmark.denseInvert                               40  thrpt    5   4.447 ± 0.028  ops/us
BitsetToArrayBenchmark.denseInvert                               50  thrpt    5   4.082 ± 0.051  ops/us
BitsetToArrayBenchmark.denseInvert                               60  thrpt    5   6.732 ± 0.145  ops/us
BitsetToArrayBenchmark.forLoop                                    5  thrpt    5  26.332 ± 0.080  ops/us
BitsetToArrayBenchmark.forLoop                                   10  thrpt    5  21.765 ± 0.029  ops/us
BitsetToArrayBenchmark.forLoop                                   20  thrpt    5  15.878 ± 0.247  ops/us
BitsetToArrayBenchmark.forLoop                                   30  thrpt    5  12.606 ± 0.251  ops/us
BitsetToArrayBenchmark.forLoop                                   40  thrpt    5  10.440 ± 0.036  ops/us
BitsetToArrayBenchmark.forLoop                                   50  thrpt    5   8.875 ± 0.164  ops/us
BitsetToArrayBenchmark.forLoop                                   60  thrpt    5   7.735 ± 0.171  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                     5  thrpt    5  26.018 ± 0.586  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    10  thrpt    5  21.031 ± 0.364  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    20  thrpt    5  15.683 ± 0.266  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    30  thrpt    5  12.502 ± 0.056  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    40  thrpt    5  10.330 ± 0.212  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    50  thrpt    5   8.842 ± 0.020  ops/us
BitsetToArrayBenchmark.forLoopManualUnrolling                    60  thrpt    5   7.705 ± 0.172  ops/us
BitsetToArrayBenchmark.hybrid                                     5  thrpt    5  25.588 ± 0.491  ops/us
BitsetToArrayBenchmark.hybrid                                    10  thrpt    5  21.151 ± 0.403  ops/us
BitsetToArrayBenchmark.hybrid                                    20  thrpt    5  15.653 ± 0.263  ops/us
BitsetToArrayBenchmark.hybrid                                    30  thrpt    5  12.431 ± 0.027  ops/us
BitsetToArrayBenchmark.hybrid                                    40  thrpt    5  15.414 ± 0.032  ops/us
BitsetToArrayBenchmark.hybrid                                    50  thrpt    5  15.415 ± 0.065  ops/us
BitsetToArrayBenchmark.hybrid                                    60  thrpt    5  15.188 ± 0.806  ops/us
BitsetToArrayBenchmark.whileLoop                                  5  thrpt    5  29.224 ± 0.503  ops/us
BitsetToArrayBenchmark.whileLoop                                 10  thrpt    5  23.237 ± 0.697  ops/us
BitsetToArrayBenchmark.whileLoop                                 20  thrpt    5  16.777 ± 0.278  ops/us
BitsetToArrayBenchmark.whileLoop                                 30  thrpt    5  13.019 ± 0.213  ops/us
BitsetToArrayBenchmark.whileLoop                                 40  thrpt    5  10.700 ± 0.095  ops/us
BitsetToArrayBenchmark.whileLoop                                 50  thrpt    5   9.047 ± 0.015  ops/us
BitsetToArrayBenchmark.whileLoop                                 60  thrpt    5   7.786 ± 0.224  ops/us
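
The `denseBranchLess` rows above show nearly constant throughput regardless of `bitCount`, which is what one would expect from a loop whose work does not depend on the data. The sketch below shows one common way to write such a loop for a single 64-bit word. It is an assumption about what a branchless variant might look like, not the benchmark's actual code; `BranchlessScanSketch` and `toArrayBranchless` are hypothetical names.

```java
/** Hypothetical branchless conversion of one 64-bit word; not the PR's code. */
final class BranchlessScanSketch {

  /**
   * Stores a candidate value for every bit position and advances the output
   * cursor only when the bit is set, so the loop body has no data-dependent branch.
   * {@code out} must have room for at least 64 entries.
   *
   * @return the number of set bits, i.e. the number of valid entries in {@code out}
   */
  static int toArrayBranchless(long word, int base, int[] out) {
    int size = 0;
    for (int i = 0; i < Long.SIZE; i++) {
      out[size] = base + i;              // unconditional store of the candidate
      size += (int) ((word >>> i) & 1L); // keep it only if bit i is set
    }
    return size;
  }
}
```

Entries of `out` at or beyond the returned size may hold leftover candidates and should be ignored by the caller.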

jpountz · 2025-07-13T07:33:26Z

Thank you for updating the benchmark. I suggest we first figure out how to handle compress() on #14896 before coming back to this PR.

gf2121 · 2025-07-13T07:43:52Z

> I suggest we first figure out how to handle compress() on #14896 before coming back to this PR.

+1, I'm tracking this PR as well.

vectorize_bitset_to_array

7c7b333

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Jul 7, 2025

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Jul 7, 2025

gf2121 marked this pull request as draft July 7, 2025 12:22

iter

08d7a73

iter

505562f

github-actions bot added the module:core/codecs label Jul 10, 2025

gf2121 added 3 commits July 10, 2025 23:43

iter

ab79f5e

iter

0f13d7a

iter

82a609c

license

df47bdd

gf2121 marked this pull request as ready for review July 10, 2025 16:36

gf2121 mentioned this pull request Jul 11, 2025

Optimize bitset to array #14935

Merged

gf2121 added 2 commits July 13, 2025 14:22

Merge remote-tracking branch 'origin/main' into vectorize_bitset_to_a…

52d80c7

…rray

follow another PR

7f9eddb

Merge remote-tracking branch 'origin/main' into vectorize_bitset_to_a…

0e49d07

…rray

iter

5b3a859

remove dense patch

a34de82
