Skip to content

Conversation

@e-dard
Copy link
Contributor

@e-dard e-dard commented Jul 13, 2021

Which issue does this PR close?

Closes #543

Rationale for this change

The cost of building a comparator (initialising a DynComparator) is often significantly higher than the actual cost of executing the comparator's closure on two row IDs. Therefore it makes sense to build the comparator once, and re-use the returned DynComparator for each row you are comparing the two arrays on.

Due to the explicit lifetime of DynComparator I recently found it problematic to store the DynComparator on an object that was used across threads in an async environment.

By refactoring the build_compare implementations I was able to remove the lifetime and it's much easier to use in the Datafusion operator I'm improving.

As a nice side-effect, the sort kernel benchmarks show an improvement in performance of around 2–5%.

$ critcmp master pr

group                                          master                          pr
-----                                          ------                          --
bool sort 2^12                                 1.03    310.8±1.34µs            1.00    302.8±7.78µs
bool sort nulls 2^12                           1.01    287.4±2.22µs            1.00    284.0±3.23µs
sort 2^10                                      1.04     98.7±3.58µs            1.00     94.6±0.50µs
sort 2^12                                      1.05    510.7±5.56µs            1.00    486.2±9.94µs
sort 2^12 limit 10                             1.05     48.1±0.38µs            1.00     45.6±0.30µs
sort 2^12 limit 100                            1.04     52.8±0.37µs            1.00     50.6±0.41µs
sort 2^12 limit 1000                           1.06    141.1±0.94µs            1.00    132.7±0.95µs
sort 2^12 limit 2^12                           1.03    501.2±4.01µs            1.00    486.5±4.87µs
sort nulls 2^10                                1.02     70.9±0.72µs            1.00     69.4±0.51µs
sort nulls 2^12                                1.02    369.7±3.51µs            1.00   363.0±18.52µs
sort nulls 2^12 limit 10                       1.01     70.6±1.22µs            1.00     70.0±1.27µs
sort nulls 2^12 limit 100                      1.00     71.7±0.82µs            1.00     71.8±1.60µs
sort nulls 2^12 limit 1000                     1.01     80.5±1.55µs            1.00     79.4±1.41µs
sort nulls 2^12 limit 2^12                     1.05    375.4±4.78µs            1.00    356.1±3.04µs

What changes are included in this PR?

I have refactored DynComparator to remove the explicit lifetime. I did this by removing the need for the DynComparator instantiations to close over references. Instead we just move in owned arrays by cheaply cloning the underlying ArrayData.

Further, I took the liberty of making DynComparator Send+Sync. Again, this helps when you want to store it on types that need to be Send + Sync.

Relative to how often the closure will be called I don't see this as an expensive change.

This commit removes the need for an explicit lifetime on the `DynComparator`.

The rationale behind this change is that callers may wish to share this comparator amongst threads and the explicit lifetime makes this harder to achieve.

As a nice side-effect, performance of the sort kernel seems to have improved:

```
$ critcmp master pr

group                                          master                          pr
-----                                          ------                          --
bool sort 2^12                                 1.03    310.8±1.34µs            1.00    302.8±7.78µs
bool sort nulls 2^12                           1.01    287.4±2.22µs            1.00    284.0±3.23µs
sort 2^10                                      1.04     98.7±3.58µs            1.00     94.6±0.50µs
sort 2^12                                      1.05    510.7±5.56µs            1.00    486.2±9.94µs
sort 2^12 limit 10                             1.05     48.1±0.38µs            1.00     45.6±0.30µs
sort 2^12 limit 100                            1.04     52.8±0.37µs            1.00     50.6±0.41µs
sort 2^12 limit 1000                           1.06    141.1±0.94µs            1.00    132.7±0.95µs
sort 2^12 limit 2^12                           1.03    501.2±4.01µs            1.00    486.5±4.87µs
sort nulls 2^10                                1.02     70.9±0.72µs            1.00     69.4±0.51µs
sort nulls 2^12                                1.02    369.7±3.51µs            1.00   363.0±18.52µs
sort nulls 2^12 limit 10                       1.01     70.6±1.22µs            1.00     70.0±1.27µs
sort nulls 2^12 limit 100                      1.00     71.7±0.82µs            1.00     71.8±1.60µs
sort nulls 2^12 limit 1000                     1.01     80.5±1.55µs            1.00     79.4±1.41µs
sort nulls 2^12 limit 2^12                     1.05    375.4±4.78µs            1.00    356.1±3.04µs
```
@github-actions github-actions bot added the arrow Changes to the arrow crate label Jul 13, 2021
Copy link
Member

@jorgecarleitao jorgecarleitao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Thanks, @e-dard .

I am not sure why there is this performance improvement, but I agree that cloning the array is cheap and give us the added bonus of Send+Sync of the comparator that some use-cases benefit.

Could you add an issue and link it to this PR (via Closes #issue), so that this shows on the changelog?

Note that this backward incompatible as code using the lifetime qualifier stops compiling.

@alamb alamb added the api-change Changes to the arrow API label Jul 13, 2021
@alamb
Copy link
Contributor

alamb commented Jul 13, 2021

I have filed #543 issue for this change

@codecov-commenter
Copy link

Codecov Report

Merging #542 (5d21d29) into master (80a57be) will increase coverage by 0.06%.
The diff coverage is 81.25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #542      +/-   ##
==========================================
+ Coverage   82.54%   82.60%   +0.06%     
==========================================
  Files         167      167              
  Lines       45957    45984      +27     
==========================================
+ Hits        37934    37985      +51     
+ Misses       8023     7999      -24     
Impacted Files Coverage Δ
arrow/src/compute/kernels/sort.rs 94.15% <ø> (ø)
arrow/src/array/ord.rs 63.33% <81.25%> (ø)
arrow/src/array/array_string.rs 97.85% <0.00%> (+0.08%) ⬆️
arrow/src/compute/kernels/take.rs 94.72% <0.00%> (+0.28%) ⬆️
arrow/src/array/builder.rs 86.11% <0.00%> (+0.33%) ⬆️
arrow/src/buffer/immutable.rs 97.84% <0.00%> (+0.55%) ⬆️
arrow/src/buffer/mutable.rs 90.10% <0.00%> (+5.76%) ⬆️
arrow/src/array/transform/boolean.rs 84.61% <0.00%> (+7.69%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 80a57be...5d21d29. Read the comment docs.

@nevi-me nevi-me merged commit fde79a2 into apache:master Jul 14, 2021
@alamb alamb added api-change Changes to the arrow API and removed api-change Changes to the arrow API labels Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove lifetime from DynComparator

5 participants