Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix string view LIKE checks with NULL values #6662

Conversation

findepi
Copy link
Member

@findepi findepi commented Oct 31, 2024

Which issue does this PR close?

Rationale for this change

Fix correctness

What changes are included in this PR?

Fix StringView LIKE .. result when tested values include nulls.

Are there any user-facing changes?

Yes

@findepi
Copy link
Member Author

findepi commented Oct 31, 2024

cc @goldmedal

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @findepi -- this is looking good. Nice find.

.map(|haystack| finder.find(haystack).is_some() != negate)
.enumerate()
.map(|(idx, haystack)| {
if nulls.as_ref().map(|n| n.is_null(idx)).unwrap_or_default() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my concern with this approach is that it adds code to the inner loop of the kernel.

We should definitely run benchmarks on this PR before merging

I think another way to solve this would be to either:

  1. Have two loops -- one when nulls was Some and the other when nulls was None (so we aren't checking each row)
  2. Leave the computation as is (don't check for nulls on the input) but instead simply copy the Nulls array from the input to the output (so any input that was null would also be null in the output)

Copy link
Contributor

@tustvold tustvold Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I am not sure why this isn't using from_unary like the other codepaths which would provide option 2, and be significantly more efficient than first collecting into a Vec...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now i am more concerned about correctness. Producing fast but wrong results isn't as useful.
i changed contains to to use from_unary, that seemed trivially good.
@tustvold might this be it doesn't use from_unary because it wants to use prefix_bytes_iter and suffix_bytes_iter? Please advise! I think we can switch all operations to from_unary to fix correctness and leave tuning performance to a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am running some benchmarks to see what, if any, impact this has in performance

cargo bench --bench comparison_kernels -- like

I will report back shortly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ilike_utf8 scalar complex                          1.00      2.7±0.05ms        ? ?/sec                            1.03      2.8±0.09ms        ? ?/sec
ilike_utf8 scalar contains                         1.01      4.2±0.05ms        ? ?/sec                            1.00      4.2±0.07ms        ? ?/sec
ilike_utf8 scalar ends with                        1.02  1279.8±38.11µs        ? ?/sec                            1.00  1258.4±37.50µs        ? ?/sec
ilike_utf8 scalar equals                           1.03   801.1±23.96µs        ? ?/sec                            1.00   775.4±20.05µs        ? ?/sec
ilike_utf8 scalar starts with                      1.00  1165.9±38.61µs        ? ?/sec                            1.02  1193.0±54.84µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     77.7±0.09µs        ? ?/sec                            1.00     77.6±0.07µs        ? ?/sec
like_utf8 scalar complex                           1.00  1924.8±43.83µs        ? ?/sec                            1.00  1924.7±49.77µs        ? ?/sec
like_utf8 scalar contains                          1.00  1562.1±14.30µs        ? ?/sec                            1.00  1567.9±23.23µs        ? ?/sec
like_utf8 scalar ends with                         1.02   432.7±11.70µs        ? ?/sec                            1.00    425.2±5.10µs        ? ?/sec
like_utf8 scalar equals                            1.16    127.0±0.17µs        ? ?/sec                            1.00    109.6±0.20µs        ? ?/sec
like_utf8 scalar starts with                       1.00    339.2±4.42µs        ? ?/sec                            1.02   347.4±15.72µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     77.6±0.20µs        ? ?/sec                            1.00     77.4±0.12µs        ? ?/sec
like_utf8view scalar complex                       1.00    182.8±0.58ms        ? ?/sec                            1.01    184.9±0.67ms        ? ?/sec
like_utf8view scalar contains                      1.02    141.3±0.34ms        ? ?/sec                            1.00    138.2±0.32ms        ? ?/sec
like_utf8view scalar ends with 13 bytes            1.49     66.4±0.42ms        ? ?/sec                            1.00     44.7±0.27ms        ? ?/sec
like_utf8view scalar ends with 4 bytes             1.45     66.1±0.33ms        ? ?/sec                            1.00     45.4±0.35ms        ? ?/sec
like_utf8view scalar ends with 6 bytes             1.46     66.0±0.35ms        ? ?/sec                            1.00     45.2±0.32ms        ? ?/sec
like_utf8view scalar equals                        1.00     33.9±0.12ms        ? ?/sec                            1.11     37.8±0.10ms        ? ?/sec
like_utf8view scalar starts with 13 bytes          1.55     68.6±0.36ms        ? ?/sec                            1.00     44.2±0.33ms        ? ?/sec
like_utf8view scalar starts with 4 bytes           1.57     42.6±0.18ms        ? ?/sec                            1.00     27.2±0.13ms        ? ?/sec
like_utf8view scalar starts with 6 bytes           1.53     69.0±0.32ms        ? ?/sec                            1.00     45.2±0.28ms        ? ?/sec
nilike_utf8 scalar complex                         1.00      2.7±0.07ms        ? ?/sec                            1.00      2.7±0.06ms        ? ?/sec
nilike_utf8 scalar contains                        1.02      4.2±0.05ms        ? ?/sec                            1.00      4.1±0.05ms        ? ?/sec
nilike_utf8 scalar ends with                       1.01  1259.7±41.60µs        ? ?/sec                            1.00  1247.7±26.01µs        ? ?/sec
nilike_utf8 scalar equals                          1.05   801.3±25.80µs        ? ?/sec                            1.00   765.8±17.79µs        ? ?/sec
nilike_utf8 scalar starts with                     1.02  1171.4±33.96µs        ? ?/sec                            1.00  1146.6±33.64µs        ? ?/sec
nlike_utf8 scalar complex                          1.00  1918.4±31.09µs        ? ?/sec                            1.00  1924.4±43.22µs        ? ?/sec
nlike_utf8 scalar contains                         1.00  1566.2±19.53µs        ? ?/sec                            1.00  1566.7±16.01µs        ? ?/sec
nlike_utf8 scalar ends with                        1.00    428.5±5.39µs        ? ?/sec                            1.02   438.5±14.26µs        ? ?/sec
nlike_utf8 scalar equals                           1.16    127.0±0.15µs        ? ?/sec                            1.00    109.5±0.12µs        ? ?/sec
nlike_utf8 scalar starts with                      1.07   359.3±21.26µs        ? ?/sec                            1.00    337.2±4.81µs        ? ?/sec

It looks like there are some non trivial regressions (50%) or utf8view with scalar starts/ends.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did run benchmarks locally too. Thanks @alamb for the instructions.

First, on the commit 1621350 where the PR branch starts

git checkout
16213501769371c7d2d6ce299755337b6db4d8fd
caffeinate -ds cargo bench --bench comparison_kernels -- --save-baseline main like

then on 1e82884, the tip of the PR branch

git checkout findepi/fix-string-view-like-checks-with-null-values-1121b8
caffeinate -ds cargo bench --bench comparison_kernels -- --baseline main like

This command runs bechmark and produces comparison. Let me omit low changes (below 9%) for brevity, as well as outliers reporting.

significant performance regressions:

like_utf8view scalar ends with 4 bytes
                        time:   [36.885 ms 36.925 ms 36.972 ms]
                        change: [+77.568% +77.857% +78.161%] (p = 0.00 < 0.05)
                        Performance has regressed.

like_utf8view scalar ends with 6 bytes
                        time:   [36.761 ms 36.818 ms 36.881 ms]
                        change: [+77.369% +77.776% +78.195%] (p = 0.00 < 0.05)
                        Performance has regressed.

like_utf8view scalar ends with 13 bytes
                        time:   [36.537 ms 36.613 ms 36.694 ms]
                        change: [+78.308% +79.019% +79.752%] (p = 0.00 < 0.05)
                        Performance has regressed.

like_utf8view scalar starts with 4 bytes
                        time:   [25.490 ms 25.542 ms 25.596 ms]
                        change: [+112.76% +113.25% +113.76%] (p = 0.00 < 0.05)
                        Performance has regressed.

like_utf8view scalar starts with 6 bytes
                        time:   [36.556 ms 36.598 ms 36.645 ms]
                        change: [+84.943% +85.578% +86.205%] (p = 0.00 < 0.05)
                        Performance has regressed.

like_utf8view scalar starts with 13 bytes
                        time:   [37.065 ms 37.105 ms 37.152 ms]
                        change: [+86.735% +87.424% +88.116%] (p = 0.00 < 0.05)
                        Performance has regressed.

significant performance improvements:

ilike_utf8 scalar equals
                        time:   [475.56 µs 476.17 µs 476.93 µs]
                        change: [-47.498% -47.406% -47.310%] (p = 0.00 < 0.05)
                        Performance has improved.

ilike_utf8 scalar contains
                        time:   [2.4461 ms 2.4500 ms 2.4552 ms]
                        change: [-15.221% -15.008% -14.786%] (p = 0.00 < 0.05)
                        Performance has improved.

ilike_utf8 scalar ends with
                        time:   [546.82 µs 547.76 µs 548.86 µs]
                        change: [-43.736% -43.605% -43.466%] (p = 0.00 < 0.05)
                        Performance has improved.

ilike_utf8 scalar starts with
                        time:   [560.40 µs 561.46 µs 562.67 µs]
                        change: [-43.349% -42.997% -42.641%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking ilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, enable flat sampling, or reduce sample count to 60.
ilike_utf8 scalar complex
                        time:   [1.0629 ms 1.0637 ms 1.0649 ms]
                        change: [-29.556% -29.299% -29.062%] (p = 0.00 < 0.05)
                        Performance has improved.

nilike_utf8 scalar equals
                        time:   [475.34 µs 476.23 µs 477.67 µs]
                        change: [-46.418% -46.259% -46.081%] (p = 0.00 < 0.05)
                        Performance has improved.

nilike_utf8 scalar contains
                        time:   [2.4427 ms 2.4449 ms 2.4477 ms]
                        change: [-15.899% -15.421% -15.120%] (p = 0.00 < 0.05)
                        Performance has improved.

nilike_utf8 scalar ends with
                        time:   [548.31 µs 548.96 µs 549.63 µs]
                        change: [-42.924% -42.742% -42.554%] (p = 0.00 < 0.05)
                        Performance has improved.

nilike_utf8 scalar starts with
                        time:   [557.88 µs 558.54 µs 559.35 µs]
                        change: [-42.683% -42.392% -42.144%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking nilike_utf8 scalar complex: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.4s, enable flat sampling, or reduce sample count to 60.
nilike_utf8 scalar complex
                        time:   [1.0648 ms 1.0669 ms 1.0696 ms]
                        change: [-28.834% -28.600% -28.353%] (p = 0.00 < 0.05)
                        Performance has improved.

TL;DR
i must be doing something wrong here. the benchmarks report improvement for code path i didn't change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the results are apparently repeatable on my laptop

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, easy to fix, actually

Copy link
Member Author

@findepi findepi Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed performance-oriented changes
i still observe a regression for one case

like_utf8view scalar starts with 4 bytes
                        time:   [12.899 ms 12.939 ms 12.986 ms]
                        change: [+7.7003% +8.0298% +8.4320%] (p = 0.00 < 0.05)
                        Performance has regressed.

and still see improvements for many ilike cases e.g.

nilike_utf8 scalar ends with
                        time:   [547.64 µs 548.55 µs 549.63 µs]
                        change: [-42.656% -42.257% -41.801%] (p = 0.00 < 0.05)
                        Performance has improved.

i don't think i know how to move it further from here. @alamb @tustvold ptal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking it out

@findepi findepi force-pushed the findepi/fix-string-view-like-checks-with-null-values-1121b8 branch from d92d94a to 1e82884 Compare November 1, 2024 20:42
@findepi findepi requested a review from tustvold November 1, 2024 21:24
@findepi findepi requested a review from alamb November 3, 2024 20:37
@findepi
Copy link
Member Author

findepi commented Nov 8, 2024

cc @Dandandan @crepererum

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @findepi -- this looks good to me. I am also rerunning the benchmarks to be sure

@@ -146,6 +138,7 @@ impl<'a> Predicate<'a> {
}
Predicate::IStartsWithAscii(v) => {
if let Some(string_view_array) = array.as_any().downcast_ref::<StringViewArray>() {
// TODO respect null buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are good todos -- I will file a ticket to handle them

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i plan to take care of this -- for apache/datafusion#12637

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i created #6705 for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@alamb
Copy link
Contributor

alamb commented Nov 8, 2024

Here are my benchmark results. My conclusion is that there is some non trivial variability in the benchmarks but I don't think this PR does anything substantial

group                                              findepi_fix-string-view-like-checks-with-null-values-1121b8    master
-----                                              -----------------------------------------------------------    ------
ilike_utf8 scalar complex                          1.03      2.8±0.07ms        ? ?/sec                            1.00      2.7±0.06ms        ? ?/sec
ilike_utf8 scalar contains                         1.04      4.3±0.06ms        ? ?/sec                            1.00      4.1±0.05ms        ? ?/sec
ilike_utf8 scalar ends with                        1.00  1255.7±49.78µs        ? ?/sec                            1.00  1250.3±33.44µs        ? ?/sec
ilike_utf8 scalar equals                           1.00   729.4±23.72µs        ? ?/sec                            1.07   781.1±24.73µs        ? ?/sec
ilike_utf8 scalar starts with                      1.01  1151.3±46.67µs        ? ?/sec                            1.00  1144.6±35.48µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])    1.00     77.9±0.30µs        ? ?/sec                            1.00     77.6±0.07µs        ? ?/sec
like_utf8 scalar complex                           1.02  1952.8±24.11µs        ? ?/sec                            1.00  1909.3±20.85µs        ? ?/sec
like_utf8 scalar contains                          1.10  1728.7±21.70µs        ? ?/sec                            1.00  1565.7±13.97µs        ? ?/sec
like_utf8 scalar ends with                         1.00    421.4±9.37µs        ? ?/sec                            1.01    424.1±7.08µs        ? ?/sec
like_utf8 scalar equals                            1.00    109.7±0.29µs        ? ?/sec                            1.16    127.1±0.33µs        ? ?/sec
like_utf8 scalar starts with                       1.00   345.5±12.23µs        ? ?/sec                            1.00   346.2±11.37µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])     1.00     77.6±0.22µs        ? ?/sec                            1.00     77.5±0.10µs        ? ?/sec
like_utf8view scalar complex                       1.06    195.4±0.68ms        ? ?/sec                            1.00    183.9±0.71ms        ? ?/sec
like_utf8view scalar contains                      1.07    148.5±0.33ms        ? ?/sec                            1.00    138.8±0.20ms        ? ?/sec
like_utf8view scalar ends with 13 bytes            1.00     43.0±0.30ms        ? ?/sec                            1.11     47.7±0.17ms        ? ?/sec
like_utf8view scalar ends with 4 bytes             1.00     43.7±0.30ms        ? ?/sec                            1.11     48.7±0.12ms        ? ?/sec
like_utf8view scalar ends with 6 bytes             1.00     43.5±0.29ms        ? ?/sec                            1.11     48.5±0.16ms        ? ?/sec
like_utf8view scalar equals                        1.00     33.6±0.20ms        ? ?/sec                            1.07     35.8±0.07ms        ? ?/sec
like_utf8view scalar starts with 13 bytes          1.00     46.6±0.28ms        ? ?/sec                            1.01     47.2±0.15ms        ? ?/sec
like_utf8view scalar starts with 4 bytes           1.00     33.2±0.11ms        ? ?/sec                            1.03     34.3±0.18ms        ? ?/sec
like_utf8view scalar starts with 6 bytes           1.00     47.1±0.24ms        ? ?/sec                            1.02     47.8±0.21ms        ? ?/sec
nilike_utf8 scalar complex                         1.06      2.9±0.07ms        ? ?/sec                            1.00      2.7±0.06ms        ? ?/sec
nilike_utf8 scalar contains                        1.04      4.3±0.07ms        ? ?/sec                            1.00      4.1±0.05ms        ? ?/sec
nilike_utf8 scalar ends with                       1.00  1228.8±29.01µs        ? ?/sec                            1.00  1229.4±21.51µs        ? ?/sec
nilike_utf8 scalar equals                          1.00   735.4±24.25µs        ? ?/sec                            1.05   773.3±12.76µs        ? ?/sec
nilike_utf8 scalar starts with                     1.05  1174.5±40.82µs        ? ?/sec                            1.00  1121.1±17.17µs        ? ?/sec
nlike_utf8 scalar complex                          1.02  1950.7±31.49µs        ? ?/sec                            1.00  1907.6±24.54µs        ? ?/sec
nlike_utf8 scalar contains                         1.10  1728.3±15.41µs        ? ?/sec                            1.00  1568.6±15.88µs        ? ?/sec
nlike_utf8 scalar ends with                        1.00    420.2±8.39µs        ? ?/sec                            1.00    421.9±7.39µs        ? ?/sec
nlike_utf8 scalar equals                           1.00    109.6±0.28µs        ? ?/sec                            1.16    126.9±0.13µs        ? ?/sec
nlike_utf8 scalar starts with                      1.00    344.2±6.52µs        ? ?/sec                            1.03   355.1±17.11µs        ? ?/sec

@alamb alamb merged commit 12ca3f0 into apache:master Nov 8, 2024
18 checks passed
@findepi findepi deleted the findepi/fix-string-view-like-checks-with-null-values-1121b8 branch November 8, 2024 17:31
findepi added a commit to sdf-labs/arrow-rs that referenced this pull request Nov 14, 2024
* Fix string view LIKE checks with NULL values

* Add TODO comments to operations yet to be fixed

* Use from_unary for contains with string view

from_unary takes care of nulls

* Fix performance

* fixup! Fix performance
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants