improve worst-case performance of BTreeSet intersection v1 #58577

ssomers · 2019-02-19T15:25:14Z

Major performance boost when comparing tiny and huge sets. Probably with controversial changes and I sure have questions:

Names and places of functions and types
How many comments to write where
Why does rustc tell me to ref mut and ref matches on the iterator, while the book says ref is old school: fixed in code, should take it up to compiler people.
(Why) do I have to write out the clone like that (#[derive(Clone)] doesn't work): see improve worst-case performance of BTreeSet intersection v3 #59186 (comment)_
Am I allowed to use #[derive(Debug)] there at all? Yes, but the actual Debug.fmt implementation should supersede it.
I'd like to test function are_proportionate_for_intersection in test_intersection (or another test case next to it) itself, but I think the private function is inaccessible there. liballoc has other test cases not in the tests directory.

PS I don't list these questions to start a discussion here, just to inspire reviewers and to remember myself.

rust-highfive · 2019-02-19T15:28:02Z

r? @KodrAus

(rust_highfive has picked a reviewer for you, use r? to override)

src/liballoc/collections/btree/set.rs

…section.next

KodrAus · 2019-02-21T02:09:37Z

Thanks @ssomers!

Would you be able to put together some benchmarks so we could get an idea of what the impact is for intersecting variously sized sets?

ssomers · 2019-02-21T09:12:18Z

put together some benchmarks so we could get an idea of what the impact is for intersecting variously sized sets?

I'm not sure how to contrast the various implementations in the same build. But I could first PR a benchmark in src\liballoc\benches\btree using the existing implementation, then merge this here and anyone looking closely enough would see times improve.

…ction

ssomers · 2019-02-21T22:40:00Z

Rough numbers on my system:

bset::tree ttest case	ns/iter before this PR	ns/iter after this PR
intersect_neg_vs_pos_100	500	=
intersect_neg_vs_pos_10_vs_10k	60	150
intersect_neg_vs_pos_10k	50,000	=
intersect_neg_vs_pos_10k_vs_10	50,000	400
intersect_pos_vs_neg_100	400	=
intersect_pos_vs_neg_10_vs_10k	40,000	150
intersect_pos_vs_neg_10k	40,000	=
intersect_pos_vs_neg_10k_vs_10	60	400
intersect_random_100	700	=
intersect_random_10_vs_10k	40,000	300
intersect_random_10k	200,000	=
intersect_random_10k_vs_10	50,000	300
intersect_staggered_100	700	=
intersect_staggered_10_vs_10k	80	150
intersect_staggered_10k	70,000	=
intersect_staggered_10k_vs_10	80	150

As you see, intersection of very asymmetric set (10 versus 10,000 elements) is mostly up, except if the set containing only numbers lower than any in the other set (neg = negative ints vs. pos=positive ints) is also a small set (i.e. the algorithm merely iterates over that small set and finishes).

The latter could be improved upon by limiting both sets to the range started with the smallest element of the other set. The whole algorithm could work like that: instead of iterating, find the next element >= the other set's lowest unseen value. But then sometimes/often the binary search towards the next element would be worse than iterating, in particular for the staggered test case.

KodrAus · 2019-02-21T23:12:09Z

Thanks for putting the numbers together @ssomers. That's much appreciated.

It looks like we aren't regressing in what I'd expect to be the common cases and smoothing out those worst cases is a really nice improvement.

@bors r+

bors · 2019-02-21T23:12:10Z

📌 Commit 4fd0cc1 has been approved by KodrAus

bors · 2019-02-21T23:12:11Z

🌲 The tree is currently closed for pull requests below priority 99, this pull request will be tested once the tree is reopened

…drAus improve worst-case performance of BTreeSet intersection Major performance boost when comparing tiny and huge sets. Probably with controversial changes and I sure have questions: - Names and places of functions and types - How many comments to write where - Why does rustc tell me to `ref mut` and `ref` matches on the iterator, while the book says ref is old school. - (Why) do I have to write out the clone like that (`#[derive(Clone)]` doesn't work) - Am I allowed to use `#[derive(Debug)]` there at all? - I'd like to test function `are_proportionate_for_intersection` in test_intersection (or another test case next to it) itself, but I think the private function is inaccessible there. liballoc has other test cases not in the tests directory. PS I don't list these questions to start a discussion here, just to inspire reviewers and to remember myself.

…marks, r=KodrAus introduce benchmarks of BTreeSet.intersection 16 tests combining 4 kinds of contents with different sizes exposing edge cases. The ones with asymmetric sizes are addressed by rust-lang#58577. The pos_vs_neg cases seems (are were meant to be) the same as the neg_vs_pos case (same thing, reverse order) but reality shows a surprsing 25% difference.

…inimum

ssomers · 2019-02-24T00:08:39Z

With the last commit added on top, rough numbers on my system are almost all down:

bset::tree ttest case	ns/iter before this PR	ns/iter after this PR
intersect_neg_vs_pos_100	500	60
intersect_neg_vs_pos_10_vs_10k	60	50
intersect_neg_vs_pos_10k	50,000	80
intersect_neg_vs_pos_10k_vs_10	50,000	400
intersect_pos_vs_neg_100	400	60
intersect_pos_vs_neg_10_vs_10k	40,000	50
intersect_pos_vs_neg_10k	40,000	90
intersect_pos_vs_neg_10k_vs_10	60	400
intersect_random_100	700	600
intersect_random_10_vs_10k	40,000	300
intersect_random_10k	200,000	170,000
intersect_random_10k_vs_10	50,000	300
intersect_staggered_100	700	=
intersect_staggered_10_vs_10k	80	170
intersect_staggered_10k	70,000	=
intersect_staggered_10k_vs_10	80	180

As to taking this further with "instead of iterating, find the next element >= the other set's lowest unseen value", no cigar: it slows down intersect_random_10k 2.5 times, and (absolute worst case) intersect_staggered_10k 14x.

…strategy

ssomers · 2019-02-25T23:24:49Z

More tweaks:

expose both intersection strategies as an unstable feature
benchmark both at the same time, along with the actual strategy chosen
seriously lower the threshold for the new strategy while still erring on the conservative side
quantify are_proportionate_for_intersection with static assertions instead of unit test

No more ideas at the moment so that's going to be the last batch, unless there are remarks.

rust-highfive · 2019-02-25T23:39:07Z

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

travis_time:end:063bf03b:start=1551137061283394269,finish=1551137273491208861,duration=212207814592
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---

[00:03:32] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:99: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:100: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:102: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:103: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:105: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:106: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:108: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:109: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:111: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:112: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:114: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/benches/btree/set.rs:115: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/collections/btree/set.rs:373: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/collections/btree/set.rs:374: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/collections/btree/set.rs:389: line longer than 100 chars
[00:03:33] tidy error: /checkout/src/liballoc/collections/btree/set.rs:390: line longer than 100 chars
[00:03:34] some tidy checks failed
[00:03:34] 
[00:03:34] 
[00:03:34] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:03:34] 
[00:03:34] 
[00:03:34] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:34] Build completed unsuccessfully in 0:00:46
[00:03:34] Build completed unsuccessfully in 0:00:46
[00:03:34] make: *** [tidy] Error 1
[00:03:34] Makefile:68: recipe for target 'tidy' failed
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:1fa8d203
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Mon Feb 25 23:31:37 UTC 2019
---
travis_time:end:30e2118a:start=1551137498538054658,finish=1551137498542746470,duration=4691812
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:0918cf09
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:164e2594
travis_time:start:164e2594
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:0ed429a8
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

rust-highfive · 2019-02-26T11:48:31Z

Your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.

travis_time:end:0082f334:start=1551175199493104002,finish=1551175272346776090,duration=72853672088
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=mingw-check
---
Attempting to download s3://rust-lang-ci-sccache2/docker/b5493eee5f15270054538ad6f259fd35cdcde8a43e617233529277e394d7185f136e6026d0c390061a3c94b487e75625efa873039c2128199e11805e10d47851
[00:00:15] Attempting with retry: curl -y 30 -Y 10 --connect-timeout 30 -f -L -C - -o /tmp/rustci_docker_cache https://s3-us-west-1.amazonaws.com/rust-lang-ci-sccache2/docker/b5493eee5f15270054538ad6f259fd35cdcde8a43e617233529277e394d7185f136e6026d0c390061a3c94b487e75625efa873039c2128199e11805e10d47851
[00:00:15]   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
[00:00:15]                                  Dload  Upload   Total   Spent    Left  Speed
No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

…nction because its outcome is reported by the benchmarks

…tion

…red output; add a few more

…intersection

improve worst-case performance of BTreeSet intersection Alternative to [pull request #58577](#58577): back out of attempts to optimize using ranges, more elegant code (I think). The stable public type Intersection changes from struct to enum. If that matters, then perhaps changing the fields like in the other proposal also mattered.

improve worst-case performance of BTreeSet intersection

dae149c

rust-highfive assigned KodrAus Feb 19, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 19, 2019

Centril reviewed Feb 19, 2019

View reviewed changes

src/liballoc/collections/btree/set.rs Outdated Show resolved Hide resolved

remove the #[stable] thingy

fab1fa9

Centril reviewed Feb 19, 2019

View reviewed changes

src/liballoc/collections/btree/set.rs Outdated Show resolved Hide resolved

remove another #[stable] thingy and superfluous comma's

f536f7a

varkor reviewed Feb 19, 2019

View reviewed changes

src/liballoc/collections/btree/set.rs Outdated Show resolved Hide resolved

renamed enum variants, moved struct declaration down, condensed Inter…

0186d90

…section.next

ssomers mentioned this pull request Feb 21, 2019

introduce benchmarks of BTreeSet.intersection #58620

Merged

Merge branch 'btreeset_intersection_benchmarks' into btreeset_interse…

4fd0cc1

…ction

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 21, 2019

Centril mentioned this pull request Feb 22, 2019

Rollup of 18 pull requests #58637

Closed

get rid of the refs suggested by rustc

6553b0d

speed up BTreeSet intersection: skip elements below the other set's m…

077ecd4

…inimum

BTreeSet intersection: lower and benchmark the threshold for the new …

f35c950

…strategy

tidy up long lines

e68dc44

ssomers added 6 commits February 26, 2019 15:22

BTreeSet intersection: simple optimization for singleton; dissolve fu…

5559d11

…nction because its outcome is reported by the benchmarks

Merge remote-tracking branch 'upstream/master' into btreeset_intersec…

d07cb7f

…tion

BTreeSet intersection: adapt to recent changes merged in

a0d8f24

BTreeSet benchmarks: rename and reorder them to match with ascii-orde…

0d12ecf

…red output; add a few more

BTreeSet intersection: simplify range calculation; swap order again

8b57fa8

Merge remote-tracking branch 'remotes/upstream/master' into btreeset_…

d825e02

…intersection

ssomers changed the title ~~improve worst-case performance of BTreeSet intersection~~ improve worst-case performance of BTreeSet intersection more Mar 9, 2019

fix benchmark delivering fake news

3e90b15

ssomers mentioned this pull request Mar 10, 2019

improve worst-case performance of BTreeSet intersection v2 #59078

Closed

ssomers changed the title ~~improve worst-case performance of BTreeSet intersection more~~ improve worst-case performance of BTreeSet intersection v1 Mar 14, 2019

KodrAus mentioned this pull request Mar 19, 2019

improve worst-case performance of BTreeSet intersection v3 #59186

Merged

Merge branch 'master' into btreeset_intersection

9099047

ssomers closed this Mar 21, 2019

ssomers deleted the btreeset_intersection branch April 6, 2019 08:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

improve worst-case performance of BTreeSet intersection v1 #58577

improve worst-case performance of BTreeSet intersection v1 #58577

Uh oh!

ssomers commented Feb 19, 2019 •

edited

Loading

Uh oh!

rust-highfive commented Feb 19, 2019

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KodrAus commented Feb 21, 2019

Uh oh!

ssomers commented Feb 21, 2019

Uh oh!

ssomers commented Feb 21, 2019

Uh oh!

KodrAus commented Feb 21, 2019

Uh oh!

bors commented Feb 21, 2019

Uh oh!

bors commented Feb 21, 2019

Uh oh!

ssomers commented Feb 24, 2019 •

edited

Loading

Uh oh!

ssomers commented Feb 25, 2019

Uh oh!

rust-highfive commented Feb 25, 2019

Uh oh!

rust-highfive commented Feb 26, 2019

Uh oh!

Uh oh!

improve worst-case performance of BTreeSet intersection v1 #58577

improve worst-case performance of BTreeSet intersection v1 #58577

Uh oh!

Conversation

ssomers commented Feb 19, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rust-highfive commented Feb 19, 2019

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KodrAus commented Feb 21, 2019

Uh oh!

ssomers commented Feb 21, 2019

Uh oh!

ssomers commented Feb 21, 2019

Uh oh!

KodrAus commented Feb 21, 2019

Uh oh!

bors commented Feb 21, 2019

Uh oh!

bors commented Feb 21, 2019

Uh oh!

ssomers commented Feb 24, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ssomers commented Feb 25, 2019

Uh oh!

rust-highfive commented Feb 25, 2019

Uh oh!

rust-highfive commented Feb 26, 2019

Uh oh!

Uh oh!

ssomers commented Feb 19, 2019 •

edited

Loading

ssomers commented Feb 24, 2019 •

edited

Loading