-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Override QueryIter::fold to port Query::for_each perf gains to select Iterator combinators #6773
Conversation
Oh, man, this hadn't happened yet? It came up on Discord back in Feb 2021; I guess I should have opened an issue. As an FYI, there's a chance that implementing |
Finally was able to address the performance issues, so I'm taking this out of draft. Benchmark results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great safety comments. I think the unsafe is absolutely worth it to close this gap. Agreed on the deprecation.
@InBetweenNames, I'd appreciate your review on this :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work, it's good to see this get improved upon. I especially like how the folding behavior was split into two separate functions for fold_table
and fold_archetype
-- it feels much cleaner this way.
Co-authored-by: JoJoJet <21144246+JoJoJet@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on board for this change and the code looks good to me. Finally deprecating for_each!
Yeah this does make sense to me. The #6161 PR can be adapted to use the new style. |
Co-authored-by: Joseph <21144246+JoJoJet@users.noreply.github.com>
@james7132, unless I hear otherwise, I'm going to merge this tomorrow :) |
As a final sanity check, the codegen of this PR seems to show no tangible difference from what is in |
Changelog needs updating btw. Also, this is the first I'm hearing of it, but using for_each() instead for |
@james7132 sorry, you lost the coin flip on the merge conflicts :( Merge it yourself when you're done <3 |
@JMS55 Roughly, But it's easy to come up with examples where the Query iterator is non-trivial and the work to be done is simple, and thus using |
3db7137
to
541f4c5
Compare
Objective
After #6547,
Query::for_each
has been capable of automatic vectorization on certain queries, which is seeing a notable (>50% CPU time improvements) for iteration. However,Query::for_each
isn't idiomatic Rust, and lacks the flexibility of iterator combinators.Ideally,
Query::iter
and friends should be able to achieve the same results. However, this does seem to blocked upstream (rust-lang/rust#104914) by Rust's loop optimizations.Solution
This is an intermediate solution and refactor. This moves the
Query::for_each
implementation onto theIterator::fold
implementation forQueryIter
instead. This should result in the same automatic vectorization optimization on allIterator
functions that internally use fold, includingIterator::for_each
,Iterator::count
, etc.With this, it should close the gap between the two completely. Internally, this PR changes
Query::for_each
to usequery.iter().for_each(..)
instead of the duplicated implementation.Separately, the duplicate implementations of internal iteration (i.e.
Query::par_for_each
) now use portions of the currentQuery::for_each
implementation factored out into their own functions.This also massively cleans up our internal fragmentation of internal iteration options, deduplicating the iteration code used in
for_each
andpar_iter().for_each()
.Changelog
Changed:
Query::for_each
,Query::for_each_mut
,Query::for_each
, andQuery::for_each_mut
have been moved toQueryIter
'sIterator::for_each
implementation, and still retains their performance improvements over normal iteration. These APIs are deprecated in 0.13 and will be removed in 0.14.