Reintroduce `into_future` in `.await` desugaring #90737
Conversation
(rust-highfive has picked a reviewer for you, use r? to override)
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 60f29450eeca029679562a3a0914975e5d8303a4 with merge 63f6208b5cec7496fd972de8245a4ff9d4da9e8b...
☀️ Try build successful - checks-actions
Queued 63f6208b5cec7496fd972de8245a4ff9d4da9e8b with parent d608229, future comparison URL.
Finished benchmarking commit (63f6208b5cec7496fd972de8245a4ff9d4da9e8b): comparison url. Summary: This change led to very large relevant regressions 😿 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Neat; we're doing a lot better than last time, where a 700% performance regression basically made async/await unusable. Some amount of performance regression should probably be expected this time around, since we're actually doing more work. I went in and verified that the output assembly is the same for both examples: even though we're doing more work, we don't seem to be missing out on optimizing any of the generated code.

Personally I feel like this regression is small enough that we should prioritize landing this part of the work. How do other folks feel about this performance regression? Are there potentially any low-hanging performance optimizations we could apply to speed this up? If there are none, would folks be okay if we proposed to merge this sooner and optimize later?
If I'm reading the detailed performance diff right, it looks like a lot of the extra time is being spent in

I'm not sure what the usual guidelines for accepting a regression are, but I'd be in favor of taking this one. Most of the benchmarks showed little change. The biggest regression was in

As far as ideas for reducing the regression, I think it'd be worth looking at the way generators get converted to futures. Maybe there's something that could be done that's symbiotic with this change and would lead to less work being done in total.
Thanks for the PR, @eholk & @yoshuawuyts! This looks good to me. It would be nice to reduce the regression, but it seems acceptable: the compiler simply has to do more work, so it is expected that compile times increase for the cases where this is exercised. I am not particularly familiar with lowering (especially wrt async), so it would be good to have another pair of eyes on this.
```rust
assert_eq!(7181, std::mem::size_of_val(&mixed_sizes()));
assert_eq!(3076, std::mem::size_of_val(&joined()));
assert_eq!(3076, std::mem::size_of_val(&joined_with_noop()));
assert_eq!(6157, std::mem::size_of_val(&mixed_sizes()));
```
This is unexpected! Do you know why these sizes change?
It looks like an improvement but I'd like to understand what's going on.
We weren't sure either! The previous PR also shrunk the size of the generated futures, without really commenting on it. It's particularly surprising because neither the numbers before nor after this change are aligned the way we expected them to be.
We reasoned that this change would be alright since it doesn't regress, there is precedent for accepting this in #65244, and it doesn't lead to any other changes as far as we can tell, even if we don't fully understand why the current results are what they are or what is causing the change.
I don't know if this is right, but one guess is it has something to do with `match <future> { ... }` becoming `match IntoFuture::into_future(<future>) { ... }`. Maybe the `into_future` call moves the future into the scrutinee position of the match, rather than borrowing from the outer scope?
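For context, here is a runnable sketch of what the reintroduced desugaring enables: `.await` first passes the awaited expression through `IntoFuture::into_future`, so a non-future type can still be awaited. The `Deferred` type and the `poll_once` helper are invented for illustration (they are not from this PR); `IntoFuture` itself is in `std::future`.

```rust
use std::future::{ready, Future, IntoFuture, Ready};
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A made-up type that is not itself a Future, but converts into one.
struct Deferred(i32);

impl IntoFuture for Deferred {
    type Output = i32;
    type IntoFuture = Ready<i32>;
    fn into_future(self) -> Self::IntoFuture {
        ready(self.0)
    }
}

async fn demo() -> i32 {
    // Desugars to roughly `match IntoFuture::into_future(Deferred(41)) { ... }`.
    Deferred(41).await + 1
}

// Minimal single-poll "executor", enough for immediately-ready futures.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    fn noop_raw_waker() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker {
            noop_raw_waker()
        }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(noop_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn main() {
    assert_eq!(poll_once(demo()), Some(42));
    println!("ok");
}
```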
Judging by the comment on this test, it seems like the first two changes aren't a big deal, since they are just a few bytes. The last change is surprising because it improves by exactly 1024, which means we were able to eliminate one of the `BigFut`s from the closure. By my count we should only need 5, so I'm not sure why we have 6 copies now or why we had 7 before. I wouldn't be surprised to see 8 or 5, I guess, but 6 and 7 are both weird. Could we be smarter about laying out the closures? Maybe there's a move from one slot to another that's forced by the way we laid out the closures?
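As a small illustration of what these assertions measure: `std::mem::size_of_val` on the value returned by an async fn reports the size of the compiler-generated state machine, which must hold every value live across an await point. The functions below are hypothetical stand-ins (not the PR's test futures), and exact sizes are compiler-version dependent, so only a lower bound is checked.

```rust
use std::mem::size_of_val;

// A future whose output is a large payload.
async fn big() -> [u8; 1024] {
    [0u8; 1024]
}

async fn holds_one() {
    let buf = big().await;        // 1024 bytes live across the next await...
    std::future::ready(()).await; // ...so they must be stored in the generator
    drop(buf);
}

fn main() {
    let fut = holds_one();
    // The generated future must at least fit the 1024-byte buffer;
    // the exact total depends on the compiler's generator layout.
    assert!(size_of_val(&fut) >= 1024);
    println!("future size: {} bytes", size_of_val(&fut));
}
```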
Interesting theory. If we're turning a borrow into a move, that can definitely improve things! See #62321 for instance: if I recall correctly, borrowing a local in a generator causes the analysis to assume the local is always live from that point on, preventing it from reclaiming the storage.
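A sketch of that liveness effect (function names invented for illustration; exact sizes are compiler-dependent, so only lower bounds are asserted): moving a large local into the awaited future lets its original slot be reclaimed in principle, while borrowing it may keep it live in the generator for the rest of the body, per the behavior discussed in #62321.

```rust
use std::mem::size_of_val;

async fn consume(_buf: [u8; 1024]) {
    std::future::ready(()).await;
}

async fn inspect(_buf: &[u8; 1024]) {
    std::future::ready(()).await;
}

// The local is moved into the awaited future; once `consume` finishes,
// nothing refers to it, so its storage can in principle be reused.
async fn by_move() {
    let buf = [0u8; 1024];
    consume(buf).await;
    std::future::ready(()).await;
}

// The local is only borrowed; the layout analysis may treat a borrowed
// local as live until the end of the generator.
async fn by_borrow() {
    let buf = [0u8; 1024];
    inspect(&buf).await;
    std::future::ready(()).await;
}

fn main() {
    // Both must at least fit the 1024-byte payload at some point; the
    // interesting question is whether `by_borrow` ends up larger.
    assert!(size_of_val(&by_move()) >= 1024);
    assert!(size_of_val(&by_borrow()) >= 1024);
    println!(
        "by_move: {} bytes, by_borrow: {} bytes",
        size_of_val(&by_move()),
        size_of_val(&by_borrow())
    );
}
```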
☔ The latest upstream changes (presumably #89580) made this pull request unmergeable. Please resolve the merge conflicts.
This is a reintroduction of the remaining parts from rust-lang#65244 that have not been relanded yet. Issues rust-lang#67644, rust-lang#67982
Looks good!
You can r=me after comments, also happy to take another look.
Add a note about `IntoFuture` in error messages where T is not a future. Change await-into-future.rs to be a run-pass test.
Thanks for the suggestions. I believe they are all addressed, so feel free to take a look. I don't know if I have permission to do r=me, but I'll try it out to see. @bors r=tmandry
@bors r=tmandry
@eholk: 🔑 Insufficient privileges: Not in reviewers
@bors r+
📌 Commit 3268bf1 has been approved by
☀️ Test successful - checks-actions
Finished benchmarking commit (532d2b1): comparison url. Summary: This change led to very large relevant regressions 😿 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
@rustbot label: +perf-regression-triaged We saw similar regressions during the earlier try job and decided they were acceptable. See this earlier comment for some details. The summary is that the regressions are only in
This is a reintroduction of the remaining parts from #65244 that have not been relanded yet.
This isn't quite ready to merge yet. The last attempt was reverted due to performance regressions, so we need to make sure this does not reintroduce those issues.
Issues #67644, #67982
/cc @yoshuawuyts