-
Notifications
You must be signed in to change notification settings - Fork 1.5k
bug: remove busy-wait while sort is ongoing #16322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
6334f47
to
f7405ee
Compare
A sort preserving merge specific test case started failing. I’ll dig deeper to better understand what’s going on. |
@berkaysynnada I hope it's ok that I ping you directly; reaching out because I believe you are the author of the test case in question. I believe this PR surfaced a mistake in the
I've fixed the test implementation which makes the test pass again, but would like to get your opinion on this. Maybe I'm missing something. |
ad50fbe
to
8503407
Compare
This seems really nice 🚀
|
@Dandandan project newbie question, my daily practice at work is to handle code review comments using amend/force-push. Did so out of habit before thinking to as ask. Is that ok in this project or does the community prefer fixup commits and squash before merging? |
IMO it doesn't matter too much (personally I like to create individual commits). We'll squash-merge the commits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🚀 maybe @alamb or @berkaysynnada can have another look
🤖 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @pepijnve and @Dandandan -- this makes sense to me too. I took a careful look at the code and I found it to be easy to follow and clear ❤️
I think we should wait a few days before mergning this to see if others would like to review. I think @jayzhan211 and @ozankabak may also be interested
// If the polling result is Poll::Ready(Some(batch)) or Poll::Ready(None), | ||
// we remove this partition from the queue so it is not polled again. | ||
self.uninitiated_partitions.pop_front(); | ||
// The polled stream is ready |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for these comments 👍
🤖: Benchmark completed Details
|
Since this changes the congestion behavior test, which I'm not deeply familiar with, let's hear from @berkaysynnada on this to make sure we are not losing anything (he will be back tomorrow, so it shouldn't be long). If he is OK with the change, 10% performance improvement would be great. |
Which issue does this PR close?
Rationale for this change
SortPreservingMergeStream
works in two phases. It first waits for each input stream to be ready to emit. Once everybody's ready it proceeds to an emit phase.During the waiting phase, it will poll each stream in a round-robin fashion. If any stream returns
Pending
the code self-wakes the current task and immediately returnsPending
. This results in busy-waiting when waiting for, for instance, aSortExec
that's sorting its data or any other pipeline breaker.While this works, it wastes CPU cycles.
What changes are included in this PR?
Rather than returning immediately when one stream is pending, poll each stream once. Only return pending when there are still streams left that have not started emitting. This assumes that the pending streams are well behaved and will wake the task when they need to be polled again as required by the
Stream
contract. Note that this may surface bugs in other streams.Rotation of
uninitiated_partitions
has been removed since that's no longer useful. There was a comment in the code about 'upstream buffer size increase', but I'm not sure what that was referring to.Are these changes tested?
Only by existing test and manual testing
Are there any user-facing changes?
No