Mute CQEs of send/write to reduce wakeups #1264
Comments
Yep, this is not a bad idea; we've bounced around ideas for this very thing in the past as well. Send is a good example: sends generally complete inline (i.e. immediately), but that's not guaranteed. And while you don't need an immediate notification for them, you generally do want to see one so that you know the buffer holding the sent data can be reused. Hence I think what we'd need is something like a low priority completion, in the sense that it doesn't need to wake up the waiting task, but it should be included in the "I'm waiting for this number of events" accounting. A quick work-around with the existing code may be to just discount the write/send in the
Tossed out a suggestion for handling something like this.
What if the CQ is overflowing with now-ignored CQEs and no wakeup-worthy CQE has arrived?
There are several conditions that would still cause a wakeup, like a short send/write (or an error), and overflow would be another one. The suggestion didn't cover the overflow case, but that will be handled too. Anything but a fully successful send with a normal CQE posting would wake things up, naturally.
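The wake conditions listed above can be sketched as a small predicate. This is only an illustration of the rule as described in this thread, not kernel or liburing code: `struct sketch_cqe`, its fields, and `sketch_cqe_should_wake()` are hypothetical names made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a completion. This is NOT struct io_uring_cqe;
 * the fields only carry what the rule discussed above needs. */
struct sketch_cqe {
    int32_t  res;        /* bytes transferred, or a negative errno */
    uint32_t requested;  /* bytes the send/write asked for */
    bool     muted;      /* request opted in to "muted success" (assumed flag) */
    bool     overflowed; /* CQE could not be posted to the CQ ring normally */
};

/* Returns true when the completion should wake a waiting task.
 * Only a fully successful, normally posted, muted send/write stays quiet. */
static bool sketch_cqe_should_wake(const struct sketch_cqe *c)
{
    if (!c->muted)
        return true;                     /* ordinary completion */
    if (c->overflowed)
        return true;                     /* CQ overflow must be noticed */
    if (c->res < 0)
        return true;                     /* error */
    if ((uint32_t)c->res < c->requested)
        return true;                     /* short send/write */
    return false;                        /* full success: no wakeup */
}
```

With this rule, a muted 4096-byte send that returns 4096 stays quiet, while the same send returning 1024, a negative errno, or landing in the overflow path wakes the waiter.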
I suppose you want to put a backlog limit on ignorable events, but that would add a new parameter to all existing wait_cqe variants, which might be a little confusing.
I am afraid that handling inline completions alone is not enough, because the number of inline completions is the more predictable part. Async successes and zc (zero-copy) notifications, on the other hand, are largely out of our control, especially when inflight CQEs vastly outnumber potential read/recv CQEs. Therefore, even if inline successes can be ignored, the CQ ring may still be flooded by inflight CQEs from previous rounds. However,
wait_timeout(nr) is generally a good way to reduce wakeups from the kernel, but CQEs of send/write can bring unnecessary "noise", especially with plenty of zero-copy sends. In essence, it is difficult to estimate when a send/write will complete, yet their CQEs are generally not latency sensitive. So I think a possible solution is to set a flag such as MUTE_SUCCESS in the SQE, so that its CQE will not be counted as wakeable.
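To make the proposed accounting concrete, here is a minimal sketch of how a waiter asking for `wait_nr` events might behave if muted successes were posted but not counted as wakeable. MUTE_SUCCESS is only the flag name proposed in this issue, and `struct sketch_event` and `sketch_wake_point()` are hypothetical; none of this is existing io_uring or liburing API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one posted completion. muted_success is true for
 * a fully successful send/write whose SQE carried the proposed flag. */
struct sketch_event {
    bool muted_success;
};

/* Walks completions in posting order and returns how many have been posted
 * (1-based) at the moment a waiter asking for wait_nr wakeable events would
 * be woken. Returns 0 if the waiter is never woken by these events.
 * wait_nr is assumed to be at least 1. */
static size_t sketch_wake_point(const struct sketch_event *ev, size_t n,
                                size_t wait_nr)
{
    size_t wakeable = 0;

    if (wait_nr == 0)
        return 0; /* outside the assumption above; nothing to wait for */

    for (size_t i = 0; i < n; i++) {
        if (!ev[i].muted_success)
            wakeable++;           /* only non-muted CQEs count */
        if (wakeable >= wait_nr)
            return i + 1;
    }
    return 0;
}
```

For a sequence of muted, muted, normal, muted, normal completions, a waiter asking for 2 events would be woken only once the fifth completion is posted, at which point it can reap all five CQEs in one pass.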