-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zebra: Batch netlink messages #2831
Conversation
Batch messages to netlink when possible by saving them in a 16k buffer until the buffer is full or a timer expires or we change some datastructures associated with the messages, at which point flush the buffer and proceed as normal. Also: * Clean up netlink_recvbuf() to not take a totally useless parameter * Increase netlink txbuf size to match kernel * Increase netlink rxbuf size to be 16mb for testing purposes Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
💚 Basic BGPD CI results: SUCCESS, 0 tests failedResults table
For details, please contact louberger |
Continuous Integration Result: FAILEDSee below for issues. This is a comment from an EXPERIMENTAL automated CI system. Get source and apply patch from patchwork: SuccessfulBuilding Stage: SuccessfulBasic Tests: FailedCentOS 7 rpm pkg check: Successful Topotest tests on Ubuntu 16.04 i386: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOI386-4820/test Topology Tests failed for Topotest tests on Ubuntu 16.04 i386:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4820/artifact/TOPOI386/ErrorLog/log_topotests.txt Topology tests on Ubuntu 16.04 amd64: FailedTopology Test Results are at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-TOPOU1604-4820/test Topology Tests failed for Topology tests on Ubuntu 16.04 amd64:
see full log at https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4820/artifact/TOPOU1604/ErrorLog/log_topotests.txt Topology Tests memory analysis: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-4820/artifact/TOPOI386/MemoryLeaks/
|
Hmm. "Stupid" question: does this actually help with anything? The kernel processes netlink requests synchronously anyways; it's not like you can go do further processing while the kernel updates the routes... from what I remember it's the fastest to immediately read back the result too. (or has this changed in the kernel?) |
It should increase the throughput of system calls (write). sendmmsg() shall be combined with it too. Please confirm the benefits like any increase in the number of message rate. |
I was told by a kernel developer to do this because he felt it would be faster giving some of the internal mechanics of how routes are processed in the kernel. But Vincent is correct we greatly reduce the task switching done. |
I tested a full bgp feed installation without this patch and with this patch on an ancient linux box. Before:
After:
Testing methodology: Start FRR, wait for convergence, then run a I am actually surprised by how much faster this was. I will do more investigation, but this shows some promise. |
Still sendmsg() is used: instead of sendmmsg() http://man7.org/linux/man-pages/man2/sendmmsg.2.html Since we get some batches now, sendmmsg() would add even more throughput. |
Feedback on call:
|
I need to do the work outlined by @eqvinox above as well as add 1 more bit of code to better handle install failures in a asynchronous manner. |
Closing PR because we need to take a slightly different approach and the originator will be working on that now |
Batch messages to netlink when possible by saving them in a 16k buffer
until the buffer is full or a timer expires or we change some
datastructures associated with the messages, at which point flush the
buffer and proceed as normal.
Also:
Signed-off-by: Stephen Worley sworley@cumulusnetworks.com