Split from #31908 (comment); full write-up at https://jazco.dev/2024/01/10/golang-and-epoll/.

tl;dr: a program on a 192-core machine with >2500 sockets, where >1k become ready at once, spends a huge amount of time in netpoll -> epoll_wait (~65% of total CPU).
Most interesting is that sharding these connections across 8 processes seems to solve the problem, implying the cost scales super-linearly with the number of sockets per epoll instance.
Given that the profile shows the time spent in epoll_wait itself, this may be a scalability problem in the kernel, but we may still be able to mitigate it in the runtime.
@ericvolp12, some questions if you don't mind answering:
- Which version of Go are you using? And which kernel version?
- Do you happen to have a reproducer for this problem that you could share? (Sounds like no?)
- On a similar note, do you have a perf profile of this problem that shows where the time in the kernel is spent?
- The 128 event buffer size is mentioned several times, but it is not obvious to me that increasing this size would actually solve the problem. Did you try increasing the size and see improved results?
cc @golang/runtime