Add workers, isBlocking and queue size to listen config #208
Conversation
Hi @mieczkowski,
I may refactor this PR a bit. I think adding configuration options like workers is good, but I don't want the program to make too many assumptions.
Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz, Debian Linux, 8 cores. Real flows from some production environments, 20-25k flows per second.
┌────────────datetime─┬─count()─┐
│ 2023-08-26 17:00:09 │ 21774 │
│ 2023-08-26 17:00:08 │ 24071 │
│ 2023-08-26 17:00:10 │ 20860 │
│ 2023-08-26 17:00:04 │ 21283 │
│ 2023-08-26 17:00:03 │ 19179 │
│ 2023-08-26 17:00:02 │ 20249 │
│ 2023-08-26 17:00:05 │ 20164 │
│ 2023-08-26 17:00:01 │ 20832 │
│ 2023-08-26 17:00:06 │ 20018 │
│ 2023-08-26 17:00:07 │ 19880 │
│ 2023-08-26 17:00:00 │ 22058 │
└─────────────────────┴─────────┘
pmacct aggregates flows, so for example if within one second there are 5 flows with identical parameters, pmacct will group them and emit a single aggregated record.
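For readers unfamiliar with pmacct's behaviour, here is a minimal sketch of that aggregation idea in Go; the key fields and names are illustrative for the example, not pmacct's actual primitives:

```go
package main

import "fmt"

// FlowKey holds the fields that must match for two flows to be merged.
// The field set here is illustrative, not pmacct's exact aggregation primitives.
type FlowKey struct {
	SrcAddr, DstAddr string
	SrcPort, DstPort uint16
	Proto            uint8
}

type FlowRecord struct {
	Key     FlowKey
	Packets uint64
	Bytes   uint64
}

// aggregate merges flows with identical keys, so 5 identical flows
// become one record with summed counters.
func aggregate(flows []FlowRecord) map[FlowKey]FlowRecord {
	out := make(map[FlowKey]FlowRecord, len(flows))
	for _, f := range flows {
		agg := out[f.Key]
		agg.Key = f.Key
		agg.Packets += f.Packets
		agg.Bytes += f.Bytes
		out[f.Key] = agg
	}
	return out
}

func main() {
	k := FlowKey{"10.0.0.1", "10.0.0.2", 12345, 443, 6}
	flows := []FlowRecord{{k, 1, 64}, {k, 1, 64}, {k, 1, 64}, {k, 1, 64}, {k, 1, 64}}
	fmt.Println(aggregate(flows)) // one record: Packets=5, Bytes=320
}
```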
I have nothing against that :) Maybe a good idea would be a "dynamic workers" mechanism: if the dispatch channel is growing, spawn more workers, and clean them up when traffic falls.
It's a good idea, but I'm not sure the added complexity is worth it: flows will spike periodically (unless there is a DDoS) and I don't think idle workers consume that much, since this is only a single "task" (decoding flows) and it would have to compete against other applications without being able to kick them out. I would rather suggest a fixed capacity allocation. Overall I think it's more for platforms like Kubernetes to guarantee a minimum resource, autoscale, or reschedule elsewhere.
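For illustration, a minimal sketch of the fixed-capacity pattern being suggested: a bounded dispatch channel drained by a fixed number of decode workers. The names (packet, decodeWorker, dispatch) are assumptions for the example, not goflow2's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

type packet []byte

// decodeWorker drains the shared dispatch channel until it is closed.
func decodeWorker(id int, dispatch <-chan packet, wg *sync.WaitGroup) {
	defer wg.Done()
	for pkt := range dispatch {
		_ = pkt // decode the flow packet here; kept trivial for the sketch
	}
	fmt.Printf("worker %d done\n", id)
}

func main() {
	const workers = 4      // fixed pool size, e.g. sockets * 2
	const queueSize = 1000 // bounded queue: backpressure instead of unbounded growth
	dispatch := make(chan packet, queueSize)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go decodeWorker(i, dispatch, &wg)
	}

	// The UDP read loop would send into dispatch here; when the channel
	// is full, the reader either blocks or drops, depending on policy.
	dispatch <- packet{0x01}
	close(dispatch)
	wg.Wait()
}
```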
I would also suggest moving the initialization of metrics to a separate package. When I wanted to add a metric for dropped packets (in udp.go) there was an import cycle (metrics imports utils).
There is https://github.com/netsampler/goflow2/blob/main/metrics/metrics.go
But there is an import of the utils package: https://github.com/netsampler/goflow2/blob/main/metrics/decoder.go#L9
Correct, I wanted to separate metrics from processing/decoding: for instance, if someone imports the package, it won't impose Prometheus metrics. |
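One common way to break such a cycle is dependency inversion: the processing side defines a small hook interface and never imports metrics, while the metrics package implements the hook and is wired in by main. A minimal single-file sketch of that idea (the package split is indicated in comments; this is not goflow2's actual layout):

```go
package main

import "fmt"

// --- conceptually package utils (processing/decoding) ---

// PacketObserver is the only abstraction the processing code knows about.
type PacketObserver interface {
	PacketDropped(iface string)
}

var observer PacketObserver // nil means "no metrics imposed"

func SetObserver(o PacketObserver) { observer = o }

func onDrop(iface string) {
	if observer != nil {
		observer.PacketDropped(iface)
	}
}

// --- conceptually package metrics ---

// promObserver would wrap Prometheus counters; it is a stub here.
type promObserver struct{}

func (promObserver) PacketDropped(iface string) {
	fmt.Println("dropped packet on", iface) // e.g. counter.WithLabelValues(iface).Inc()
}

func main() {
	SetObserver(promObserver{}) // wiring happens in main, so there is no import cycle
	onDrop("eth0")
}
```

This way, importing the processing package does not pull in Prometheus, matching the goal stated above.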
I was testing flow collection with pmacct and goflow2 and realized that goflow2 is losing packets (goflow2 should have MORE data, because pmacct aggregates in memory and flushes the aggregated records).
After debugging I realized that when we have 1 socket = 1 worker, then with a high volume of flows per second reading packets is faster than processing them (and the dispatch channel keeps growing), so I propose, in udp.go: increase the default worker count to cfg.Sockets * 2 (see the sketch below).
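A minimal sketch of the listen configuration this PR discusses. The field names mirror the PR title (workers, isBlocking, queue size), but the exact struct and the queue-size default are assumptions for the example, not goflow2's final API:

```go
package main

import "fmt"

type ListenConfig struct {
	Sockets    int  // number of UDP sockets (e.g. SO_REUSEPORT readers)
	Workers    int  // decode workers; 0 means "pick a default"
	QueueSize  int  // capacity of the dispatch channel
	IsBlocking bool // block the reader when the queue is full instead of dropping
}

// normalize applies the default proposed above: workers = sockets * 2.
func normalize(cfg ListenConfig) ListenConfig {
	if cfg.Workers == 0 {
		cfg.Workers = cfg.Sockets * 2
	}
	if cfg.QueueSize == 0 {
		cfg.QueueSize = 100000 // illustrative default, not goflow2's
	}
	return cfg
}

func main() {
	cfg := normalize(ListenConfig{Sockets: 4})
	fmt.Printf("%+v\n", cfg) // {Sockets:4 Workers:8 QueueSize:100000 IsBlocking:false}
}
```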
After changes: