High load due to ksoftirqd, growing iptables rules #3117
Can you attach an actual listing of the iptables rules? It's hard to troubleshoot via a screenshot. Since it's 70+ MB, compressing the file before attaching it may be useful. Are you running anything else on this node that manages iptables rules? kube-proxy and flannel should be the only things touching the rules; I suspect something is interfering with their ability to sync rules, so they keep creating new ones.
Here is the file: iptables.log.gz
fail2ban is installed.
Hmm, this appears to be 7 MB, not 70 MB, but there are still a lot of duplicate rules in there. Can you try disabling fail2ban (ensuring that it does not start again on boot) and restarting the node? If the duplicate entries don't come back without fail2ban running, then I'm guessing it is doing something to the ruleset that's causing the duplicate rules to be created.
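For reference, one way to stop fail2ban and keep it from starting again at boot (standard systemd usage, not from the original thread):

```bash
systemctl disable --now fail2ban
```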
@brandond Hmm, in my initial comment I mentioned 7.0 MB. Sorry for the misunderstanding... I've disabled fail2ban, but the duplicate rules are still increasing over time. We have this issue on all machines running k3s version v1.20.4+k3s1 but not on v1.19.8+k3s1. All machines are configured identically. Here is an "iptables -L | wc -l" stat per cluster (cluster #1, cluster #2, cluster #3, cluster #4):
The code ensures that the cluster IP and node port rules are the first three in that chain; I'm not really sure how that could go awry unless something else is manipulating the rules. What Debian release are you running on these nodes? For reference, the relevant code is k3s/pkg/agent/netpol/network_policy_controller.go, lines 317 to 338 at 355fff3.
Debian Buster. I've used the code snippet above for a little test program. The problem is that the rule-existence check always returns false. The workaround for us is to periodically flush the iptables rules.
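The test program itself isn't attached, so here is a minimal shell sketch of the same failure mode, assuming an affected nft-backed iptables (the existence check in the Go code effectively boils down to an iptables -C call; the rule below is purely illustrative):

```bash
# Add a rule that uses more than one match module, then immediately check for it.
iptables -A INPUT -p tcp -m tcp --dport 10250 -m comment --comment "k3s-dup-test" -j ACCEPT

# On an affected nft-backed iptables, the check reports the rule as missing,
# so callers (like the network policy controller) keep appending it again.
iptables -C INPUT -p tcp -m tcp --dport 10250 -m comment --comment "k3s-dup-test" -j ACCEPT \
  && echo "rule found" || echo "rule NOT found"
```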
I've been having the same issue, where tons of duplicate iptables rules are being created. I've had servers with up to 40,000 iptables rules. Disabling the network policy controller fixes it (since I use Cilium as the CNI, it isn't necessary for me). All of my nodes are running v1.20.4+k3s1.
@clrxbl What OS distribution and iptables version?
This node has 13549 iptables rules, the majority of them in the KUBE-ROUTER-INPUT chain.
All of my nodes run the same software versions.
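For anyone comparing notes, a quick way to see which chains the rules pile up in (an illustrative one-liner, not from the original report):

```bash
# Count rules per chain, most populated chains first.
iptables -S | awk '{print $2}' | sort | uniq -c | sort -rn | head
```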
Would also like to say that I'm getting the exact same duplicate iptables rules created as well.
Interesting, Debian nftables seems to be the commonality then. I think that go-iptables issue is probably what we're running into. Disabling the network policy controller should be an acceptable workaround, assuming you don't need policy enforcement.
I have been able to duplicate this on Debian Buster. There appears to be a bug in Debian's nftables package that prevents it from properly checking iptables rules; it seems to reorder the modules so that they cannot be checked for in the order originally input:
This works properly after running
Since this appears to be a bug in the Debian nftables package, the fix will need to come from upstream rather than from K3s.
In that case, I do think there should be some sort of warning during K3s installation when iptables is pointing to the Debian nftables backend, until this is resolved.
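For readers following along, checking and switching the backend on Debian-style systems looks roughly like this (standard update-alternatives usage; the paths are the usual Buster ones):

```bash
# Show which backend the iptables command currently points to.
iptables --version                      # reports "(nf_tables)" or "(legacy)"
update-alternatives --display iptables

# Switch between the two Debian-provided backends.
update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set iptables /usr/sbin/iptables-nft
```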
Just validated that this works properly on Ubuntu 20.10:
@clrxbl Actually it looks like it's not even a kernel thing - it's just a bug in the version of the nftables package that Debian is shipping. If you
Putting iptables in legacy mode does not resolve the underlying issue with nftables for us. Rules are apparently not duplicated...
... but the output of nft tells us something different:
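The nft output itself isn't preserved above; commands of this kind make the mismatch between the two views visible (illustrative, not the original listing):

```bash
# Compare what the iptables frontend reports with what the kernel actually holds,
# and look for exact duplicate entries in the nft ruleset.
iptables -S | wc -l
nft list ruleset | wc -l
nft list ruleset | sort | uniq -cd | sort -rn | head
```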
Had the same issue under Debian 10. Switched to legacy iptables, but it did not help.
You might try uninstalling the Debian iptables/nftables packages, rather than just switching to legacy mode.
@brandond Thanks for investigating the issue. Do you have a link to more information on the nftables bug? Ideally we can push for this to be patched so this triage is not needed.
I haven't gotten as far as tracking it down to a specific commit in the upstream packages that fixed it; I just know that newer versions of the packages no longer show the problem.
@kannanvr Thank you for your tip, but we are not using kube-router. We think, however, that we have found the cause of our ongoing problems. After some additional digging we realized that the duplicate rules are not the result of a call to the iptables command. We tried to find out what else is manipulating the iptables rules. Besides the iptables command itself, there are some other iptables commands that were still pointing to the 1.8.4 version of iptables.
After extending the update-alternatives change to those commands as well, we did not get any new duplicates. This implies that one of the other commands is causing the bug in EL 8.
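A hedged sketch of the kind of change described above (not their exact commands; the command list, paths, and use of update-alternatives groups for every binary are assumptions):

```bash
# Point every iptables-related command at the same build, not just `iptables` itself.
BUNDLED=/path/to/the/iptables/build/you/want   # placeholder path
for cmd in iptables iptables-save iptables-restore ip6tables ip6tables-save ip6tables-restore; do
  update-alternatives --install "/usr/sbin/$cmd" "$cmd" "$BUNDLED/$cmd" 100
  update-alternatives --set "$cmd" "$BUNDLED/$cmd"
done
```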
Side note: According to "Additional iptables-nft 1.8.0-1.8.3 compatibility problems", iptables versions 1.8.0 to 1.8.3 have known problems and 1.8.4 should be fine.
However, our tests on Rocky Linux 8.6 indicate that 1.8.4 still has (another) issue in one of its commands.
I had the same problem. I tried to switch the version of iptables, because I have two local versions of iptables:

  Selection    Path                      Priority   Status
  0            /usr/sbin/iptables-nft    20         auto mode

systemctl restart k3s
@firefly-serenity We've been struggling with this problem for a while now; I'll try setting more alternatives to see if this helps, thanks. Just one thing I did want to mention, in case it sheds some light: we are running about 8 separate k3s clusters, all on CentOS 8 Stream. All clusters except one are virtual machines, the last cluster having physical worker nodes (with a virtual master). It is the physical cluster that has this issue (k3s 1.24.4, but also earlier versions): every once in a while we see high load and iptables creating duplicate rules, forcing us to reboot a node. We have never seen this problem on the VM-based clusters. All nodes are installed and patched the same way, regardless of whether they are physical or virtual.
@mogoman FWIW: The k3s system where @firefly-serenity and I see/saw this issue is running on six virtual nodes (Rocky Linux 8.6). (Since the last (extended) update-alternatives change it is working fine - so far.)
Yes, official advice on if and how the bundled and distro iptables can coexist should be part of the first suggestion. It would also help if the iptables (distro or bundled) used during the first k3s run were given precedence in the PATH for later runs. Prepending the bundled iptables to the PATH in k3s.service has been working for us on RHEL 8 for some time now. The number of other bundled binaries under that PATH that could interfere with what the distro provides is really quite limited.
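A sketch of what such a PATH override can look like (the drop-in file name is hypothetical, and the bundled binary directory shown is an assumption; adjust it to wherever your install unpacks its binaries):

```bash
mkdir -p /etc/systemd/system/k3s.service.d
cat > /etc/systemd/system/k3s.service.d/10-bundled-iptables.conf <<'EOF'
[Service]
Environment="PATH=/var/lib/rancher/k3s/data/current/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
EOF
systemctl daemon-reload
systemctl restart k3s
```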
@knweiss Thanks. In the meantime I've seen the problem on virtual nodes too. The swapping-out-iptables solution is still holding, and I've now rolled it out to all clusters (all running CentOS Stream 8).
@dereknola Does your new flag resolve this issue completely?
Yes, the new flag should allow users who are stuck with buggy versions of iptables to work around the issue. We still need some docs for this, though.
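For anyone landing here later, usage is roughly as follows (the flag name is taken from the validation notes below; the config-file form is assumed to mirror it):

```bash
# CLI form:
k3s server --prefer-bundled-bin
# or, assuming the config file mirrors the flag name, in /etc/rancher/k3s/config.yaml:
#   prefer-bundled-bin: true
```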
Validated on k3s version v1.26.0-rc2+k3s1
Without prefer-bundled-bin, the PATH has the OS path first, followed by the k3s bundle.
On k3s version v1.25.5-rc3+k3s1
Replicated the issue on Debian 10 and validated that, with the flag, new duplicates were not added.
With the flag set to use the k3s bundle:
Environmental Info:
K3s Version: k3s version v1.20.4+k3s1 (838a906)
go version go1.15.8
Node(s) CPU architecture, OS, and Version: Linux 4.19.0-14-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux
Cluster Configuration: 1 master, 2 workers
Describe the bug: After some time we get high loads on the machines due to high soft IRQs:
Output of perf report:
Something goes wrong with the iptables rules:
iptables -L produces 7.0 MB of rules (increasing more and more over time):
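The actual rule listing and perf output aren't reproduced here; commands of this kind show the symptom (illustrative, not the original capture):

```bash
# ksoftirqd threads show up near the top of CPU usage, and the rule count keeps growing.
top -b -n 1 | grep ksoftirqd
iptables -L -n | wc -l
iptables-save | wc -c        # total size of the ruleset in bytes
```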
Steps To Reproduce: