TCP rejection can't work on Kind when the traffic mode is noEncap #2025
Comments
To be clear, it's a similar situation as https://ask.openstack.org/en/question/28300/iptables-invalid-rule-preventing-rst-packets-on-closed-ports-between-vms/, but not quite the same. The TCP RST packet does go twice through conntrack for some reason, which seems related to the fact that we use an extra netdev bridge (br-phy) attached to a physical interface (eth0). I have confirmed this by adding the following iptables rule:
Notice how that rule was hit twice despite the fact that there was a single TCP RST packet. So that packet must be going twice through PREROUTING and conntrack. The conntrack entry is destroyed the first time, which causes the packet to be dropped the second time because it is now invalid. I read something related in the OVS documentation:
These rules don't help with our situation (the packet still goes through conntrack twice). However the following rule did help:
After that, the TCP reset makes its way back to the source Pod as expected. @GraysonWu It may be worth adding this rule as part of https://github.com/vmware-tanzu/antrea/blob/main/build/images/scripts/start_ovs_netdev. Then we should be able to run the test instead of skipping it. It may also help with other networking issues in Kind, who knows. We can look into this after #2001 is merged; there is no rush.
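To make the double traversal concrete, here is a toy Python model of the behaviour described in this comment. It is a sketch under stated assumptions, not a model of the real netfilter code: the `ToyConntrack` class, the `ovs-port` interface name, and the flow label are hypothetical, and "RST destroys the entry" follows this thread's description rather than the kernel's exact state machine. The optional `raw_drop_iface` argument stands in for an early drop that runs before conntrack, such as a rule in the raw table's PREROUTING chain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flow: str      # opaque connection label (hypothetical)
    rst: bool      # True for a TCP RST segment
    in_iface: str  # interface on which the kernel sees this copy arrive


class ToyConntrack:
    """Simplified connection table: an RST removes the tracked flow."""

    def __init__(self, flows):
        self.flows = set(flows)

    def classify(self, pkt):
        if pkt.flow not in self.flows:
            return "INVALID"
        if pkt.rst:
            # Per the thread: the first RST lookup destroys the entry.
            self.flows.discard(pkt.flow)
        return "ESTABLISHED"


def traverse(copies, conntrack, raw_drop_iface=None):
    """Run each copy of the packet through (optional) early drop, then conntrack."""
    verdicts = []
    for pkt in copies:
        if raw_drop_iface is not None and pkt.in_iface == raw_drop_iface:
            # Dropped before conntrack ever sees it.
            verdicts.append("DROPPED_BEFORE_CONNTRACK")
            continue
        verdicts.append(conntrack.classify(pkt))
    return verdicts


# One RST on the wire, seen twice: once by the kernel stack on eth0 and once
# after OVS userspace forwards it ("ovs-port" is a made-up name).
copies = [
    Packet("podA->podB", rst=True, in_iface="eth0"),
    Packet("podA->podB", rst=True, in_iface="ovs-port"),
]

without_rule = traverse(copies, ToyConntrack({"podA->podB"}))
with_rule = traverse(copies, ToyConntrack({"podA->podB"}), raw_drop_iface="eth0")

print(without_rule)  # ['ESTABLISHED', 'INVALID']
print(with_rule)     # ['DROPPED_BEFORE_CONNTRACK', 'ESTABLISHED']
```

Without the early drop, the second copy is classified as INVALID because the first copy already removed the entry. With the early drop on eth0, only the copy forwarded by OVS reaches conntrack, so it still matches the tracked connection.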
Thanks @antoninbas for adding these helpful details. Yeah, we could try that later.
According to the OVS documentation:

On Linux, when a physical interface is in use by the userspace datapath, packets received on the interface still also pass into the kernel TCP/IP stack. This can cause surprising and incorrect behavior. You can use "iptables" to avoid this behavior, by using it to drop received packets.

The OVS documentation suggests dropping packets in the INPUT and FORWARD chains. However, this is not sufficient for some edge cases. For example, when receiving a TCP RST packet, the packet will clear the conntrack entry for the TCP connection before it can be dropped, which can cause the "second" TCP RST packet (the one processed by OVS userspace) to be marked as invalid when going through conntrack. So instead we drop the packet in PREROUTING:

    iptables -t raw -A PREROUTING -i eth0 -j DROP

This rule is added to the start_ovs_netdev script.

By adding this rule, we no longer need to skip TCP e2e tests for the Reject NetworkPolicy Action in Kind clusters.

It's possible that this is also going to help with various connectivity issues we observed with Antrea in Kind over time. For example, I believe we may also be able to remove the hack which reduces the value of the tcp_retries2 sysctl parameter. I need to run tests to confirm.

Fixes antrea-io#2025
According to the OVS documentation:

On Linux, when a physical interface is in use by the userspace datapath, packets received on the interface still also pass into the kernel TCP/IP stack. This can cause surprising and incorrect behavior. You can use "iptables" to avoid this behavior, by using it to drop received packets.

The OVS documentation suggests dropping packets in the INPUT and FORWARD chains. However, this is not sufficient for some edge cases. For example, when receiving a TCP RST packet, the packet will clear the conntrack entry for the TCP connection before it can be dropped, which can cause the "second" TCP RST packet (the one processed by OVS userspace) to be marked as invalid when going through conntrack. So instead we drop the packet in PREROUTING:

    iptables -t raw -A PREROUTING -i eth0 -j DROP

This rule is added to the start_ovs_netdev script.

By adding this rule, we no longer need to skip TCP e2e tests for the Reject NetworkPolicy Action in Kind clusters.

It's possible that this is also going to help with various connectivity issues we observed with Antrea in Kind over time. For example, I believe we are also able to remove the hack which reduces the value of the tcp_retries2 sysctl parameter.

Fixes #2025

Signed-off-by: Antonin Bas <abas@vmware.com>
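The commit message says the rule is added to the start_ovs_netdev script, whose contents are not shown here. As an illustration only, the following Python sketch shows one way such a rule could be installed idempotently, so that re-running a setup step does not append duplicates. The helper names `rule_argv` and `ensure_drop_rule` are hypothetical and are not part of Antrea; the only iptables options used are `-t`, `-C` (check), `-A` (append), `-i` and `-j`. Actually applying the rule requires root on a Linux host, so the example only prints the command.

```python
import shlex
import subprocess


def rule_argv(action, iface):
    """Build the iptables argv for the raw PREROUTING drop rule.

    `action` is "-C" (check whether the rule exists) or "-A" (append it).
    """
    return ["iptables", "-t", "raw", action, "PREROUTING", "-i", iface, "-j", "DROP"]


def ensure_drop_rule(iface="eth0", run=subprocess.run):
    """Append the rule only if `iptables -C` reports that it is missing.

    Returns True if the rule was appended, False if it was already present.
    `run` is injectable so the logic can be exercised without root.
    """
    check = run(rule_argv("-C", iface), capture_output=True)
    if check.returncode == 0:
        return False  # rule already present, nothing to do
    run(rule_argv("-A", iface), check=True)
    return True


# Show the command that would be appended; nothing is executed here.
print(shlex.join(rule_argv("-A", "eth0")))
```

Calling `ensure_drop_rule()` on a node would run the check and, if needed, the append. Keeping the runner injectable is a design choice for this sketch so that the decision logic can be tested with a stub.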
Describe the problem/challenge you have
When running the E2E tests, the test case TestAntreaPolicy/TestGroupNoK8sNP/Case=ACNPRejectIngress, which checks whether Reject works on TCP traffic, always fails when AntreaPolicy is enabled in noEncap mode. Some connections that should be rejected are observed as dropped. However, it does not fail when run on the local vagrant testbed.

When manually testing Reject on TCP traffic in a local Kind cluster in noEncap mode with AntreaPolicy enabled:

According to the investigation, on a Kind cluster we use the OVS netdev datapath, which requires a bridge setup on each node. For inter-node traffic in noEncap mode, there are two conntrack lookups for the same connection. When a TCP RST packet is sent by the Reject action, the conntrack entry is destroyed by the first lookup. The second lookup then tags the packet as INVALID and drops it. Since the response packet has been dropped by the kernel, the Pod never receives it, so the Pod keeps retrying and waiting until it times out without receiving a reject response.

So skip this test for now when the provider is Kind and the traffic mode is noEncap.
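The difference between "rejected" and "dropped" as seen by the client can be illustrated with a small Python probe. This is a minimal sketch and not how the Antrea e2e test is implemented: the `probe` function and its result labels are hypothetical, and it runs against the loopback interface only. A refused connection means a TCP RST came back, while a timeout means nothing came back at all.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a TCP connection attempt from the client's point of view."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "rejected"  # a TCP RST was received in reply to the SYN
    except socket.timeout:
        return "dropped"   # no reply arrived before the deadline
    except OSError:
        return "error"     # e.g. network unreachable


# Demonstration on loopback: a listening port, then the same port once closed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

open_result = probe("127.0.0.1", port)
listener.close()
closed_result = probe("127.0.0.1", port)

print(open_result, closed_result)
```

In the failure described above, a client expecting the "rejected" outcome would instead observe "dropped", because the RST never reaches the Pod.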
Thanks to @antoninbas for the help throughout the whole process.
Reference: https://ask.openstack.org/en/question/28300/iptables-invalid-rule-preventing-rst-packets-on-closed-ports-between-vms/