Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up stale IP addresses on Antrea host gateway #1900

Merged
merged 3 commits into from
Feb 25, 2021

Conversation

antoninbas
Copy link
Contributor

When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes #1685

@antoninbas
Copy link
Contributor Author

Integration test has to be run manually on Windows:
image

tnqn
tnqn previously approved these changes Feb 25, 2021
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the comprehensive fix.

@antoninbas
Copy link
Contributor Author

/test-all

@antoninbas
Copy link
Contributor Author

/test-ipv6-all
/test-ipv6-only-all

@codecov-io
Copy link

codecov-io commented Feb 25, 2021

Codecov Report

❗ No coverage uploaded for pull request base (main@0b3304c). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1900   +/-   ##
=======================================
  Coverage        ?   53.04%           
=======================================
  Files           ?      201           
  Lines           ?    17284           
  Branches        ?        0           
=======================================
  Hits            ?     9169           
  Misses          ?     6953           
  Partials        ?     1162           
Flag Coverage Δ
e2e-tests 42.49% <0.00%> (?)
kind-e2e-tests 49.72% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

return addr1.IP.Equal(addr2.IP) && size1 == size2
}

addrSliceDifference := func(s1, s2 []*net.IPNet) []*net.IPNet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you consider to move the function out to make it a util function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asked the same question in net_linux.go :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted the function definition. However, note that the signature is different for the Linux & Windows case. When go has support for generics, we can have a true common utility function to compute slice differences.

Copy link
Contributor

@ruicao93 ruicao93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Antonin for the PR. LGTM, just minor comments.

jianjuns
jianjuns previously approved these changes Feb 25, 2021
}
}

addrSliceDifference := func(s1, s2 []netlink.Addr) []*netlink.Addr {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it helps to move the func to an independent one?

return addr1.IP.Equal(addr2.IP) && size1 == size2
}

addrSliceDifference := func(s1, s2 []*net.IPNet) []*net.IPNet {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asked the same question in net_linux.go :)

When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
@antoninbas
Copy link
Contributor Author

/test-all
/test-ipv6-all
/test-ipv6-only-all

@antoninbas
Copy link
Contributor Author

/test-all
/test-ipv6-all
/test-ipv6-only-all

@antoninbas
Copy link
Contributor Author

/test-ipv6-only-networkpolicy

1 similar comment
@antoninbas
Copy link
Contributor Author

/test-ipv6-only-networkpolicy

@antoninbas
Copy link
Contributor Author

jenkins-ipv6-only-networkpolicy is failing because of a testbed issue. I will merge the PR since the job was passing before my latest commit, which only changed some integration test code.

@antoninbas antoninbas merged commit c9f5260 into antrea-io:main Feb 25, 2021
@antoninbas antoninbas deleted the cleanup-stale-addresses-on-gw branch February 25, 2021 22:17
antoninbas added a commit to antoninbas/antrea that referenced this pull request Feb 25, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
antoninbas added a commit to antoninbas/antrea that referenced this pull request Feb 25, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
antoninbas added a commit to antoninbas/antrea that referenced this pull request Feb 26, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
antoninbas added a commit to antoninbas/antrea that referenced this pull request Feb 26, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
antoninbas added a commit that referenced this pull request Feb 26, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes #1685
antoninbas added a commit that referenced this pull request Feb 26, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes #1685
antoninbas added a commit to antoninbas/antrea that referenced this pull request Mar 11, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes antrea-io#1685
antoninbas added a commit that referenced this pull request Mar 12, 2021
When a Node leaves a K8s cluster and joins it again later, it is likely
that it will receive a different Pod CIDR. If we simply add the new
gateway IP (first IP in the CIDR) to the host gateway interface without
removing the previous one, we have observed that it can lead to
connectivity issues. This commit ensures that all stale IPs are removed
when configuring the gateway interface on an Agent restart. Link-local
addresses are left untouched. An integration test is added for both
Linux & Windows.

Fixes #1685
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When configuring the Antrea gateway IP address, stale addresses should be removed
6 participants