-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unify some functions in pkg/agent/route #3889
Conversation
/test-windows-proxyall-e2e |
/test-windows-proxyall-e2e |
/test-windows-e2e |
/test-all |
/test-e2e |
pkg/agent/route/route_windows.go
Outdated
return fmt.Errorf("error listing ip routes: %w", err) | ||
} | ||
for _, rt := range routes { | ||
if rt.GatewayAddress.Equal(gw) && !rt.DestinationSubnet.IP.Equal(svcIP) && rt.DestinationSubnet.Contains(svcIP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newly installed route in line 269 also satisfies this condition, so is it possible it is removed by mistake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since AntreaProxy only has one worker (goroutine), it cannot be removed by mistake since it is only used when the first Service is installed, and curClusterIPCIDR
is nil.
pkg/agent/route/route_windows.go
Outdated
nodeConfig *config.NodeConfig | ||
networkConfig *config.NetworkConfig | ||
nodeRoutes *sync.Map | ||
serviceRoutes *sync.Map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't find functions to load route entry from serviceRoutes
, all calls are for store or delete. Is this cache really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments to explain what is stored in serviceRoutes
?
Codecov Report
@@ Coverage Diff @@
## main #3889 +/- ##
==========================================
- Coverage 71.95% 67.81% -4.15%
==========================================
Files 406 403 -3
Lines 60766 58449 -2317
==========================================
- Hits 43723 39635 -4088
- Misses 14113 16046 +1933
+ Partials 2930 2768 -162
*This pull request uses carry forward flags. Click here to find out more.
|
pkg/agent/route/route_linux.go
Outdated
} | ||
} | ||
} | ||
|
||
// Remove stale routes. | ||
for _, rt := range staleRoutes { | ||
if err = netlink.RouteDel(rt); err != nil { | ||
return fmt.Errorf("failed to uninstall stale ClusterIP route %s: %w", rt.String(), err) | ||
if err.Error() == "no such process" { | ||
klog.InfoS("Failed to delete stale ClusterIP route since the route doesn't exist", "route", route) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this log if the routing entry does not exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was introduced in this PR https://github.com/antrea-io/antrea/pull/3827/files.
pkg/agent/route/route_linux.go
Outdated
for i := 0; i < len(routes); i++ { | ||
if routes[i].Gw.Equal(gw) && !routes[i].Dst.IP.Equal(svcIP) && routes[i].Dst.Contains(svcIP) { | ||
// Skip the route whose destination CIDR only has one address. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why skip the routing entry for a single IP address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The routing entry for a single IP address (mask is 32 or 128) could be for LoadBalancer or the first ClusterIP, and such routing entry should be reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code comment is unclear and puzzling.
And I feel the above explanation still doesn't make sense. It will never delete stale LoadBalancer IPs, and it doesn't make sense to handle first ClusterIP in this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
// Skip the route whose destination CIDR has only one address since it could be the route for the first
// ClusterIP newly installed. Note that, this could also match a route for a LoadBalancer ingress IP, but
// it doesn't affect skipping the route for the first ClusterIP.
pkg/agent/route/route_windows.go
Outdated
nodeConfig *config.NodeConfig | ||
networkConfig *config.NetworkConfig | ||
nodeRoutes *sync.Map | ||
serviceRoutes *sync.Map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments to explain what is stored in serviceRoutes
?
701c17b
to
d22cf11
Compare
/test-windows-proxyall-e2e |
/test-windows-proxyall-e2e |
d22cf11
to
70c4aff
Compare
pkg/agent/route/route_windows.go
Outdated
route := generateRoute(svcIPNet, gw, linkIndex, metric) | ||
if err := util.RemoveNetRoute(route); err != nil { | ||
if strings.Contains(err.Error(), "No matching MSFT_NetRoute objects") { | ||
klog.InfoS("Failed to delete LoadBalancer ingress IP route since the route doesn't exist", "route", route) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this log since the final result is the route is not existing, which is as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used to consistent with Linux code.
70c4aff
to
14e414e
Compare
/test-windows-proxyall-e2e |
8d51d79
to
400476a
Compare
0d04ed2
to
2761075
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, otherwise LGTM
pkg/agent/route/route_windows.go
Outdated
if reflect.DeepEqual(rt.DestinationSubnet, util.NewIPNet(c.nodeConfig.GatewayConfig.IPv4)) { | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was there a bug causing this route to be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the comment #3889 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know why it is added. But was there a bug causing this route to be deleted? If yes, what's the consequence of missing this route?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was no bug causing this route to be deleted. The route could be only deleted manually by mistake. This is copied from Linux code, see
antrea/pkg/agent/route/route_linux.go
Line 269 in d7ffc85
// These routes are installed automatically by the kernel when the address is configured on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does listRoutes
contain the route? If yes, does the checks in Reconcile
exclude the route?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, add IP 192.168.2.1 to interface antrea-gw0, and the following routes will be installed automatically.
192.168.2.0 255.255.255.0 On-Link 192.168.2.1 281
192.168.2.1 255.255.255.255 On-Link 192.168.2.1 281
192.168.2.255 255.255.255.255 On-Link 192.168.2.1 281
In listRoutes
, the broadcast route (the third one) will be ignored.
if reflect.DeepEqual(rt.DestinationSubnet, c.nodeConfig.PodIPv4CIDR) {
continue
}
The code above is used to exclude the first route in Reconcile
.
if reflect.DeepEqual(rt.DestinationSubnet, util.NewIPNet(c.nodeConfig.GatewayConfig.IPv4)) {
continue
}
The code above is used to exclude the second route in Reconcile
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm aware of this, but it's not related to my question:
But was there a bug causing this route to be deleted? If yes, what's the consequence of missing this route?
Not sure how to further explain. The check is newly added, right? Then how did it work previously if the route was removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check is newly added, right?
Yes, the check is newly added.
Then how did it work previously if the route was removed?
If I remember correctly, if the route is removed, it will not be restored.
/test-all |
// 10.10.0.0 255.255.255.0 On-link 10.10.0.1 281 | ||
// 10.10.0.255 255.255.255.255 On-link 10.10.0.1 281 | ||
// The host (10.10.0.1) and broadcast (10.10.0.255) routes should be ignored since they are unnecessary and redundant. | ||
// We can ignore them by comparing them to the calculated broadcast IP. | ||
if !rt.GatewayAddress.Equal(config.VirtualServiceIPv4) && rt.DestinationSubnet.IP.Equal(iputil.GetLocalBroadcastIP(rt.DestinationSubnet)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The host route is ignored here since the broadcast IP of the destination IP in host route is the destination itself. @tnqn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then should we remove the new check added here https://github.com/antrea-io/antrea/pull/3889/files/720023a9ed444a790842d8a62db295f70161d315#r1153399542? otherwise the code is contradictory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should remove the check since the route it should match will not be collected in Antrea.
// 10.10.0.0 255.255.255.0 On-link 10.10.0.1 281 | ||
// 10.10.0.255 255.255.255.255 On-link 10.10.0.1 281 | ||
// The host (10.10.0.1) and broadcast (10.10.0.255) routes should be ignored since they are unnecessary and redundant. | ||
// We can ignore them by comparing them to the calculated broadcast IP. | ||
if !rt.GatewayAddress.Equal(config.VirtualServiceIPv4) && rt.DestinationSubnet.IP.Equal(iputil.GetLocalBroadcastIP(rt.DestinationSubnet)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then should we remove the new check added here https://github.com/antrea-io/antrea/pull/3889/files/720023a9ed444a790842d8a62db295f70161d315#r1153399542? otherwise the code is contradictory.
This PR unifies the functions in route_linux.go and route_windows.go: - AddLoadBalancer - DeleteLoadBalancer Remove: - DeleteClusterIPRoute - AddClusterIPRoute Fix: - Add the route for calculated ClusterIP CIDR to Service route cache, then the route can be restored when it was deleted since members of the route cache are synchronized periodically. Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
/test-windows-networkpolicy |
/test-windows-containerd-networkpolicy |
/test-windows-conformance |
/test-windows-containerd-networkpolicy |
This reverts commit 6143e78.
This commit unifies the following functions in route_linux.go and route_windows.go: - AddLoadBalancer - DeleteLoadBalancer Removes: - DeleteClusterIPRoute - AddClusterIPRoute Adds a fix to restore the ClusterIP CIDR route after it is deleted, by saving the route to the Service route cache which will be re-sync'd periodically. Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
This PR unifies the functions in route_linux.go and route_windows.go:
Remove:
Fix:
then the route can be restored when it was deleted since members of
the route cache are synchronized periodically.
Signed-off-by: Hongliang Liu lhongliang@vmware.com