-
Notifications
You must be signed in to change notification settings - Fork 366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keep the cached Endpoints the same as those installed in OVS #4691
Conversation
7ed1fb7
to
2a1a6db
Compare
pkg/agent/proxy/proxier.go
Outdated
// Remove NodePort flows and configurations. | ||
if p.proxyAll && svcInfo.NodePort() > 0 { | ||
if err := p.uninstallNodePortService(uint16(svcInfo.NodePort()), svcInfo.OFProtocol); err != nil { | ||
klog.ErrorS(err, "Failed to remove flows and configurations of Service", "Service", svcPortName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this error log sounds quite generic, given that this case is specific to NodePort Services?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with
klog.ErrorS(err, "Error when uninstalling NodePort flows and configurations for Service", "ServiceName", svcPortName, "ServiceInfo", svcInfoStr)
continue | ||
} | ||
p.groupCounter.Recycle(svcPortName, true) | ||
// Remove Service group which has only local Endpoints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the svcInfo.ExternalPolicyLocal()
condition was removed, yet the commend still mentions "local Endpoints". Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, when svcInfo.InternalPolicyLocal()
is true, the group has only local Endpoints too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there was a bug regarding this? a group may never be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there was a bug regarding this? a group may never be deleted?
For a ClusterIP, when svcInfo.InternalPolicyLocal()
is true, a group may never be delete.
pkg/agent/proxy/proxier.go
Outdated
updatedEndpointsInstalled := map[string]k8sproxy.Endpoint{} | ||
for _, endpoint := range allReachableEndpoints { | ||
updatedEndpointsInstalled[endpoint.String()] = endpoint | ||
// If the Endpoint is not newly installed, remove it from stale endpoints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"remove it from stale endpoints": that doesn't sound correct. You are removing it from endpointsInstalled
, which doesn't sound like a list of stale endpoints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rephrased the comments here.
pkg/agent/proxy/proxier.go
Outdated
if mcsLocalService != nil { | ||
allReachableEndpoints = append(allReachableEndpoints, mcsLocalService.Endpoint) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this function (installServices
) is getting very long. Is there a way to extract some functionality into helper functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is quite long for this function, I thought of extracting the code under if needUpdateEndpoints
and if needUpdateService
into functions, but I am not sure if this is an appropriate idea.
2a1a6db
to
96bb35d
Compare
continue | ||
} | ||
p.groupCounter.Recycle(svcPortName, true) | ||
// Remove Service group which has only local Endpoints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there was a bug regarding this? a group may never be deleted?
96bb35d
to
6913627
Compare
597b534
to
5ed391b
Compare
54f15f9
to
6ab1d39
Compare
fcde930
to
b053114
Compare
continue | ||
} | ||
|
||
// If previous Service is nil or NodePort flows and configurations of previous Service have been removed, | ||
// install NodePort flows and configurations for current Service. | ||
if svcInfo.NodePort() > 0 && (pSvcInfo == nil || needRemoval) { | ||
if err := p.installNodePortService(nGroupID, uint16(svcInfo.NodePort()), svcInfo.OFProtocol, uint16(affinityTimeout), svcInfo.ExternalPolicyLocal()); err != nil { | ||
klog.ErrorS(err, "Failed to install NodePort flows and configurations of Service", "Service", svcPortName) | ||
groupID, exists = p.groupCounter.Get(svcPortName, externalPolicyLocal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: why do you expect the OF group for the NodePort traffic to share the same identifier as the ClusterIP traffic for a Service? In the previous implementation, here calls p.groupCounter.AllocateIfNotExist
to get or create an OF group ID for the new configured NodePort scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the previous implementation, here calls p.groupCounter.AllocateIfNotExist to get or create an OF group ID for the new configured NodePort scenario.
We should use method Get
here to get a OVS group ID, otherwise, there is a chance that we allocate a group ID and pass it to InstallServiceGroup
and the corresponding group was not created in OVS.
pkg/agent/proxy/proxier.go
Outdated
if err := p.ofClient.UninstallServiceFlows(svcInfo.ClusterIP(), uint16(svcInfo.Port()), svcInfo.OFProtocol); err != nil { | ||
klog.ErrorS(err, "Failed to remove flows of Service", "Service", svcPortName) | ||
klog.ErrorS(err, "Error when uninstalling ClusterIP flows for Service", "ServiceName", svcPortName, "ServiceInfo", svcInfoStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key "ServiceName" in such logs should be "servicePortName"?
BTW, the keys in the logs should not start with a capital according to the logs in other package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key "ServiceName" in such logs should be "servicePortName"?
I'll change that.
BTW, the keys in the logs should not start with a capital according to the logs in other package?
I thought that Service
in ServiceName
means Kubernetes Service, and I capitalized them all.
The main purpose of this PR is to avoid potential inconsistencies between the cached Endpoints and those installed in OVS, like antrea-io#4681, antrea-io#4692. This PR also updates: - Method UninstallEndpointFlows of ofClient, support deleting flows of multiple Endpoints. - Remove possible groups when a Service is deleted. - Log something when a group for a Service is not created. - Optimize and unify log. Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
b053114
to
88a4812
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
@wenyingd do you have other comments? |
No other comments |
…io#4691) The main purpose of this PR is to avoid potential inconsistencies between the cached Endpoints and those installed in OVS, like antrea-io#4681, antrea-io#4692. This PR also updates: - Method UninstallEndpointFlows of ofClient, support deleting flows of multiple Endpoints. - Remove possible groups when a Service is deleted. - Log something when a group for a Service is not created. - Optimize and unify log. Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
The main purpose of this PR is to avoid potential inconsistencies between
the cached Endpoints and those installed in OVS, like #4681, #4692.
This PR also updates:
multiple Endpoints.
Signed-off-by: Hongliang Liu lhongliang@vmware.com