Cached stale Endpoints cannot be cleaned when Endpoints are filtered out by EndpointSlice conditions or TopologyAwareHints #4692

Closed
hongliangl opened this issue Mar 8, 2023 · 0 comments
Assignees: hongliangl
Labels: area/proxy (Issues or PRs related to proxy functions in Antrea), kind/bug (Categorizes issue or PR as related to a bug)

Describe the bug
With the EndpointSlice feature gate enabled in Antrea, the Kubernetes sig-network e2e test "should create endpoints for unready pods" fails.

Root Cause
Assume there is a Service whose PublishNotReadyAddresses is true, and that the Service has one ready Endpoint.

  • Make the Endpoint terminating and not serving.
  • Call
    func (p *proxier) OnEndpointSliceUpdate(oldEndpointSlice, newEndpointSlice *discovery.EndpointSlice) {
  • Call
    func (p *proxier) syncProxyRules() {
    • In
      clusterEndpoints, localEndpoints, allReachableEndpoints, mcsLocalService := p.categorizeEndpoints(endpointsToInstall, svcInfo, svcPortName, serviceCIDRIPv4)
      , clusterEndpoints, localEndpoints and allReachableEndpoints are all empty, because the Endpoint is terminating and not serving.
    • The corresponding OVS group is updated in
      if err = p.ofClient.InstallServiceGroup(groupID, affinityTimeout != 0, mcsLocalService, allReachableEndpoints); err != nil {
      . The updated OVS group has no buckets, since allReachableEndpoints is empty.
    • Note that the Endpoints for the Service in p.endpointsMap and p.endpointsInstalledMap are the same. As a result, when calling
      func (p *proxier) removeStaleEndpoints() {
      , the stale Endpoint cannot be cleaned up, since the uninstalled Endpoint is still in p.endpointsMap.
    • When the Endpoint becomes ready again, a cache entry for it still exists, so the corresponding OVS group is not updated and the bucket for the Endpoint is not synced to OVS.
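
The mismatch described above can be illustrated with a small toy model. This is a sketch only, not Antrea's code: the `proxier` type here, its fields, the `cacheAll` switch, the `replay` helper and the address are all hypothetical names introduced for illustration. For brevity it starts with the Endpoint already filtered out instead of replaying the full ready-to-terminating transition. With `cacheAll` set, the cache records every Endpoint in `endpointsMap` even when the filter kept it out of the group, so a later transition to serving looks like "nothing changed". The alternative (caching only what was actually installed) is an assumption about one possible remedy, not a description of the linked commits.

```go
package main

import "fmt"

// endpoint is a simplified stand-in for an EndpointSlice endpoint.
type endpoint struct {
	serving bool // false models "terminating and not serving"
}

type set map[string]bool

// proxier is a toy model (hypothetical), not Antrea's proxier.
type proxier struct {
	endpointsMap map[string]endpoint // desired state from the API
	installed    set                 // cache of what is believed installed
	group        set                 // buckets actually present in the "OVS group"
	cacheAll     bool                // true: cache every endpoint, even filtered ones
}

func newProxier(cacheAll bool) *proxier {
	return &proxier{
		endpointsMap: map[string]endpoint{},
		installed:    set{},
		group:        set{},
		cacheAll:     cacheAll,
	}
}

// sameKeys reports whether two sets contain exactly the same members.
func sameKeys(a, b set) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if !b[k] {
			return false
		}
	}
	return true
}

func clone(s set) set {
	out := set{}
	for k := range s {
		out[k] = true
	}
	return out
}

// sync rebuilds the group only when the cache disagrees with what it
// would record after this round.
func (p *proxier) sync() {
	reachable, all := set{}, set{}
	for addr, e := range p.endpointsMap {
		all[addr] = true
		if e.serving { // endpoints failing the filter never become buckets
			reachable[addr] = true
		}
	}
	toCache := reachable
	if p.cacheAll {
		toCache = all
	}
	if sameKeys(p.installed, toCache) {
		return // cache says nothing changed: group is left untouched
	}
	p.group = clone(reachable)
	p.installed = clone(toCache)
}

// replay applies two updates (not serving, then serving) and reports
// whether the endpoint ends up as a bucket in the group.
func replay(cacheAll bool) bool {
	p := newProxier(cacheAll)
	const addr = "10.0.0.1:80" // hypothetical address
	p.endpointsMap[addr] = endpoint{serving: false}
	p.sync()
	p.endpointsMap[addr] = endpoint{serving: true}
	p.sync()
	return p.group[addr]
}

func main() {
	fmt.Println("cache all endpoints: bucket present =", replay(true))
	fmt.Println("cache installed endpoints only: bucket present =", replay(false))
}
```

In this model `replay(true)` returns false, because the second `sync` finds the Endpoint already cached and skips the group update, while `replay(false)` returns true.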
hongliangl added the kind/bug label Mar 8, 2023
hongliangl self-assigned this Mar 8, 2023
hongliangl added the area/proxy label Mar 8, 2023
hongliangl added a commit to hongliangl/antrea that referenced this issue Mar 8, 2023
The main purpose of this PR is to avoid potential inconsistencies between
the cached Endpoints and those installed in OVS, like antrea-io#4681, antrea-io#4692.

This PR also updates:

- Method UninstallEndpointFlows of ofClient, support deleting flows of
  multiple Endpoints.
- Remove possible groups when a Service is deleted.
- Log something when a group for a Service is not created.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
hongliangl added a commit to hongliangl/antrea that referenced this issue Mar 9, 2023
The main purpose of this PR is to avoid potential inconsistencies between
the cached Endpoints and those installed in OVS, like antrea-io#4681, antrea-io#4692.

This PR also updates:

- Method UninstallEndpointFlows of ofClient, support deleting flows of
  multiple Endpoints.
- Remove possible groups when a Service is deleted.
- Log something when a group for a Service is not created.
- Optimize and unify log.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
tnqn pushed a commit that referenced this issue Mar 16, 2023
The main purpose of this PR is to avoid potential inconsistencies between
the cached Endpoints and those installed in OVS, like #4681, #4692.

This PR also updates:

- Method UninstallEndpointFlows of ofClient, support deleting flows of
  multiple Endpoints.
- Remove possible groups when a Service is deleted.
- Log something when a group for a Service is not created.
- Optimize and unify log.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
jainpulkit22 pushed a commit to urharshitha/antrea that referenced this issue Apr 28, 2023
…io#4691)

The main purpose of this PR is to avoid potential inconsistencies between
the cached Endpoints and those installed in OVS, like antrea-io#4681, antrea-io#4692.

This PR also updates:

- Method UninstallEndpointFlows of ofClient, support deleting flows of
  multiple Endpoints.
- Remove possible groups when a Service is deleted.
- Log something when a group for a Service is not created.
- Optimize and unify log.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>