Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EndpointSlice Support #2696

Closed
andrewsykim opened this issue Jul 17, 2020 · 17 comments · Fixed by #5745
Closed

EndpointSlice Support #2696

andrewsykim opened this issue Jul 17, 2020 · 17 comments · Fixed by #5745
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@andrewsykim
Copy link
Contributor

EndpointSlice is now enabled by default in Kubernetes and offers some solid performance benefits for clusters with large Endpoint sets. It might be worthwhile to support consuming EndpointSlice, at least as optional.

@youngnick
Copy link
Member

Thanks for raising this @andrewsykim, totally agree that we should support consuming EndpointSlice in some fashion.

@youngnick youngnick added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 19, 2020
@zianke
Copy link
Contributor

zianke commented Jul 29, 2020

Hi Team. Agree that Contour should support EndpointSlice in the future. I just discussed with @stevesloka and would like to start working on an experimental implementation. Will send you an update once it's done.

@stevesloka
Copy link
Member

@zianke has a branch which implements endpoint slices (master...zianke:endpointslice) which is fantastic! 🎉

Before pushing up a PR to implement to better help shape the PR, when would we want to add support for EndpointSlices? It's still beta in Kubernetes v1.17, so we. can't fully move to it. Should probably add a feature flag to enable which would disable the current Endpoints implementation.

After we get this merged we'd also need to build out some E2E tests which mock up the behavior.

@youngnick
Copy link
Member

Great work on getting something done so quickly @zianke!

However, given that we're working on the Endpoint translator at the moment, this may be something that will need refactoring if we change anything about EndpointTranslator or ClusterLoadAssignment.

I'd also like to do our best not to add feature flags for enabling Kinds - in particular, EndpointSlice is designed so that if it's available, you can use either it or Endpoints, relatively transparently. I'd prefer to see detection of enabled Kinds used rather than a feature flag.

@dprotaso
Copy link

Hey following up what's the status of EndpointSlices? and does Contour currently support External Service Routing with EndpointSlices with the CNAME address type?

@youngnick
Copy link
Member

Thinking more about this, I think that we should confirm what K8s version EndpointSlice is GA in, and if that's less than our minimum supported (which it almost certainly is), we should move to using EndpointSlices exclusively (since Kube will ensure that changes are propagated as necessary back to Endpoints).

There are some things that wee may need to be careful about, since I believe that EndpointSlices can be used for other stuff (like External Service Routing and locality routing), so we will also need to validate if we need to do anything special with the resource, or if we can just migrate the simple logic we have now to the new resource.

But I agree we should look at this one soon. @xaleeks, what do you think?

@dprotaso
Copy link

I think that we should confirm what K8s version EndpointSlice is GA

Beta in 1.17, GA in 1.21

GA doesn't necessarily mean all things are supported since it depends on the DNS provider for the cluster. I think it just means the API is stable.

For example - CoreDNS EndpointSlice support was added 1.8.4 which is bundled with K8s 1.22 (via kubeadm).

Note you may need to watch a subset of Endpoints objects that aren't mirrored as EndpointSlices - details here: https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#endpointslice-mirroring

@xaleeks
Copy link

xaleeks commented Aug 31, 2021

@youngnick Yup, we should support this. However, EndpointSlice started in k8s 1.17 it seems. Do we need to add a check that the version of K8s running supports it before Contour is deployed? I guess all the supported versions of Contour today would necessitate a version of k8s that supports it so we should be in the clear.

Tagging it for 1.20 for now to start investigating but we can also slot this in the 1.21

@xaleeks
Copy link

xaleeks commented Aug 31, 2021

Also according to https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/, it allows for setting a "topology.kubernetes.io/zone" key. Which opens us to this https://kubernetes.io/docs/concepts/services-networking/service-topology/

So a client can hit his preferred node specified in the topologyKeys for his service when using Ingress . I’m not sure if that means this ServiceTopology feature is unsupported unless we support EndpointSlice. Interesting opportunity for anyone looking to utilize this.

@youngnick
Copy link
Member

If we are already mandating a newer version of Kubernetes than 1.17 (which I think we are), then we just need to call out when we start doing EndpointSlices that 1.17 (or whatever version) is now required.

The zone topology is an interesting idea, yeah.

@howardjohn
Copy link

topologyKeys is deprecated fwiw. replaced by https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/

@youngnick youngnick added this to the 1.20.0 milestone Nov 2, 2021
@skriss skriss modified the milestones: 1.20.0, 1.21.0 Jan 4, 2022
@skriss skriss modified the milestones: 1.21.0, 1.22.0 May 3, 2022
@skriss skriss modified the milestones: 1.22.0, 1.23.0 Jul 21, 2022
@sunjayBhatia sunjayBhatia added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Sep 26, 2022
@arjunsalyan
Copy link
Contributor

I can take this up. I was also able to write an EndpointSliceTranslator, however I am a bit confused about the part where we observe the k8s resources (endpoints etc.). The name of EndpointSlice has an additional suffix and I am not really sure if that will come into picture while reading the slice. If someone could help me with any dev document or simply where to look in the code for reading the k8s resources - that would be helpful.

@dprotaso
Copy link

Related Gateway Issue hopes to exercise different Service/Endpoint[Slice] variants - kubernetes-sigs/gateway-api#1718

@dprotaso
Copy link

Also related EndpointSlices/FQDN could be a replacement for Service type=ExternalName

#3950

@orenwolf
Copy link

Confirming that this appears to create a hard cap of 1000 replicas that contour can track for a given service, as non-sliced endpoints are limited to 1000 members.

@Quentin-M
Copy link

Quentin-M commented Jun 23, 2023

This would also allow usage of Topology Aware Routing, which was added back in Kubernetes as EndpointSlice hints since 1.23.

@rajatvig rajatvig self-assigned this Jun 23, 2023
@rajatvig
Copy link
Member

Will plan to work on adding support for this.

@skriss skriss assigned clayton-gonsalves and unassigned rajatvig Aug 28, 2023
@skriss skriss added this to the 1.27.0 milestone Aug 28, 2023
skriss pushed a commit that referenced this issue Sep 15, 2023
Updates #2696.

Signed-off-by: Clayton Gonsalves <clayton.gonsalves@reddit.com>
skriss pushed a commit that referenced this issue Oct 24, 2023
Adds support for EndpointSlices, gated by the
feature gate "useEndpointSlices".

Fixes #2696.

Signed-off-by: Clayton Gonsalves <clayton.gonsalves@reddit.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.