Zone aware routing #2859

Open
dsvetl opened this issue Aug 31, 2020 · 6 comments
Labels
blocked/needs-design Categorizes the issue or PR as blocked because it needs a design document. blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature.

Comments

@dsvetl

dsvetl commented Aug 31, 2020

Hi, projectcontour,

Contour should be able to use this Envoy functionality:

https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/zone_aware#arch-overview-load-balancing-zone-aware-routing

Ideally, Contour would read each pod's region/zone and pass that information to Envoy through the xDS API.

@stevesloka
Member

I think this would be super interesting @dsvetl! The tricky part is working out what counts as a "zone". I know there are labels that get applied to nodes. So a quick thought would be to work out which endpoint is running on which node, then feed that information back into Envoy so it can make decisions on where to route to next.
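As a rough illustration of the node-label idea, here is a minimal sketch in Go. It assumes the zone comes from the well-known `topology.kubernetes.io/zone` node label; the `Node` and `Endpoint` types and the `zonesByEndpoint` helper are hypothetical, simplified stand-ins and not Contour or Kubernetes client types.

```go
package main

import "fmt"

// zoneLabel is the well-known Kubernetes node label that carries the zone.
const zoneLabel = "topology.kubernetes.io/zone"

// Node is a simplified, hypothetical stand-in for a Kubernetes Node.
type Node struct {
	Name   string
	Labels map[string]string
}

// Endpoint is a simplified, hypothetical stand-in for a service endpoint.
type Endpoint struct {
	IP       string
	NodeName string
}

// zonesByEndpoint maps each endpoint IP to the zone of the node it runs on.
// Endpoints on unknown or unlabeled nodes are left out of the result.
func zonesByEndpoint(nodes []Node, eps []Endpoint) map[string]string {
	nodeZone := make(map[string]string, len(nodes))
	for _, n := range nodes {
		if z, ok := n.Labels[zoneLabel]; ok {
			nodeZone[n.Name] = z
		}
	}
	out := make(map[string]string, len(eps))
	for _, ep := range eps {
		if z, ok := nodeZone[ep.NodeName]; ok {
			out[ep.IP] = z
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{Name: "node-a", Labels: map[string]string{zoneLabel: "zone-1"}},
		{Name: "node-b", Labels: map[string]string{zoneLabel: "zone-2"}},
	}
	eps := []Endpoint{
		{IP: "10.0.0.1", NodeName: "node-a"},
		{IP: "10.0.0.2", NodeName: "node-b"},
		{IP: "10.0.0.3", NodeName: "node-c"}, // node not known, so skipped
	}
	fmt.Println(zonesByEndpoint(nodes, eps))
}
```

The resulting IP-to-zone map is the kind of data that would then have to be attached to the endpoints sent to Envoy.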

We'd probably need to come up with a design doc to outline how this might work, how it might integrate into Contour, and how users would configure it.

@stevesloka stevesloka added the kind/question Categorizes an issue as a user question. label Aug 31, 2020
@jpeach
Contributor

jpeach commented Aug 31, 2020

xref kubernetes/enhancements#536

@jpeach
Contributor

jpeach commented Sep 8, 2020

@dsvetl This is a pretty open-ended project, and we are not likely to tackle it in the immediate future. If you have a proposal for how zones should be defined and handled in Contour+Envoy, we can help you design and implement that.

@jpeach jpeach added blocked/needs-info Categorizes the issue or PR as blocked because there is insufficient information to advance it. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Sep 8, 2020
@nefelim4ag

nefelim4ag commented Aug 26, 2021

I think we can take a simple K8s approach first:

  • Set the local zone for each Envoy from its environment: just pass through some label from the host (this looks useful as a general approach).
  • Contour must use the same label on each endpoint to discover its zone.

That looks simple and is enough to cover most use cases, and it makes Envoy at least aware of the zone it is running in.

We ignore complex things like Region/Zone hierarchies and simply use one label to group endpoints and Envoy hosts.

After that, we can add the ability to change the minimum host count in the zone, because not every application really needs to run 6 instances per zone (for example, most of ours have 10-20 pods in total in an RS).

For testing, both of these could be implemented as ConfigMap keys for the Contour controller, like:
zone_label: ...
min_cluster_size: ...
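To make the two keys concrete, here is a minimal sketch in Go of how they might be consumed. The `ZoneConfig` and `Host` types and both helpers are hypothetical. The sketch also assumes that `min_cluster_size` is compared against the total number of hosts in the cluster, which is how Envoy's own zone-aware `min_cluster_size` setting (default 6) behaves; a per-zone threshold would need a different check.

```go
package main

import "fmt"

// ZoneConfig mirrors the two proposed keys (hypothetical type).
type ZoneConfig struct {
	ZoneLabel      string // zone_label: label used to group hosts
	MinClusterSize int    // min_cluster_size: assumed to be a total host count
}

// Host is a simplified, hypothetical stand-in for an endpoint or Envoy host.
type Host struct {
	Address string
	Labels  map[string]string
}

// groupByZone groups host addresses by the value of the configured label.
// Hosts without the label end up under the empty zone "".
func groupByZone(cfg ZoneConfig, hosts []Host) map[string][]string {
	groups := make(map[string][]string)
	for _, h := range hosts {
		zone := h.Labels[cfg.ZoneLabel]
		groups[zone] = append(groups[zone], h.Address)
	}
	return groups
}

// zoneRoutingEnabled reports whether the cluster is large enough for
// zone-aware routing under the assumption stated above.
func zoneRoutingEnabled(cfg ZoneConfig, hosts []Host) bool {
	return len(hosts) >= cfg.MinClusterSize
}

func main() {
	cfg := ZoneConfig{ZoneLabel: "topology.kubernetes.io/zone", MinClusterSize: 3}
	hosts := []Host{
		{Address: "10.0.0.1", Labels: map[string]string{cfg.ZoneLabel: "zone-1"}},
		{Address: "10.0.0.2", Labels: map[string]string{cfg.ZoneLabel: "zone-1"}},
		{Address: "10.0.0.3", Labels: map[string]string{cfg.ZoneLabel: "zone-2"}},
	}
	fmt.Println(groupByZone(cfg, hosts))
	fmt.Println(zoneRoutingEnabled(cfg, hosts))
}
```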


Another approach is to use priorities to achieve the same thing, but that requires generating xDS config per Envoy: Contour must know which zone the requesting Envoy runs in so that it can reorder priorities based on zone locality. That is the trickier part (and AFAIK that is how it works in Istio):

https://github.com/istio/istio/blob/master/pilot/pkg/networking/core/v1alpha3/loadbalancer/loadbalancer.go#L146

@youngnick
Member

Sorry for taking so long to respond here. I had been hoping to look at some of the newer upstream changes (like NetworkTopology and similar things) before responding, but I have not had the bandwidth yet.

I think the rough approach outlined by @nefelim4ag sounds reasonable at first glance, but this one will definitely need a design document that runs through:

  • What are we trying to achieve?
  • What options are available for keeping track of zones inside Kubernetes, and what is coming?
  • What can Envoy do about locality?

@skriss skriss added blocked/needs-design Categorizes the issue or PR as blocked because it needs a design document. kind/feature Categorizes issue or PR as related to a new feature. and removed kind/question Categorizes an issue as a user question. labels Feb 1, 2022
@nefelim4ag

Some time has passed, and we now have some ready-made tools to deal with this.

I have already spent some time writing an in-house sidecar for zone routing with Envoy (based on Priority).
Cases where the Contour controllers run separately from the Envoy nodes are not as straightforward to implement as the sidecar case.

I propose the following design (I can open a PR on request):

Design proposal

Status: Draft

zone-aware-routing-design.md.

Abstract

Zone-aware routing is a GA feature of Kubernetes that would be good to support.

Background

This feature was requested in #2859.
Contour is heavily used in cloud deployments, where nodes & pods can be distributed across zones.
Traffic across zones incurs latency and data charges.
Consequently, distributing load with zone-aware traffic routing should reduce both latency and total cost of ownership (TCO).

Goals

  • Make Contour aware of EndpointSlices.
  • Make Envoy aware of them.
  • Make it useful by spreading network load more locally.

Non Goals

  • Support all possible combinations and fields (Regions, sub-zones, nodes)
  • Make it highly configurable by API objects.

High-Level Design

  • The current Endpoint Translator will be extended to support EndpointSlices.
  • The Contour controller will track the EndpointSlices of the configured Envoy service.

Detailed Design

  • Add an EndpointSlices cache, watcher, and processor, as was done for Endpoints.
  • Return N LocalityLbEndpoints, one per locality, instead of a single one.
  • Enrich the LocalityLbEndpoints of each ClusterLoadAssignment with locality.zone data from EndpointSlices.
  • Create a cache of the Envoy EndpointSlices with an IP-to-zone mapping (the same zones exist in the served services).
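The "one LocalityLbEndpoints per locality" step can be sketched roughly as follows. This is only an illustration: `SliceEndpoint` and `LocalityGroup` are hypothetical, simplified stand-ins for an EndpointSlice endpoint (which carries an optional zone) and an Envoy LocalityLbEndpoints, and `groupByLocality` is not an existing Contour function.

```go
package main

import (
	"fmt"
	"sort"
)

// SliceEndpoint is a simplified stand-in for one endpoint of an EndpointSlice.
type SliceEndpoint struct {
	Addresses []string
	Zone      string // empty when the slice carries no zone for this endpoint
}

// LocalityGroup is a simplified stand-in for one Envoy LocalityLbEndpoints.
type LocalityGroup struct {
	Zone      string
	Addresses []string
}

// groupByLocality returns one group per distinct zone. Groups and addresses
// are sorted so that repeated runs produce the same output.
func groupByLocality(eps []SliceEndpoint) []LocalityGroup {
	byZone := make(map[string][]string)
	for _, ep := range eps {
		byZone[ep.Zone] = append(byZone[ep.Zone], ep.Addresses...)
	}
	zones := make([]string, 0, len(byZone))
	for z := range byZone {
		zones = append(zones, z)
	}
	sort.Strings(zones)

	groups := make([]LocalityGroup, 0, len(zones))
	for _, z := range zones {
		addrs := byZone[z]
		sort.Strings(addrs)
		groups = append(groups, LocalityGroup{Zone: z, Addresses: addrs})
	}
	return groups
}

func main() {
	eps := []SliceEndpoint{
		{Addresses: []string{"10.0.1.5"}, Zone: "zone-b"},
		{Addresses: []string{"10.0.0.7"}, Zone: "zone-a"},
		{Addresses: []string{"10.0.1.2"}, Zone: "zone-b"},
	}
	for _, g := range groupByLocality(eps) {
		fmt.Printf("zone=%q addresses=%v\n", g.Zone, g.Addresses)
	}
}
```

Sorting is a design choice of the sketch, not something the proposal requires; stable ordering tends to make cached xDS responses easier to compare.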

<TODO: How to inform envoy about such data>

  • Return EDS LocalityLbEndpoints based on the xDS client IP?

<TODO: Choose between Priority and LocalityWeightedLbConfig>

In the case of LocalityWeightedLbConfig:

  • Add new LB method to CRD
  • Enrich LocalityLbEndpoints with weight

In the case of Priority:

  • Enrich LocalityLbEndpoints with priority: 0 for the local zone, 1 for others.
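A minimal sketch of the Priority variant in Go, combining it with the xDS-client-IP lookup from the TODO above. Everything here is an assumption made for illustration: the `Locality` type, the `prioritize` helper, and the fallback of keeping every locality at priority 0 when the client's zone is unknown or has no local endpoints. The fallback is there because Envoy expects priorities to start at 0 without gaps.

```go
package main

import "fmt"

// Locality is a simplified, hypothetical stand-in for LocalityLbEndpoints.
type Locality struct {
	Zone     string
	Priority uint32
}

// prioritize returns a copy of localities with Priority 0 for the zone of the
// requesting Envoy (looked up by its IP in envoyZones) and 1 for all others.
// If the client's zone is unknown, or no locality is in that zone, every
// locality stays at priority 0 so that priority 0 is never empty.
func prioritize(envoyZones map[string]string, clientIP string, localities []Locality) []Locality {
	localZone, known := envoyZones[clientIP]

	hasLocal := false
	for _, l := range localities {
		if known && l.Zone == localZone {
			hasLocal = true
			break
		}
	}

	out := make([]Locality, len(localities))
	for i, l := range localities {
		l.Priority = 0
		if hasLocal && l.Zone != localZone {
			l.Priority = 1
		}
		out[i] = l
	}
	return out
}

func main() {
	envoyZones := map[string]string{"10.1.0.1": "zone-a", "10.1.0.2": "zone-b"}
	localities := []Locality{{Zone: "zone-a"}, {Zone: "zone-b"}}

	for _, l := range prioritize(envoyZones, "10.1.0.2", localities) {
		fmt.Printf("zone=%s priority=%d\n", l.Zone, l.Priority)
	}
}
```

Because the result depends on which Envoy is asking, this variant implies a per-client EDS response rather than one shared response for all Envoys.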

Alternatives Considered

Make Envoy itself zone-aware.
There is no information on how to implement this properly.
It also looks like an East-West solution rather than a North-South one.

Security Considerations

N/A

Compatibility

Kubernetes 1.21+

Implementation


6 participants