diff --git a/keps/sig-network/2433-topology-aware-hints/README.md b/keps/sig-network/2433-topology-aware-hints/README.md
index aacc335a9b01..6e210b42a825 100644
--- a/keps/sig-network/2433-topology-aware-hints/README.md
+++ b/keps/sig-network/2433-topology-aware-hints/README.md
@@ -136,11 +136,11 @@ create an EndpointSlice with endpoints that look like this:
   zone: "zone-a"
   hints:
     zone: "zone-a"
-- addresses: ["10.1.2.3"]
+- addresses: ["10.1.2.4"]
   zone: "zone-b"
   hints:
     zone: "zone-b"
-- addresses: ["10.1.2.3"]
+- addresses: ["10.1.2.5"]
   zone: "zone-a"
   hints:
     zone: "zone-c"
@@ -151,7 +151,8 @@ hints help ensure that each zone will have a single endpoint to consume by
 adding a hint to the third endpoint that it should be consumed by "zone-c".
 
 This functionality will be enabled by a `TopologyAwareHints` feature gate along
-with a Service annotation.
+with the `trafficPolicy` field on Service that will be added as part of KEP
+2086.
 
 ### Risks and Mitigations
 
@@ -196,29 +197,27 @@ running in.
 
 ### Configuration
 
-A new `service.kubernetes.io/topology-aware-routing` annotation will be
-supported on Services. This will have 2 potential values:
+The new Service `trafficPolicy` field will be expanded to support a new value:
 
-- `Auto`: When there are a sufficient number of endpoints for the Service, the
-  EndpointSlice controller will add topology hints for each endpoint that will
-  ensure a proportional amounts are available to each zone in a cluster.
-- `Disabled`: EndpointSlice hints will not be set for this feature.
+- `PreferZone`: When there are a sufficient number of endpoints for the Service,
+  the EndpointSlice controller will add topology hints for each endpoint that
+  will ensure a proportional amounts are available to each zone in a cluster.
 
-When this annotation is either unspecified or not set to `Auto`, the `Disabled`
-behavior will be default. A future KEP will explore changing the default
-behavior from `Disabled` to `Auto` with a new feature gate.
+A future KEP will explore changing the default value of this field to
+`PreferZone`.
 
 #### Interoperability
 
-If both this annotation and the deprecated `topologyKeys` field are set on a
-Service, `topologyKeys` will be given precedence and this annotation will be
-ignored. This will be true until the `topologyKeys` field is removed in the
-future.
+Validation will ensure that `trafficPolicy` can not be set to `PreferZone` when
+the deprecated `topologyKeys` field is also set. This will be true until the
+`topologyKeys` field is removed in the future.
 
 #### Feature Gate
 
-This functionality will be guarded by the `TopologyAwareHints` feature
-gate. This gate should not be enabled if the `ServiceTopology` gate is enabled.
+This functionality will be guarded by the `TopologyAwareHints` feature gate.
+This gate will be dependent on the `ServiceInternalTrafficPolicy` feature gate
+since it uses the `TrafficPolicy` guarded by that gate. This gate should not be
+enabled if the deprecated `ServiceTopology` gate is enabled.
 
 ### API
 
@@ -279,8 +278,7 @@ conditions are true:
 
 - Kube-Proxy is able to determine the zone it is running within (likely based
   on node labels).
-- The `service.kubernetes.io/topology-aware-routing` annotation is set to
-  `Auto` for the Service.
+- The `trafficPolicy` field is set to `PreferZone` for the Service.
 - At least one endpoint for the Service has a hint pointing to the zone
   Kube-Proxy is running within.
 - All endpoints for the Service have zone hints.
@@ -296,11 +294,10 @@ had not yet propagated to all of them.
 
 ### EndpointSlice Controller
 
-When the `TopologyAwareHints` feature gate is enabled and the
-`service.kubernetes.io/topology-aware-routing` annotation is set to `Auto`
-for a Service, the EndpointSlice controller will add hints to EndpointSlices.
-These hints will indicate where an endpoint should be consumed by proxy
-implementations to enable topology aware routing.
+When the `TopologyAwareHints` feature gate is enabled and the `trafficPolicy`
+field is set to `PreferZone` for a Service, the EndpointSlice controller will
+add hints to EndpointSlices. These hints will indicate where an endpoint should
+be consumed by proxy implementations to enable topology aware routing.
 
 The EndpointSlice controller will determine how many endpoints should be
 available for each zone based on the proportion of CPU cores in each zone. If
@@ -366,7 +363,7 @@ In the future we may expand this functionality if needed. This could include:
 
 - A new `RequireZone` algorithm that would keep endpoints in EndpointSlices for
   the same zone they are in.
-- A new option to specify a minimum threshold for the `Auto` approach.
+- A new option to specify a minimum threshold for the `PreferZone` approach.
 - Support for region based hints.
 
 ### Test Plan
@@ -374,13 +371,13 @@ In the future we may expand this functionality if needed. This could include:
 #### Controller Unit Tests
 | Test Description | Expected Result |
 | :--- | :--- |
-| Feature Gate On, Annotation == 'Auto', 2+ zones | Hints set |
-| Feature Gate On, Annotation == 'Auto', 1 zone | No hints set |
-| Feature Gate On, Annotation == 'On', 2+ zones | No hints |
-| Feature Gate On, Annotation Off, 2+ zones | No hints |
-| Feature Gate Off, Annotation On, 2+ zones | No hints |
-| Feature Gate Off, Annotation Off, 2+ zones | No hints |
-| Feature Gate Off, Annotation Off, 2+ zones | No hints |
+| Feature Gate On, TrafficPolicy == 'PreferZone', 2+ zones | Hints set |
+| Feature Gate On, TrafficPolicy == 'PreferZone', 1 zone | No hints set |
+| Feature Gate On, TrafficPolicy == 'Local', 2+ zones | No hints |
+| Feature Gate On, TrafficPolicy Unset, 2+ zones | No hints |
+| Feature Gate Off, TrafficPolicy == 'PreferZone', 2+ zones | No hints |
+| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
+| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
 | 2 endpoints, 3 zones | No hints |
 | 3 endpoints, 3 zones | Hints set |
 | 4 endpoints, 3 zones | No hints |
@@ -397,10 +394,10 @@ In the future we may expand this functionality if needed. This could include:
 #### Kube-Proxy Unit Tests
 | Test Description | Expected Result |
 | :--- | :--- |
-| Feature Gate On, Annotation == 'Auto', hints matching zone | Endpoints filtered |
-| Feature Gate On, Annotation == 'On', hints matching zone | Endpoints not filtered |
-| Feature Gate Off, Annotation == 'Auto', hints matching zone | Endpoints not filtered |
-| Feature Gate On, Annotation == 'Auto', no hints matching zone | Endpoints not filtered |
+| Feature Gate On, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints filtered |
+| Feature Gate On, TrafficPolicy == 'Local', hints matching zone | Endpoints not filtered |
+| Feature Gate Off, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints not filtered |
+| Feature Gate On, TrafficPolicy == 'PreferZone', no hints matching zone | Endpoints not filtered |
 
 ### Observability
 We can reuse some of the metrics of EndpointSlice Controller that we already
@@ -408,23 +405,12 @@ have in the current version to observe the changes of endpoints (addition,
 deletion and update). Meanwhile we can add more metrics to have a glimpse of
 different approaches.
 
-- `endpoint_slice_controller/endpoint_overload_by_zone`
 - `endpoint_slice_controller/endpointslices_changed_per_sync`
 - `endpoint_slice_controller/syncs`
 
 ```
 const SubSystem = "endpoint_slice_controller"
 
-// This metric observes overload per zone for endpoints in EndpointSlices
-EPSOverloadByZone = metrics.NewHistogramVec(
-    &metrics.HistogramOpts{
-        Subsystem: Subsystem,
-        Name: "endpoint_overload_by_zone",
-        Help: "Overload for endpoints by zone on each Service sync",
-    },
-    []string{"zone"}, // zone name
-)
-
 // This metric observes churn of EndpointSlices per sync
 EPSChangedPerSync = metrics.NewHistogramVec(
     &metrics.HistogramOpts{
@@ -463,7 +449,7 @@ Thus there could be two potential version skew scenarios:
   of the new controller functionality.
 
 Each scenario described above will end up behaving as if this feature is not
-enabled even if the Service annotation has been set.
+enabled even if the `trafficPolicy` has been set on Service.
 
 ## Production Readiness Review Questionnaire
 
@@ -482,7 +468,7 @@ enabled even if the Service annotation has been set.
 * **Can the feature be disabled once it has been enabled (i.e. can we roll back
   the enablement)?**
   Yes. It can easily be disabled universally by turning off the feature gate or
-  on a specific Service by setting the annotation to `Disable`.
+  setting the `trafficPolicy` field to some other value for a Service.
 
 * **What happens if we reenable the feature if it was previously rolled back?**
   EndpointSlices hints will be added again resulting in changes to existing
diff --git a/keps/sig-network/2433-topology-aware-hints/kep.yaml b/keps/sig-network/2433-topology-aware-hints/kep.yaml
index 987f1120ad93..93845ee6ab6f 100644
--- a/keps/sig-network/2433-topology-aware-hints/kep.yaml
+++ b/keps/sig-network/2433-topology-aware-hints/kep.yaml
@@ -20,6 +20,7 @@ see-also:
   - "https://docs.google.com/document/d/1ZzUoFY1SrdjVefl7gVOJZJLt1I1LHttw8pcX95nlgMY/edit?usp=sharing"
   - "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2004-topology-aware-subsetting"
   - "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2030-topology-aware-proxying"
+  - "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2086-service-internal-traffic-policy"
 replaces:
   - "github.com/kubernetes/enhancements/tree/master/keps/sig-network/536-topology-aware-routing"