Transitioning to use of TrafficPolicy field
robscott committed Feb 8, 2021
1 parent 1908b5e commit dbca8e5
Showing 2 changed files with 37 additions and 50 deletions.
86 changes: 36 additions & 50 deletions keps/sig-network/2433-topology-aware-hints/README.md
@@ -136,11 +136,11 @@ create an EndpointSlice with endpoints that look like this:
zone: "zone-a"
hints:
zone: "zone-a"
- addresses: ["10.1.2.3"]
- addresses: ["10.1.2.4"]
zone: "zone-b"
hints:
zone: "zone-b"
- addresses: ["10.1.2.3"]
- addresses: ["10.1.2.5"]
zone: "zone-a"
hints:
zone: "zone-c"
@@ -151,7 +151,8 @@ hints help ensure that each zone will have a single endpoint to consume by
adding a hint to the third endpoint that it should be consumed by "zone-c".

This functionality will be enabled by a `TopologyAwareHints` feature gate along
with a Service annotation.
with the `trafficPolicy` field on Service that will be added as part of KEP
2086.

### Risks and Mitigations

@@ -196,29 +197,27 @@ running in.

### Configuration

A new `service.kubernetes.io/topology-aware-routing` annotation will be
supported on Services. This will have 2 potential values:
The new Service `trafficPolicy` field will be expanded to support a new value:

- `Auto`: When the Service has a sufficient number of endpoints, the
EndpointSlice controller will add topology hints to each endpoint so that a
proportional number of endpoints is available to each zone in the cluster.
- `Disabled`: EndpointSlice hints will not be set for this feature.
- `PreferZone`: When the Service has a sufficient number of endpoints, the
EndpointSlice controller will add topology hints to each endpoint so that a
proportional number of endpoints is available to each zone in the cluster.

When this annotation is either unspecified or not set to `Auto`, the `Disabled`
behavior will be default. A future KEP will explore changing the default
behavior from `Disabled` to `Auto` with a new feature gate.
A future KEP will explore changing the default value of this field to
`PreferZone`.

#### Interoperability

If both this annotation and the deprecated `topologyKeys` field are set on a
Service, `topologyKeys` will be given precedence and this annotation will be
ignored. This will be true until the `topologyKeys` field is removed in the
future.
Validation will ensure that `trafficPolicy` cannot be set to `PreferZone` when
the deprecated `topologyKeys` field is also set. This will be true until the
`topologyKeys` field is removed in the future.
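
The validation rule above can be sketched roughly as follows. This is only an illustration: `serviceSpec` and `validateTrafficPolicy` are hypothetical names, not the real Kubernetes API types or validation code, and the struct carries only the two fields the rule mentions.

```go
package main

import (
	"errors"
	"fmt"
)

// serviceSpec is a hypothetical, trimmed-down stand-in for a Service spec.
// It holds only the two fields that the validation rule refers to.
type serviceSpec struct {
	TrafficPolicy string
	TopologyKeys  []string
}

// errPreferZoneWithTopologyKeys is returned when both mechanisms are requested.
var errPreferZoneWithTopologyKeys = errors.New(
	"trafficPolicy cannot be PreferZone when topologyKeys is set")

// validateTrafficPolicy rejects a spec that sets trafficPolicy to PreferZone
// while the deprecated topologyKeys field is also populated.
func validateTrafficPolicy(spec serviceSpec) error {
	if spec.TrafficPolicy == "PreferZone" && len(spec.TopologyKeys) > 0 {
		return errPreferZoneWithTopologyKeys
	}
	return nil
}

func main() {
	conflicting := serviceSpec{
		TrafficPolicy: "PreferZone",
		TopologyKeys:  []string{"topology.kubernetes.io/zone"},
	}
	fmt.Println("conflicting spec:", validateTrafficPolicy(conflicting))
	fmt.Println("PreferZone only:", validateTrafficPolicy(serviceSpec{TrafficPolicy: "PreferZone"}))
}
```

Rejecting the combination at validation time, rather than silently preferring one field, means a Service can never be stored in a state where the two mechanisms disagree.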

#### Feature Gate

This functionality will be guarded by the `TopologyAwareHints` feature
gate. This gate should not be enabled if the `ServiceTopology` gate is enabled.
This functionality will be guarded by the `TopologyAwareHints` feature gate.
This gate will depend on the `ServiceInternalTrafficPolicy` feature gate,
since it uses the `trafficPolicy` field guarded by that gate. This gate should
not be enabled if the deprecated `ServiceTopology` gate is enabled.
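
The gate relationships described above could be checked along these lines. This is a sketch under assumptions: the KEP text does not say whether or where such a check is enforced, and `checkGates` together with its plain `map[string]bool` input is hypothetical rather than the real feature gate machinery.

```go
package main

import (
	"errors"
	"fmt"
)

// checkGates reports a problem when TopologyAwareHints is enabled without the
// gate it depends on, or alongside the deprecated ServiceTopology gate.
// The map of gate names to enabled state is a simplification for illustration.
func checkGates(enabled map[string]bool) error {
	if !enabled["TopologyAwareHints"] {
		return nil // nothing to check when the feature itself is off
	}
	if !enabled["ServiceInternalTrafficPolicy"] {
		return errors.New("TopologyAwareHints requires ServiceInternalTrafficPolicy")
	}
	if enabled["ServiceTopology"] {
		return errors.New("TopologyAwareHints should not be enabled together with ServiceTopology")
	}
	return nil
}

func main() {
	fmt.Println(checkGates(map[string]bool{
		"TopologyAwareHints":           true,
		"ServiceInternalTrafficPolicy": true,
	}))
	fmt.Println(checkGates(map[string]bool{
		"TopologyAwareHints": true,
	}))
}
```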

### API

@@ -279,8 +278,7 @@ conditions are true:

- Kube-Proxy is able to determine the zone it is running within (likely based
on node labels).
- The `service.kubernetes.io/topology-aware-routing` annotation is set to
`Auto` for the Service.
- The `trafficPolicy` field is set to `PreferZone` for the Service.
- At least one endpoint for the Service has a hint pointing to the zone
Kube-Proxy is running within.
- All endpoints for the Service have zone hints.
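
A minimal sketch of how a proxy might apply the conditions visible in this hunk. It assumes a simplified `endpoint` type with a single zone hint, which is hypothetical; the real EndpointSlice API and kube-proxy code differ, and any conditions outside this hunk are not modeled. When a condition fails, the sketch falls back to the full endpoint set.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified endpoint with at most one zone hint.
type endpoint struct {
	Address  string
	ZoneHint string // empty when the endpoint carries no hint
}

// filterEndpoints returns only the endpoints hinted for nodeZone when every
// condition holds; otherwise it returns the full, unfiltered list.
func filterEndpoints(nodeZone, trafficPolicy string, eps []endpoint) []endpoint {
	// The proxy must know its own zone and the Service must opt in.
	if nodeZone == "" || trafficPolicy != "PreferZone" {
		return eps
	}
	var matched []endpoint
	for _, ep := range eps {
		// All endpoints must have zone hints, or no filtering happens.
		if ep.ZoneHint == "" {
			return eps
		}
		if ep.ZoneHint == nodeZone {
			matched = append(matched, ep)
		}
	}
	// At least one endpoint must be hinted for this proxy's zone.
	if len(matched) == 0 {
		return eps
	}
	return matched
}

func main() {
	eps := []endpoint{
		{Address: "10.1.2.3", ZoneHint: "zone-a"},
		{Address: "10.1.2.4", ZoneHint: "zone-b"},
		{Address: "10.1.2.5", ZoneHint: "zone-c"},
	}
	fmt.Println(filterEndpoints("zone-c", "PreferZone", eps))
	fmt.Println(filterEndpoints("zone-d", "PreferZone", eps))
}
```

Falling back to all endpoints keeps traffic flowing when hints are missing or incomplete, at the cost of losing zone locality for that Service.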
@@ -296,11 +294,10 @@ had not yet propagated to all of them.

### EndpointSlice Controller

When the `TopologyAwareHints` feature gate is enabled and the
`service.kubernetes.io/topology-aware-routing` annotation is set to `Auto`
for a Service, the EndpointSlice controller will add hints to EndpointSlices.
These hints will indicate where an endpoint should be consumed by proxy
implementations to enable topology aware routing.
When the `TopologyAwareHints` feature gate is enabled and the `trafficPolicy`
field is set to `PreferZone` for a Service, the EndpointSlice controller will
add hints to EndpointSlices. These hints will indicate where an endpoint should
be consumed by proxy implementations to enable topology aware routing.
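
As a rough illustration of the gating described above, the decision to add hints can be written as a single predicate. `shouldAddHints` is a hypothetical helper, not controller code. The requirement of at least two zones is taken from the "1 zone" row of the controller test plan, and the endpoint-count thresholds are deliberately left out.

```go
package main

import "fmt"

// shouldAddHints reports whether the controller would consider adding hints.
// It models only the feature gate, the trafficPolicy value, and the zone
// count; endpoint-count requirements are intentionally not modeled here.
func shouldAddHints(gateEnabled bool, trafficPolicy string, zoneCount int) bool {
	return gateEnabled && trafficPolicy == "PreferZone" && zoneCount >= 2
}

func main() {
	cases := []struct {
		gate   bool
		policy string
		zones  int
	}{
		{true, "PreferZone", 3},
		{true, "PreferZone", 1},
		{true, "Local", 3},
		{false, "PreferZone", 3},
	}
	for _, c := range cases {
		fmt.Printf("gate=%v policy=%q zones=%d -> %v\n",
			c.gate, c.policy, c.zones, shouldAddHints(c.gate, c.policy, c.zones))
	}
}
```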

The EndpointSlice controller will determine how many endpoints should be
available for each zone based on the proportion of CPU cores in each zone. If
@@ -366,21 +363,21 @@ In the future we may expand this functionality if needed. This could include:

- A new `RequireZone` algorithm that would keep endpoints in EndpointSlices for
the same zone they are in.
- A new option to specify a minimum threshold for the `Auto` approach.
- A new option to specify a minimum threshold for the `PreferZone` approach.
- Support for region based hints.

### Test Plan

#### Controller Unit Tests
| Test Description | Expected Result |
| :--- | :--- |
| Feature Gate On, Annotation == 'Auto', 2+ zones | Hints set |
| Feature Gate On, Annotation == 'Auto', 1 zone | No hints set |
| Feature Gate On, Annotation == 'On', 2+ zones | No hints |
| Feature Gate On, Annotation Off, 2+ zones | No hints |
| Feature Gate Off, Annotation On, 2+ zones | No hints |
| Feature Gate Off, Annotation Off, 2+ zones | No hints |
| Feature Gate Off, Annotation Off, 2+ zones | No hints |
| Feature Gate On, TrafficPolicy == 'PreferZone', 2+ zones | Hints set |
| Feature Gate On, TrafficPolicy == 'PreferZone', 1 zone | No hints set |
| Feature Gate On, TrafficPolicy == 'Local', 2+ zones | No hints |
| Feature Gate On, TrafficPolicy Unset, 2+ zones | No hints |
| Feature Gate Off, TrafficPolicy == 'PreferZone', 2+ zones | No hints |
| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
| Feature Gate Off, TrafficPolicy Unset, 2+ zones | No hints |
| 2 endpoints, 3 zones | No hints |
| 3 endpoints, 3 zones | Hints set |
| 4 endpoints, 3 zones | No hints |
@@ -397,34 +394,23 @@ In the future we may expand this functionality if needed. This could include:
#### Kube-Proxy Unit Tests
| Test Description | Expected Result |
| :--- | :--- |
| Feature Gate On, Annotation == 'Auto', hints matching zone | Endpoints filtered |
| Feature Gate On, Annotation == 'On', hints matching zone | Endpoints not filtered |
| Feature Gate Off, Annotation == 'Auto', hints matching zone | Endpoints not filtered |
| Feature Gate On, Annotation == 'Auto', no hints matching zone | Endpoints not filtered |
| Feature Gate On, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints filtered |
| Feature Gate On, TrafficPolicy == 'Local', hints matching zone | Endpoints not filtered |
| Feature Gate Off, TrafficPolicy == 'PreferZone', hints matching zone | Endpoints not filtered |
| Feature Gate On, TrafficPolicy == 'PreferZone', no hints matching zone | Endpoints not filtered |

### Observability
We can reuse some of the metrics of EndpointSlice Controller that we already
have in the current version to observe the changes of endpoints (addition,
deletion and update). Meanwhile we can add more metrics to have a glimpse of
different approaches.

- `endpoint_slice_controller/endpoint_overload_by_zone`
- `endpoint_slice_controller/endpointslices_changed_per_sync`
- `endpoint_slice_controller/syncs`

```
const SubSystem = "endpoint_slice_controller"
// This metric observes overload per zone for endpoints in EndpointSlices
EPSOverloadByZone = metrics.NewHistogramVec(
&metrics.HistogramOpts{
Subsystem: Subsystem,
Name: "endpoint_overload_by_zone",
Help: "Overload for endpoints by zone on each Service sync",
},
[]string{"zone"}, // zone name
)
// This metric observes churn of EndpointSlices per sync
EPSChangedPerSync = metrics.NewHistogramVec(
&metrics.HistogramOpts{
@@ -463,7 +449,7 @@ Thus there could be two potential version skew scenarios:
of the new controller functionality.

Each scenario described above will end up behaving as if this feature is not
enabled even if the Service annotation has been set.
enabled even if `trafficPolicy` has been set on the Service.

## Production Readiness Review Questionnaire

@@ -482,7 +468,7 @@ enabled even if the Service annotation has been set.
* **Can the feature be disabled once it has been enabled (i.e. can we roll back
the enablement)?**
Yes. It can easily be disabled universally by turning off the feature gate or
on a specific Service by setting the annotation to `Disable`.
setting the `trafficPolicy` field to some other value for a Service.

* **What happens if we reenable the feature if it was previously rolled back?**
EndpointSlice hints will be added again resulting in changes to existing
1 change: 1 addition & 0 deletions keps/sig-network/2433-topology-aware-hints/kep.yaml
@@ -20,6 +20,7 @@ see-also:
- "https://docs.google.com/document/d/1ZzUoFY1SrdjVefl7gVOJZJLt1I1LHttw8pcX95nlgMY/edit?usp=sharing"
- "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2004-topology-aware-subsetting"
- "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2030-topology-aware-proxying"
- "github.com/kubernetes/enhancements/blob/master/keps/sig-network/2086-service-internal-traffic-policy"
replaces:
- "github.com/kubernetes/enhancements/tree/master/keps/sig-network/536-topology-aware-routing"
