Pod readiness gates #955

Merged
Changes from 11 commits
1 change: 1 addition & 0 deletions docs/examples/rbac-role.yaml
@@ -16,6 +16,7 @@ rules:
- ingresses
- ingresses/status
- services
- pods/status
verbs:
- create
- get
2 changes: 2 additions & 0 deletions docs/guide/ingress/annotation.md
@@ -44,6 +44,8 @@ You can add kubernetes annotations to ingress and service objects to customize t
|[alb.ingress.kubernetes.io/success-codes](#success-codes)|string|'200'|ingress,service|
|[alb.ingress.kubernetes.io/tags](#tags)|stringMap|N/A|ingress|
|[alb.ingress.kubernetes.io/target-group-attributes](#target-group-attributes)|stringMap|N/A|ingress,service|
|[alb.ingress.kubernetes.io/target-health-reconciliation-strategy](pod-conditions.md#annotations)|string|'initial'|ingress|
|[alb.ingress.kubernetes.io/target-health-reconciliation-interval-seconds](pod-conditions.md#annotations)|integer|10|ingress|
|[alb.ingress.kubernetes.io/target-type](#target-type)|instance \| ip|instance|ingress,service|
|[alb.ingress.kubernetes.io/unhealthy-threshold-count](#unhealthy-threshold-count)|integer|'2'|ingress,service|
|[alb.ingress.kubernetes.io/waf-acl-id](#waf-acl-id)|string|N/A|ingress|
84 changes: 84 additions & 0 deletions docs/guide/ingress/pod-conditions.md
@@ -0,0 +1,84 @@
# Using pod conditions / pod readiness gates

You can add [»Pod readiness gates«](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-readiness-gate) to Kubernetes pods. A controller, for example, can use a readiness gate to mark a pod as ready or unready by setting a custom condition on the pod.

The AWS ALB ingress controller can set such a condition on your pods. This is needed under certain circumstances to achieve fully zero-downtime rolling deployments. Consider the following example:
* a deployment has a low number of replicas (e.g. one to three)
* a rolling update of the deployment is started
* the rollout of new pods takes less time than it takes the ALB ingress controller to register the new pods and for their health state to turn »Healthy« in the target group
* at some point during this rolling update, the target group might only have registered targets that are in »Initial« or »Draining« state; this results in a service outage

To avoid this situation, the AWS ALB ingress controller can set the aforementioned condition on the pods that constitute your ingress backend services. The condition status on a pod is only set to `True` when the corresponding target in the ALB target group shows a health state of »Healthy«. This prevents the rolling update of a deployment from terminating old pods until the newly created pods are »Healthy« in the ALB target group and ready to take traffic.
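
The rule described above can be sketched in Go as follows. This is a minimal illustration, not the controller's code: the `Pod` and `PodCondition` types are simplified stand-ins for the Kubernetes API structs, and `ConditionStatusFor` and `IsReady` are hypothetical helper names. It assumes the lowercase state strings (`initial`, `healthy`, `draining`, ...) that the ELBv2 API reports for target health.

```go
package main

import "fmt"

// PodCondition is a simplified stand-in for the Kubernetes pod condition type.
type PodCondition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// Pod is a simplified stand-in holding only what readiness evaluation needs.
type Pod struct {
	ContainersReady bool
	ReadinessGates  []string // condition types listed under spec.readinessGates
	Conditions      []PodCondition
}

// ConditionStatusFor maps an ALB target health state to a condition status:
// only a healthy target yields "True".
func ConditionStatusFor(targetHealth string) string {
	if targetHealth == "healthy" {
		return "True"
	}
	return "False"
}

// IsReady applies the readiness gate rule: all containers must be ready and
// every gated condition must be present with status "True".
func IsReady(p Pod) bool {
	if !p.ContainersReady {
		return false
	}
	for _, gate := range p.ReadinessGates {
		if !conditionTrue(p.Conditions, gate) {
			return false
		}
	}
	return true
}

// conditionTrue reports whether the condition exists and is "True";
// a missing condition counts as not true.
func conditionTrue(conds []PodCondition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	gate := "target-health.alb.ingress.kubernetes.io/nginx-ingress_nginx-service_80"
	p := Pod{ContainersReady: true, ReadinessGates: []string{gate}}
	for _, state := range []string{"initial", "healthy", "draining"} {
		p.Conditions = []PodCondition{{Type: gate, Status: ConditionStatusFor(state)}}
		fmt.Printf("target %s -> pod ready: %v\n", state, IsReady(p))
	}
}
```

Because a pod with an unmet gate never becomes ready, a rolling update that waits for new pods to become ready will not proceed to terminate old pods while the new targets are still »Initial«.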


## Pod configuration

Add a readiness gate with `conditionType: target-health.alb.ingress.kubernetes.io/<ingress name>_<service name>_<service port>` to your pod.

Example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  clusterIP: None
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-ingress
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internal
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: nginx-service
          servicePort: 80
        path: /*
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      readinessGates:
      - conditionType: target-health.alb.ingress.kubernetes.io/nginx-ingress_nginx-service_80
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
```

If your pod is part of multiple ingresses / target groups and you want to make sure your pod is `Healthy` in all of them before it is marked as `Ready`, add one `readinessGate` per ingress.
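
As a rough illustration of the naming scheme, the following Go sketch builds the `conditionType` string for one or more ingress backends. The `Backend` type and the `ConditionType` and `ReadinessGates` helpers are hypothetical names chosen for this example; only the `target-health.alb.ingress.kubernetes.io/<ingress name>_<service name>_<service port>` format comes from the text above.

```go
package main

import "fmt"

// conditionTypePrefix is the prefix documented for the readiness gate.
const conditionTypePrefix = "target-health.alb.ingress.kubernetes.io/"

// Backend identifies one ingress backend (hypothetical helper type).
type Backend struct {
	Ingress string
	Service string
	Port    string // service port as written in the ingress, e.g. "80"
}

// ConditionType builds <prefix><ingress name>_<service name>_<service port>.
func ConditionType(b Backend) string {
	return fmt.Sprintf("%s%s_%s_%s", conditionTypePrefix, b.Ingress, b.Service, b.Port)
}

// ReadinessGates returns one condition type per backend, matching the advice
// to add one readiness gate per ingress the pod is part of.
func ReadinessGates(backends []Backend) []string {
	gates := make([]string, 0, len(backends))
	for _, b := range backends {
		gates = append(gates, ConditionType(b))
	}
	return gates
}

func main() {
	backends := []Backend{
		{Ingress: "nginx-ingress", Service: "nginx-service", Port: "80"},
		{Ingress: "internal-ingress", Service: "nginx-service", Port: "80"},
	}
	for _, gate := range ReadinessGates(backends) {
		fmt.Println(gate)
	}
}
```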


## <a name="annotations">Ingress annotations</a>

The following annotations can be used on the `Ingress` to control the reconciliation behavior:

* `alb.ingress.kubernetes.io/target-health-reconciliation-strategy`: can be either `initial` (default) or `continuous`
  * `initial`: pod condition statuses are only reconciled as long as the pod is unready; once it becomes ready, reconciliation stops, and it starts again if the pod becomes unready during its runtime
  * `continuous`: pod condition statuses are reconciled as long as the ingress / target group exists; use with care as this can potentially cause a lot of AWS API calls if there are many target groups
* `alb.ingress.kubernetes.io/target-health-reconciliation-interval-seconds`: defines how often the target health is queried from AWS while the reconciliation is running (defaults to 10)
13 changes: 10 additions & 3 deletions internal/alb/tg/mock_Controller.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 9 additions & 2 deletions internal/alb/tg/mock_TargetsController.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 8 additions & 2 deletions internal/alb/tg/targetgroup.go
@@ -19,6 +19,7 @@ import (
extensions "k8s.io/api/extensions/v1beta1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/intstr"
"sigs.k8s.io/controller-runtime/pkg/client"
)

// The port used when creating targetGroup serves as a default value for targets registered without port specified.
@@ -32,11 +33,12 @@ const targetGroupDefaultPort = 1
 type Controller interface {
 	// Reconcile ensures an targetGroup exists for specified backend of ingress.
 	Reconcile(ctx context.Context, ingress *extensions.Ingress, backend extensions.IngressBackend) (TargetGroup, error)
+	StopReconcilingPodConditionStatus(tgArn string)
 }

-func NewController(cloud aws.CloudAPI, store store.Storer, nameTagGen NameTagGenerator, tagsController tags.Controller, endpointResolver backend.EndpointResolver) Controller {
+func NewController(cloud aws.CloudAPI, store store.Storer, nameTagGen NameTagGenerator, tagsController tags.Controller, endpointResolver backend.EndpointResolver, client client.Client) Controller {
 	attrsController := NewAttributesController(cloud)
-	targetsController := NewTargetsController(cloud, endpointResolver)
+	targetsController := NewTargetsController(cloud, endpointResolver, client)
 	return &defaultController{
 		cloud: cloud,
 		store: store,
@@ -115,6 +117,10 @@ func (controller *defaultController) Reconcile(ctx context.Context, ingress *ext
}, nil
}

func (controller *defaultController) StopReconcilingPodConditionStatus(tgArn string) {
controller.targetsController.StopReconcilingPodConditionStatus(tgArn)
}

func (controller *defaultController) newTGInstance(ctx context.Context, name string, serviceAnnos *annotations.Service, healthCheckPort string) (*elbv2.TargetGroup, error) {
albctx.GetLogger(ctx).Infof("creating target group %v", name)
resp, err := controller.cloud.CreateTargetGroupWithContext(ctx, &elbv2.CreateTargetGroupInput{
7 changes: 5 additions & 2 deletions internal/alb/tg/targetgroup_group.go
@@ -17,6 +17,7 @@ import (
extensions "k8s.io/api/extensions/v1beta1"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/sets"
"sigs.k8s.io/controller-runtime/pkg/client"
)

// GroupController manages all target groups for one ingress.
@@ -37,8 +38,9 @@ func NewGroupController(
 	store store.Storer,
 	nameTagGen NameTagGenerator,
 	tagsController tags.Controller,
-	endpointResolver backend.EndpointResolver) GroupController {
-	tgController := NewController(cloud, store, nameTagGen, tagsController, endpointResolver)
+	endpointResolver backend.EndpointResolver,
+	client client.Client) GroupController {
+	tgController := NewController(cloud, store, nameTagGen, tagsController, endpointResolver, client)
 	return &defaultGroupController{
 		cloud: cloud,
 		store: store,
@@ -97,6 +99,7 @@ func (controller *defaultGroupController) GC(ctx context.Context, tgGroup Target
unusedTgArns := currentTgArns.Difference(usedTgArns)
for arn := range unusedTgArns {
albctx.GetLogger(ctx).Infof("deleting target group %v", arn)
controller.tgController.StopReconcilingPodConditionStatus(arn)
if err := controller.cloud.DeleteTargetGroupByArn(ctx, arn); err != nil {
return fmt.Errorf("failed to delete targetGroup due to %v", err)
}
6 changes: 6 additions & 0 deletions internal/alb/tg/targetgroup_group_test.go
@@ -710,6 +710,9 @@ func TestDefaultGroupController_GC(t *testing.T) {
}
mockNameTagGen := &MockNameTagGenerator{}
mockTGController := &MockController{}
for _, call := range tc.DeleteTargetGroupByArnCalls {
mockTGController.On("StopReconcilingPodConditionStatus", call.Arn).Return()
}

controller := &defaultGroupController{
cloud: cloud,
@@ -818,6 +821,9 @@ func TestDefaultGroupController_Delete(t *testing.T) {
mockNameTagGen.On("TagTGGroup", tc.TagTGGroupCall.Namespace, tc.TagTGGroupCall.IngressName).Return(tc.TagTGGroupCall.Tags)
}
mockTGController := &MockController{}
for _, call := range tc.DeleteTargetGroupByArnCalls {
mockTGController.On("StopReconcilingPodConditionStatus", call.Arn).Return()
}

controller := &defaultGroupController{
cloud: cloud,