Commit

fix: retarget blue-green previewService before scaling up preview ReplicaSet (#1368)

Signed-off-by: Jesse Suen <jesse_suen@intuit.com>
jessesuen authored Aug 6, 2021
1 parent ffe70da commit 7a0704a
Showing 11 changed files with 63 additions and 35 deletions.
29 changes: 20 additions & 9 deletions docs/features/bluegreen.md
@@ -68,6 +68,25 @@ spec:
 scaleDownDelayRevisionLimit: *int32
 ```
+## Sequence of Events
+The following describes the sequence of events that happen during a blue-green update.
+1. Beginning at a fully promoted, steady-state, a revision 1 ReplicaSet is pointed to by both the `activeService` and `previewService`.
+1. A user initiates an update by modifying the pod template (`spec.template.spec`).
+1. The revision 2 ReplicaSet is created with size 0.
+1. The preview service is modified to point to the revision 2 ReplicaSet. The `activeService` remains pointing to revision 1.
+1. The revision 2 ReplicaSet is scaled to either `spec.replicas` or `previewReplicaCount` if set.
+1. Once revision 2 ReplicaSet Pods are fully available, `prePromotionAnalysis` begins.
+1. Upon success of `prePromotionAnalysis`, the blue/green pauses if `autoPromotionEnabled` is false, or `autoPromotionSeconds` is non-zero.
+1. The rollout is resumed either manually by a user, or automatically by surpassing `autoPromotionSeconds`.
+1. The revision 2 ReplicaSet is scaled to the `spec.replicas`, if the `previewReplicaCount` feature was used.
+1. The rollout "promotes" the revision 2 ReplicaSet by updating the `activeService` to point to it. At this point, there are no services pointing to revision 1.
+1. `postPromotionAnalysis` begins.
+1. Once `postPromotionAnalysis` completes successfully, the update is successful and the revision 2 ReplicaSet is marked as stable. The rollout is considered fully-promoted.
+1. After waiting `scaleDownDelaySeconds` (default 30 seconds), the revision 1 ReplicaSet is scaled down.
+
+
 ### autoPromotionEnabled
 The AutoPromotionEnabled will make the rollout automatically promote the new ReplicaSet to the active service once the new ReplicaSet is healthy. This field is defaulted to true if it is not specified.

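The ordering documented in the "Sequence of Events" list can be sketched as a small Go program. This is only an illustrative model: the `spec` struct, `shouldPause`, and `blueGreenUpdate` are hypothetical names invented for the sketch, not part of the Argo Rollouts code base, and analysis runs are left out. It shows the point of this commit, that the preview service is retargeted before the revision 2 ReplicaSet is scaled up.

```go
package main

import "fmt"

// spec holds the few settings the walkthrough needs. The field names mirror
// the documented options, but the struct itself is hypothetical.
type spec struct {
	replicas             int32
	previewReplicaCount  *int32 // nil means "not set"
	autoPromotionEnabled *bool  // nil means "not set", which defaults to true
	autoPromotionSeconds int32
}

// shouldPause reports whether the rollout pauses after prePromotionAnalysis:
// it does when auto-promotion is disabled or a promotion delay is configured.
func shouldPause(s spec) bool {
	enabled := s.autoPromotionEnabled == nil || *s.autoPromotionEnabled
	return !enabled || s.autoPromotionSeconds != 0
}

// blueGreenUpdate returns the ordered steps for moving from revision 1 to
// revision 2. Analysis runs are omitted for brevity.
func blueGreenUpdate(s spec) []string {
	steps := []string{
		"create revision 2 ReplicaSet with size 0",
		"point previewService at revision 2",
	}
	previewSize := s.replicas
	if s.previewReplicaCount != nil {
		previewSize = *s.previewReplicaCount
	}
	steps = append(steps, fmt.Sprintf("scale revision 2 to %d", previewSize))
	if shouldPause(s) {
		steps = append(steps, "pause until resumed")
	}
	if previewSize != s.replicas {
		steps = append(steps, fmt.Sprintf("scale revision 2 to %d", s.replicas))
	}
	steps = append(steps,
		"point activeService at revision 2",
		"wait scaleDownDelaySeconds, then scale revision 1 to 0",
	)
	return steps
}

func main() {
	one := int32(1)
	for i, step := range blueGreenUpdate(spec{replicas: 3, previewReplicaCount: &one}) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Returning the steps as a slice keeps the order easy to inspect; a real controller reconciles toward this order over several passes rather than in one call.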
@@ -111,15 +130,6 @@ This feature is used to provide an endpoint that can be used to test a new versi

 Defaults to an empty string

-Here is a timeline of how the active and preview services work (if you use a preview service):
-
-1. During the initial deployment there is only one ReplicaSet. Both active and preview services point to it. This is the **old** version of the application.
-1. A change happens in the Rollout resource. A new ReplicaSet is created. This is the **new** version of the application. The preview service is modified to point to the new ReplicaSet. The active service still points to the old version.
-1. The blue/green deployment is "promoted". Both active and preview services are pointing to the new version. The old version is still there but no service is pointing at it.
-1. Once the blue/green deployment is scaled down (see the `scaleDownDelaySeconds` field) the old ReplicaSet has 0 replicas and we are back to the initial state. Both active and preview services point to the new version (which is the only one present anyway).
-
-
-
 ### previewReplicaCount
 The PreviewReplicaCount field will indicate the number of replicas that the new version of an application should run. Once the application is ready to promote to the active service, the controller will scale the new ReplicaSet to the value of the `spec.replicas`. The rollout will not switch over the active service to the new ReplicaSet until it matches the `spec.replicas` count.

@@ -136,3 +146,4 @@ Defaults to 30
 The ScaleDownDelayRevisionLimit limits the number of old active ReplicaSets to keep scaled up while they wait for the scaleDownDelay to pass after being removed from the active service.

 If omitted, all ReplicaSets will be retained for the specified scaleDownDelay

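As a rough illustration of the `scaleDownDelayRevisionLimit` description above, the following Go sketch splits old ReplicaSets into those that stay scaled up and those that do not. The function name `keepScaledUp` is hypothetical, and the choice to retain the newest revisions first is an assumption made for the example; the documentation text only states that the number kept is limited and that omitting the field retains all of them.

```go
package main

import "fmt"

// keepScaledUp splits old ReplicaSets that are waiting out scaleDownDelay into
// those that stay scaled up and those that are scaled down right away.
// Names are given newest first (an assumption of this sketch). A nil limit
// means the field was omitted, so every ReplicaSet is retained.
func keepScaledUp(oldNewestFirst []string, limit *int32) (keep, scaleDown []string) {
	if limit == nil {
		return oldNewestFirst, nil
	}
	n := int(*limit)
	if n < 0 {
		n = 0
	}
	if n > len(oldNewestFirst) {
		n = len(oldNewestFirst)
	}
	return oldNewestFirst[:n], oldNewestFirst[n:]
}

func main() {
	limit := int32(1)
	keep, down := keepScaledUp([]string{"rev3", "rev2", "rev1"}, &limit)
	fmt.Println("keep:", keep, "scale down now:", down)
}
```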
3 changes: 3 additions & 0 deletions rollout/analysis_test.go
@@ -1517,6 +1517,7 @@ func TestDoNotCreateBackgroundAnalysisRunOnNewCanaryRollout(t *testing.T) {

 	f.expectCreateReplicaSetAction(rs1)
 	f.expectUpdateRolloutStatusAction(r1) // update conditions
+	f.expectUpdateReplicaSetAction(rs1) // scale replica set
 	f.expectPatchRolloutAction(r1)
 	f.run(getKey(r1, t))
 }
@@ -1551,6 +1552,7 @@ func TestDoNotCreateBackgroundAnalysisRunOnNewCanaryRolloutStableRSEmpty(t *test

 	f.expectCreateReplicaSetAction(rs1)
 	f.expectUpdateRolloutStatusAction(r1) // update conditions
+	f.expectUpdateReplicaSetAction(rs1) // scale replica set
 	f.expectPatchRolloutAction(r1)
 	f.run(getKey(r1, t))
 }
@@ -1686,6 +1688,7 @@ func TestDoNotCreatePrePromotionAnalysisRunOnNewRollout(t *testing.T) {

 	f.expectCreateReplicaSetAction(rs)
 	f.expectUpdateRolloutStatusAction(r)
+	f.expectUpdateReplicaSetAction(rs) // scale RS
 	f.expectPatchRolloutAction(r)
 	f.run(getKey(r, t))
 }
11 changes: 6 additions & 5 deletions rollout/bluegreen.go
@@ -25,6 +25,12 @@ func (c *rolloutContext) rolloutBlueGreen() error {
 		return err
 	}

+	// This must happen right after the new replicaset is created
+	err = c.reconcilePreviewService(previewSvc)
+	if err != nil {
+		return err
+	}
+
 	if replicasetutil.CheckPodSpecChange(c.rollout, c.newRS) {
 		return c.syncRolloutStatusBlueGreen(previewSvc, activeSvc)
 	}
@@ -39,11 +45,6 @@ func (c *rolloutContext) rolloutBlueGreen() error {
 		return err
 	}

-	err = c.reconcilePreviewService(previewSvc)
-	if err != nil {
-		return err
-	}
-
 	c.reconcileBlueGreenPause(activeSvc, previewSvc)

 	err = c.reconcileActiveService(activeSvc)
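The effect of moving `reconcilePreviewService` earlier in `rolloutBlueGreen` can be illustrated with a stand-alone sketch of early-return step ordering. Everything here is hypothetical scaffolding (`step`, `runInOrder`, `errUnavailable`, and the step labels, including the "scale ReplicaSets" step, which sits in a part of the function the diff does not show). The idea is only that when steps run in order and stop on the first error, a failed preview-service retarget prevents any scale-up from happening behind a stale selector.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable is a made-up failure used to show the early return.
var errUnavailable = errors.New("service unavailable")

// step is one named reconcile action. The names used in main are labels for
// this sketch, not functions from the rollout package.
type step struct {
	name string
	run  func() error
}

// runInOrder runs steps one after another and stops at the first error, so a
// later step never executes when an earlier one fails.
func runInOrder(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// ok is a step body that always succeeds.
func ok() error { return nil }

func main() {
	steps := []step{
		{"create new ReplicaSet", ok},
		{"reconcile preview service", func() error { return errUnavailable }},
		{"scale ReplicaSets", ok},
		{"reconcile active service", ok},
	}
	done, err := runInOrder(steps)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```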
1 change: 1 addition & 0 deletions rollout/bluegreen_test.go
@@ -56,6 +56,7 @@ func TestBlueGreenCreatesReplicaSet(t *testing.T) {

 	f.expectCreateReplicaSetAction(rs)
 	servicePatchIndex := f.expectPatchServiceAction(previewSvc, rsPodHash)
+	f.expectUpdateReplicaSetAction(rs) // scale up RS
 	updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
 	expectedPatchWithoutSubs := `{
 	"status":{
13 changes: 11 additions & 2 deletions rollout/canary_test.go
@@ -77,15 +77,19 @@ func TestCanaryRolloutBumpVersion(t *testing.T) {
 	f.replicaSetLister = append(f.replicaSetLister, rs1)

 	createdRSIndex := f.expectCreateReplicaSetAction(rs2)
+	updatedRSIndex := f.expectUpdateReplicaSetAction(rs2) // scale up RS
 	updatedRolloutRevisionIndex := f.expectUpdateRolloutAction(r2) // update rollout revision
 	updatedRolloutConditionsIndex := f.expectUpdateRolloutStatusAction(r2) // update rollout conditions
 	f.expectPatchRolloutAction(r2)
 	f.run(getKey(r2, t))

 	createdRS := f.getCreatedReplicaSet(createdRSIndex)
-	assert.Equal(t, int32(1), *createdRS.Spec.Replicas)
+	assert.Equal(t, int32(0), *createdRS.Spec.Replicas)
 	assert.Equal(t, "2", createdRS.Annotations[annotations.RevisionAnnotation])

+	updatedRS := f.getUpdatedReplicaSet(updatedRSIndex)
+	assert.Equal(t, int32(1), *updatedRS.Spec.Replicas)
+
 	updatedRollout := f.getUpdatedRollout(updatedRolloutRevisionIndex)
 	assert.Equal(t, "2", updatedRollout.Annotations[annotations.RevisionAnnotation])

@@ -475,6 +479,7 @@ func TestCanaryRolloutCreateFirstReplicasetNoSteps(t *testing.T) {
 	rs := newReplicaSet(r, 1)

 	f.expectCreateReplicaSetAction(rs)
+	f.expectUpdateReplicaSetAction(rs) // scale up rs
 	updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
 	patchIndex := f.expectPatchRolloutAction(r)
 	f.run(getKey(r, t))
@@ -514,6 +519,7 @@ func TestCanaryRolloutCreateFirstReplicasetWithSteps(t *testing.T) {
 	rs := newReplicaSet(r, 1)

 	f.expectCreateReplicaSetAction(rs)
+	f.expectUpdateReplicaSetAction(rs) // scale up rs
 	updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r)
 	patchIndex := f.expectPatchRolloutAction(r)
 	f.run(getKey(r, t))
@@ -559,12 +565,15 @@ func TestCanaryRolloutCreateNewReplicaWithCorrectWeight(t *testing.T) {
 	f.replicaSetLister = append(f.replicaSetLister, rs1)

 	createdRSIndex := f.expectCreateReplicaSetAction(rs2)
+	updatedRSIndex := f.expectUpdateReplicaSetAction(rs2)
 	updatedRolloutIndex := f.expectUpdateRolloutStatusAction(r2)
 	f.expectPatchRolloutAction(r2)
 	f.run(getKey(r2, t))

 	createdRS := f.getCreatedReplicaSet(createdRSIndex)
-	assert.Equal(t, int32(1), *createdRS.Spec.Replicas)
+	assert.Equal(t, int32(0), *createdRS.Spec.Replicas)
+	updatedRS := f.getUpdatedReplicaSet(updatedRSIndex)
+	assert.Equal(t, int32(1), *updatedRS.Spec.Replicas)

 	updatedRollout := f.getUpdatedRollout(updatedRolloutIndex)
 	progressingCondition := conditions.GetRolloutCondition(updatedRollout.Status, v1alpha1.RolloutProgressing)
5 changes: 4 additions & 1 deletion rollout/ephemeralmetadata_test.go
@@ -37,6 +37,7 @@ func TestSyncCanaryEphemeralMetadataInitialRevision(t *testing.T) {

 	f.expectUpdateRolloutStatusAction(r1)
 	idx := f.expectCreateReplicaSetAction(rs1)
+	f.expectUpdateReplicaSetAction(rs1)
 	_ = f.expectPatchRolloutAction(r1)
 	f.run(getKey(r1, t))
 	createdRS1 := f.getCreatedReplicaSet(idx)
@@ -75,8 +76,9 @@ func TestSyncBlueGreenEphemeralMetadataInitialRevision(t *testing.T) {

 	f.expectUpdateRolloutStatusAction(r1)
 	idx := f.expectCreateReplicaSetAction(rs1)
-	_ = f.expectPatchRolloutAction(r1)
+	f.expectPatchRolloutAction(r1)
 	f.expectPatchServiceAction(previewSvc, rs1.Labels[v1alpha1.DefaultRolloutUniqueLabelKey])
+	f.expectUpdateReplicaSetAction(rs1) // scale replicaset
 	f.run(getKey(r1, t))
 	createdRS1 := f.getCreatedReplicaSet(idx)
 	expectedLabels := map[string]string{
@@ -209,6 +211,7 @@ func TestSyncBlueGreenEphemeralMetadataSecondRevision(t *testing.T) {
 	f.expectUpdateRolloutStatusAction(r2) // Update Rollout conditions
 	rs2idx := f.expectCreateReplicaSetAction(rs2) // Create revision 2 ReplicaSet
 	f.expectPatchServiceAction(previewSvc, rs2PodHash) // Update preview service to point at revision 2 replicaset
+	f.expectUpdateReplicaSetAction(rs2) // scale revision 2 ReplicaSet up
 	f.expectListPodAction(r1.Namespace) // list pods to patch ephemeral data on revision 1 ReplicaSets pods`
 	podIdx := f.expectUpdatePodAction(&pod) // Update pod with ephemeral data
 	rs1idx := f.expectUpdateReplicaSetAction(rs1) // update stable replicaset with stable metadata
1 change: 1 addition & 0 deletions rollout/experiment_test.go
@@ -519,6 +519,7 @@ func TestRolloutDoNotCreateExperimentWithoutStableRS(t *testing.T) {
 	f.expectCreateReplicaSetAction(rs2)
 	f.expectUpdateRolloutAction(r2) // update revision
 	f.expectUpdateRolloutStatusAction(r2) // update progressing condition
+	f.expectUpdateReplicaSetAction(rs2) // scale replicaset
 	f.expectPatchRolloutAction(r1)
 	f.run(getKey(r2, t))
 }
14 changes: 3 additions & 11 deletions rollout/sync.go
@@ -159,13 +159,7 @@ func (c *rolloutContext) createDesiredReplicaSet() (*appsv1.ReplicaSet, error) {
 			Template: newRSTemplate,
 		},
 	}
-	allRSs := append(c.allRSs, newRS)
-	newReplicasCount, err := replicasetutil.NewRSNewReplicas(c.rollout, allRSs, newRS)
-	if err != nil {
-		return nil, err
-	}
-
-	newRS.Spec.Replicas = pointer.Int32Ptr(newReplicasCount)
+	newRS.Spec.Replicas = pointer.Int32Ptr(0)
 	// Set new replica set's annotation
 	annotations.SetNewReplicaSetAnnotations(c.rollout, newRS, newRevision, false)

@@ -250,12 +244,10 @@ func (c *rolloutContext) createDesiredReplicaSet() (*appsv1.ReplicaSet, error) {
 		return nil, err
 	}

-	if !alreadyExists && newReplicasCount > 0 {
+	if !alreadyExists {
 		revision, _ := replicasetutil.Revision(createdRS)
-		c.recorder.Eventf(c.rollout, record.EventOptions{EventReason: conditions.NewReplicaSetReason}, conditions.NewReplicaSetDetailedMessage, createdRS.Name, revision, newReplicasCount)
-	}
+		c.recorder.Eventf(c.rollout, record.EventOptions{EventReason: conditions.NewReplicaSetReason}, conditions.NewReplicaSetDetailedMessage, createdRS.Name, revision)

-	if !alreadyExists {
 		msg := fmt.Sprintf(conditions.NewReplicaSetMessage, createdRS.Name)
 		condition := conditions.NewRolloutCondition(v1alpha1.RolloutProgressing, corev1.ConditionTrue, conditions.NewReplicaSetReason, msg)
 		conditions.SetRolloutCondition(&c.rollout.Status, *condition)
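The change to `createDesiredReplicaSet` separates creation from scaling: the ReplicaSet is always created with zero replicas, and reaching the desired size becomes a later update. A minimal in-memory sketch of that two-phase pattern follows. The `cluster` and `replicaSet` types and their methods are toy stand-ins invented for this example, not Kubernetes client types, and the "Scaled ReplicaSet" wording is likewise made up; only the "Created ReplicaSet %s (revision %d)" format comes from the diff.

```go
package main

import "fmt"

// replicaSet and cluster are toy stand-ins, not Kubernetes client types.
type replicaSet struct {
	name     string
	replicas int32
}

type cluster struct {
	sets   map[string]*replicaSet
	events []string
}

// create always stores the ReplicaSet with zero replicas, whatever size the
// caller eventually wants, and records an event that carries no size.
func (c *cluster) create(name string, revision int) *replicaSet {
	rs := &replicaSet{name: name, replicas: 0}
	c.sets[name] = rs
	c.events = append(c.events, fmt.Sprintf("Created ReplicaSet %s (revision %d)", name, revision))
	return rs
}

// scale is a separate, later action that moves the ReplicaSet to its target.
func (c *cluster) scale(rs *replicaSet, to int32) {
	c.events = append(c.events, fmt.Sprintf("Scaled ReplicaSet %s from %d to %d", rs.name, rs.replicas, to))
	rs.replicas = to
}

func main() {
	c := &cluster{sets: map[string]*replicaSet{}}
	rs := c.create("demo-5498785cd6", 2)
	// Anything that must happen before pods exist (for example retargeting a
	// preview service) would go here, between create and scale.
	c.scale(rs, 3)
	for _, e := range c.events {
		fmt.Println(e)
	}
}
```

Splitting the two actions is what lets the test fixtures in this commit assert a created size of 0 and an updated size of 1 or 10 separately.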
7 changes: 5 additions & 2 deletions rollout/sync_test.go
@@ -304,14 +304,17 @@ func TestCanaryPromoteFull(t *testing.T) {
 	f.kubeobjects = append(f.kubeobjects, rs1)
 	f.replicaSetLister = append(f.replicaSetLister, rs1)

-	createdRS2Index := f.expectCreateReplicaSetAction(rs2) // create new ReplicaSet (surge to 10)
+	createdRS2Index := f.expectCreateReplicaSetAction(rs2) // create new ReplicaSet (size 0)
 	f.expectUpdateRolloutAction(r2) // update rollout revision
 	f.expectUpdateRolloutStatusAction(r2) // update rollout conditions
+	updatedRS2Index := f.expectUpdateReplicaSetAction(rs2) // scale new ReplicaSet to 10
 	patchedRolloutIndex := f.expectPatchRolloutAction(r2)
 	f.run(getKey(r2, t))

 	createdRS2 := f.getCreatedReplicaSet(createdRS2Index)
-	assert.Equal(t, int32(10), *createdRS2.Spec.Replicas) // verify we ignored steps
+	assert.Equal(t, int32(0), *createdRS2.Spec.Replicas)
+	updatedRS2 := f.getUpdatedReplicaSet(updatedRS2Index)
+	assert.Equal(t, int32(10), *updatedRS2.Spec.Replicas) // verify we ignored steps and fully scaled it

 	patchedRollout := f.getPatchedRolloutAsObject(patchedRolloutIndex)
 	assert.Equal(t, int32(2), *patchedRollout.Status.CurrentStepIndex) // verify we updated to last step
12 changes: 8 additions & 4 deletions test/e2e/functional_test.go
@@ -90,10 +90,12 @@ spec:
 		ExpectRevisionPodCount("2", 1).
 		ExpectRolloutEvents([]string{
 			"RolloutUpdated", // Rollout updated to revision 1
-			"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-698fbfb9dc (revision 1) with size 1
+			"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-698fbfb9dc (revision 1)
+			"ScalingReplicaSet", // Scaled up ReplicaSet abort-retry-promote-698fbfb9dc (revision 1) from 0 to 1
 			"RolloutCompleted", // Rollout completed update to revision 1 (698fbfb9dc): Initial deploy
 			"RolloutUpdated", // Rollout updated to revision 2
-			"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) with size 1
+			"NewReplicaSetCreated", // Created ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2)
+			"ScalingReplicaSet", // Scaled up ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) from 0 to 1
 			"RolloutStepCompleted", // Rollout step 1/2 completed (setWeight: 50)
 			"RolloutPaused", // Rollout is paused (CanaryPauseStep)
 			"ScalingReplicaSet", // Scaled down ReplicaSet abort-retry-promote-75dcb5ddd6 (revision 2) from 1 to 0
@@ -696,11 +698,13 @@ func (s *FunctionalSuite) TestBlueGreenUpdate() {
 		ExpectReplicaCounts(3, 6, 3, 3, 3).
 		ExpectRolloutEvents([]string{
 			"RolloutUpdated", // Rollout updated to revision 1
-			"NewReplicaSetCreated", // Created ReplicaSet bluegreen-7dcd8f8869 (revision 1) with size 3
+			"NewReplicaSetCreated", // Created ReplicaSet bluegreen-7dcd8f8869 (revision 1)
+			"ScalingReplicaSet", // Scaled up ReplicaSet bluegreen-7dcd8f8869 (revision 1) from 0 to 3
 			"RolloutCompleted", // Rollout completed update to revision 1 (7dcd8f8869): Initial deploy
 			"SwitchService", // Switched selector for service 'bluegreen' from '' to '7dcd8f8869'
 			"RolloutUpdated", // Rollout updated to revision 2
-			"NewReplicaSetCreated", // Created ReplicaSet bluegreen-5498785cd6 (revision 2) with size 3
+			"NewReplicaSetCreated", // Created ReplicaSet bluegreen-5498785cd6 (revision 2)
+			"ScalingReplicaSet", // Scaled up ReplicaSet bluegreen-5498785cd6 (revision 2) from 0 to 3
 			"SwitchService", // Switched selector for service 'bluegreen' from '7dcd8f8869' to '6c779b88b6'
 			"RolloutCompleted", // Rollout completed update to revision 2 (6c779b88b6): Completed blue-green update
 		})
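In the expected event lists above, each `NewReplicaSetCreated` is now directly followed by a `ScalingReplicaSet` event, because creation no longer sets the size. A hedged sketch of a check for that pattern is shown below. `scaledAfterCreate` is a hypothetical helper, not part of the e2e fixture, and "directly followed" is an observation about these two expected lists rather than a guarantee the controller makes.

```go
package main

import "fmt"

// scaledAfterCreate reports whether every "NewReplicaSetCreated" event is
// immediately followed by a "ScalingReplicaSet" event.
func scaledAfterCreate(events []string) bool {
	for i, e := range events {
		if e != "NewReplicaSetCreated" {
			continue
		}
		if i+1 >= len(events) || events[i+1] != "ScalingReplicaSet" {
			return false
		}
	}
	return true
}

func main() {
	// Event reasons copied from the TestBlueGreenUpdate expectations above.
	blueGreen := []string{
		"RolloutUpdated",
		"NewReplicaSetCreated",
		"ScalingReplicaSet",
		"RolloutCompleted",
		"SwitchService",
		"RolloutUpdated",
		"NewReplicaSetCreated",
		"ScalingReplicaSet",
		"SwitchService",
		"RolloutCompleted",
	}
	fmt.Println(scaledAfterCreate(blueGreen))
}
```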
2 changes: 1 addition & 1 deletion utils/conditions/conditions.go
@@ -52,7 +52,7 @@ const (
 	//NewReplicaSetMessage is added in a rollout when it creates a new replicas set.
 	NewReplicaSetMessage = "Created new replica set %q"
 	// NewReplicaSetDetailedMessage is a more detailed format message
-	NewReplicaSetDetailedMessage = "Created ReplicaSet %s (revision %d) with size %d"
+	NewReplicaSetDetailedMessage = "Created ReplicaSet %s (revision %d)"

 	// FoundNewRSReason is added in a rollout when it adopts an existing replica set.
 	FoundNewRSReason = "FoundNewReplicaSet"
