After upgrading from 1.6.6 to 1.7.1, we've noticed that occasionally, when deploying and promoting releases, the old ReplicaSet does not get scaled down. The old pods keep running indefinitely, although they do not receive any traffic because the active/preview services correctly point to the new ReplicaSet.
To Reproduce
The issue is hard to replicate since it happens sporadically.
Create a new revision of a rollout that uses the blue/green strategy with ephemeral labels configured and auto promotion disabled.
It looks like the controller correctly sets the scale-down-deadline annotation to 30 seconds in the future. A few lines below, the ReplicaSet gets patched without changing the scale-down-deadline annotation. But a few lines later the controller makes another patch operation with scale-down-deadline set to an empty string, which I think is the problem.
From this point on, the controller keeps setting the deadline and then resetting it to the empty string.
time="2024-08-01T15:13:06Z" level=info msg="Set 'scale-down-deadline' annotation on 'ui-widgets-684cf6478b' to 2024-08-01T15:13:36Z (30s)" namespace=default rollout=ui-widgets
(...)
time="2024-08-01T15:13:06Z" level=info msg="Conflict when updating replicaset ui-widgets-684cf6478b, falling back to patch" namespace=default rollout=ui-widgets
time="2024-08-01T15:13:06Z" level=info msg="Patching replicaset with patch: {\"metadata\":{\"annotations\":{\"rollout.argoproj.io/desired-replicas\":\"1\",\"rollout.argoproj.io/revision\":\"59\"},\"labels\":{\"rollouts-pod-template-hash\":\"684cf6478b\"}},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"rollouts-pod-template-hash\":\"684cf6478b\"}},\"template\":{\"metadata\":{\"labels\":{\"app\":\"ui-widgets\",\"release\":\"ui-widgets\",\"rollouts-pod-template-hash\":\"684cf6478b\"}}}}}" namespace=default rollout=ui-widgets
(...)
time="2024-08-01T15:13:06Z" level=info msg="Conflict when updating replicaset ui-widgets-684cf6478b, falling back to patch" namespace=default rollout=ui-widgets
time="2024-08-01T15:13:06Z" level=info msg="Patching replicaset with patch: {\"metadata\":{\"annotations\":{\"rollout.argoproj.io/desired-replicas\":\"1\",\"rollout.argoproj.io/revision\":\"59\",\"scale-down-deadline\":\"\"},\"labels\":{\"rollouts-pod-template-hash\":\"684cf6478b\"}},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"rollouts-pod-template-hash\":\"684cf6478b\"}},\"template\":{\"metadata\":{\"labels\":{\"app\":\"ui-widgets\",\"release\":\"ui-widgets\",\"rollouts-pod-template-hash\":\"684cf6478b\"}}}}}" namespace=default rollout=ui-widgets
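The difference between the two patches above can be checked mechanically when triaging a controller log. The following is a minimal sketch, not part of Argo Rollouts: the patch struct and the clearsScaleDownDeadline helper are hypothetical names, and the sample inputs are trimmed to the annotations portion of the two patches logged above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patch models only the part of the logged patch that matters here.
// Hypothetical type, not taken from the Argo Rollouts code base.
type patch struct {
	Metadata struct {
		Annotations map[string]string `json:"annotations"`
	} `json:"metadata"`
}

// clearsScaleDownDeadline reports whether a patch body explicitly sets
// the scale-down-deadline annotation to an empty string.
func clearsScaleDownDeadline(raw string) (bool, error) {
	var p patch
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return false, err
	}
	v, ok := p.Metadata.Annotations["scale-down-deadline"]
	return ok && v == "", nil
}

func main() {
	// Annotations from the first and second patch in the log, other fields trimmed.
	first := `{"metadata":{"annotations":{"rollout.argoproj.io/desired-replicas":"1","rollout.argoproj.io/revision":"59"}}}`
	second := `{"metadata":{"annotations":{"rollout.argoproj.io/desired-replicas":"1","rollout.argoproj.io/revision":"59","scale-down-deadline":""}}}`

	for _, raw := range []string{first, second} {
		cleared, err := clearsScaleDownDeadline(raw)
		if err != nil {
			fmt.Println("invalid patch:", err)
			continue
		}
		fmt.Println("clears scale-down-deadline:", cleared)
	}
}
```

Only the second patch is flagged, because it is the only one that carries the annotation key at all.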
Version
1.7.1
Logs
controller-deployment.log
I had a look at the recent changes to see what might cause this. A new function responsible for patching the ReplicaSet was added: updateReplicaSetFallbackToPatch
I suspect the problem lies on the following line:
argo-rollouts/rollout/controller.go, line 990 in 708db68
It copies the annotation from rs.Labels instead of rs.Annotations.
However, it's not clear to me what exact flow triggers this issue intermittently.
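To illustrate the suspected mechanism, here is a minimal, self-contained sketch. It does not reproduce the actual code in controller.go: the replicaSet struct and the buildPatchAnnotations helper are hypothetical stand-ins, and the annotation key is written as it appears in the logged patch. It only shows that looking up an annotation key in the labels map yields an empty string in Go, which would be consistent with the patch that carries scale-down-deadline set to "".

```go
package main

import "fmt"

// replicaSet is a hypothetical stand-in for the metadata of a ReplicaSet.
type replicaSet struct {
	Labels      map[string]string
	Annotations map[string]string
}

// Key as it appears in the patch from the controller log.
const scaleDownDeadlineKey = "scale-down-deadline"

// buildPatchAnnotations is a hypothetical helper. With fromLabels set it
// reads the deadline from rs.Labels (the suspected bug); otherwise it
// reads it from rs.Annotations.
func buildPatchAnnotations(rs replicaSet, fromLabels bool) map[string]string {
	src := rs.Annotations
	if fromLabels {
		src = rs.Labels
	}
	// Indexing a map with a missing key yields the zero value, "" for strings.
	return map[string]string{scaleDownDeadlineKey: src[scaleDownDeadlineKey]}
}

func main() {
	rs := replicaSet{
		Labels:      map[string]string{"rollouts-pod-template-hash": "684cf6478b"},
		Annotations: map[string]string{scaleDownDeadlineKey: "2024-08-01T15:13:36Z"},
	}
	fmt.Printf("from labels:      %q\n", buildPatchAnnotations(rs, true)[scaleDownDeadlineKey])
	fmt.Printf("from annotations: %q\n", buildPatchAnnotations(rs, false)[scaleDownDeadlineKey])
}
```

If the real code behaves like the fromLabels branch, any patch built this way would overwrite a valid deadline with an empty value. That would match the log, but it does not explain why it happens only intermittently.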
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.