Description
I have an NRI plugin running as a DaemonSet. Ideally, I would like to set spec.updateStrategy.rollingUpdate.maxSurge to a positive number, and spec.updateStrategy.rollingUpdate.maxUnavailable to zero to have "make before break" semantics where on each node the new pod starts up and becomes ready, before the old pod is terminated.
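For reference, the update strategy described above would look roughly like the fragment below. This is only a sketch: the DaemonSet name, labels, and image are placeholders, not my actual manifest. Only the `updateStrategy` fields are the point.

```yaml
# Sketch of a "make before break" DaemonSet rollout.
# Names, labels, and image are hypothetical placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: example-nri-plugin
spec:
  selector:
    matchLabels:
      app: example-nri-plugin
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # start the new pod on a node before removing the old one
      maxUnavailable: 0  # must be 0 when maxSurge is non-zero
  template:
    metadata:
      labels:
        app: example-nri-plugin
    spec:
      containers:
        - name: plugin
          image: example.invalid/nri-plugin:v2  # placeholder image
```

With these settings, the old and new pods overlap on each node until the new pod becomes ready, and that overlap window is where the duplicate-processing concern arises.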
However, for my NRI plugin, it's not safe to have two instances running at the same time and acting on the same container, since an action would then be performed twice on pod startup rather than once. So for now, I fully terminate the old pod before starting the new pod to prevent overlap. When the new pod comes up, it has to "catch up" and process any events that were missed.
It would be helpful if we could document any known patterns for achieving this kind of zero-downtime update scenario for an NRI plugin, in a way that avoids duplicate processing.