Documentation: Best practices for zero-downtime plugin updates #167

I have an NRI plugin running as a DaemonSet. Ideally, I would set `spec.updateStrategy.rollingUpdate.maxSurge` to a positive number and `spec.updateStrategy.rollingUpdate.maxUnavailable` to zero to get "make before break" semantics: on each node, the new pod starts up and becomes ready before the old pod is terminated.
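For illustration, the surge-based configuration described above would look roughly like the fragment below. This is only a sketch of the relevant part of the DaemonSet spec; the value `1` for `maxSurge` is an arbitrary example of "a positive number".

```yaml
# Fragment of a DaemonSet manifest: "make before break" rollout.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # example value: start the new pod on a node first
      maxUnavailable: 0  # never take the old pod down before the new one is ready
```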

However, for my NRI plugin it's not safe to have two instances running at the same time and acting on the same container, since an action would then be performed twice on pod startup rather than once. So for now, I fully terminate the old pod before starting the new pod to prevent overlap. When the new pod comes up, it has to "catch up" and process any events that were missed.
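The "break before make" behavior can be sketched as the fragment below, assuming the DaemonSet uses the `RollingUpdate` strategy. That is an assumption, as are the exact values; the `OnDelete` strategy would be another way to avoid overlapping pods. With `maxSurge` at zero, the old pod on a node is terminated before its replacement is created, which is what opens the window of missed events.

```yaml
# Fragment of a DaemonSet manifest: "break before make" rollout (assumed setup).
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # no second plugin instance on the same node
      maxUnavailable: 1  # example value: update one node at a time
```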

It would be helpful if we could document any known patterns for achieving this kind of zero-downtime update scenario for an NRI plugin, in a way that avoids duplicate processing.
