Improve patchHelper to allow atomically updating a set of fields #11401
Labels
kind/feature
Categorizes issue or PR as related to a new feature.
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Today patchHelper calculates patches by comparing two objects (let's call them original & desired). The patch that is send to the apiserver then only contains the delta.
This works fine as long as the original object is up-to-date. The problems start when the original object is outdated, e.g. because we read it from the controller-runtime cache and the object was stale. Please note that when patching status & spec the patch helper does not use optimistic locking.
An example:
original KCP object (stale)
desired KCP object
=> patchHelper will send a patch that sets
.status.replicas = 4
.readyReplicas
is not included because patchHelper couldn't see a delta.But the actual KCP object was:
So after patching we will end up with the following KCP object:
At no point did KCP calculate this combination of counter fields, but we still end up with this state in the apiserver because the patch calculation was based on a stale object. This can happen with various values for the counter fields and it gets problematic when other controllers are reading the KCP status and then making decisions based on it, e.g. they might think KCP has been already entirely upgraded if all replica fields are
3
, but this might not actually be the case.This might sound like a contrived example, but we actually had this problem and it's super hard to debug. The Cluster topology controller read KCP status, determined KCP was already entirely upgraded and then triggered worker upgrades, even though the KCP ugprade was still in progress.
One way to solve this would be to not read KCP from the cache and read it from the apiserver directly, but this is not great for scalability and puts more load on the kube-apiserver.
My proposal would be to add an option to patchHelper to ensure a certain set of fields is always part of the patch, if one of the fields changed. Based on the example above this option would be used to ensure
replicas
&readyReplicas
are always patched together.Somewhat related to ongoing work in #11105
The text was updated successfully, but these errors were encountered: