Skip to content

Improve patchHelper to allow atomically updating a set of fields #11401

Open
@sbueringer

Description

@sbueringer

Today patchHelper calculates patches by comparing two objects (let's call them original & desired). The patch that is send to the apiserver then only contains the delta.

This works fine as long as the original object is up-to-date. The problems start when the original object is outdated, e.g. because we read it from the controller-runtime cache and the object was stale. Please note that when patching status & spec the patch helper does not use optimistic locking.

An example:

original KCP object (stale)

status:
  replicas: 3
  readyReplicas: 3

desired KCP object

status:
  replicas: 4
  readyReplicas: 3

=> patchHelper will send a patch that sets .status.replicas = 4. readyReplicas is not included because patchHelper couldn't see a delta.

But the actual KCP object was:

status:
  replicas: 3
  readyReplicas: 2

So after patching we will end up with the following KCP object:

status:
  replicas: 4
  readyReplicas: 2

At no point did KCP calculate this combination of counter fields, but we still end up with this state in the apiserver because the patch calculation was based on a stale object. This can happen with various values for the counter fields and it gets problematic when other controllers are reading the KCP status and then making decisions based on it, e.g. they might think KCP has been already entirely upgraded if all replica fields are 3, but this might not actually be the case.

This might sound like a contrived example, but we actually had this problem and it's super hard to debug. The Cluster topology controller read KCP status, determined KCP was already entirely upgraded and then triggered worker upgrades, even though the KCP ugprade was still in progress.

One way to solve this would be to not read KCP from the cache and read it from the apiserver directly, but this is not great for scalability and puts more load on the kube-apiserver.

My proposal would be to add an option to patchHelper to ensure a certain set of fields is always part of the patch, if one of the fields changed. Based on the example above this option would be used to ensure replicas & readyReplicas are always patched together.

Somewhat related to ongoing work in #11105

Metadata

Metadata

Assignees

Labels

kind/featureCategorizes issue or PR as related to a new feature.priority/important-soonMust be staffed and worked on either currently, or very soon, ideally in time for the next release.triage/acceptedIndicates an issue or PR is ready to be actively worked on.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions