Description
I am running into a situation in which events from the normal watches on kinds, combined with a watch on an event channel, create a race condition that produces an error when updating a resource.
A reconcile cycle is triggered on a resource and the resource is updated. This changes its resourceVersion in etcd, but not yet in the local cache. Immediately afterwards, another reconcile cycle is triggered via the event channel.
The first thing that happens in the reconcile cycle is a lookup of the resource named in the current request. This uses the manager-provided client, which finds the resource in the cache. The cache has not been updated yet, so the resource still carries the old resourceVersion. The reconcile cycle then runs to completion, and when the code tries to update the status it gets an error.
This is generally innocuous, because the error just triggers another reconcile cycle and eventually the resource is correctly updated in the local cache. However, it is annoying and confuses users.
I'd like to know if it's possible to fix this. I suppose one way would be to not use the manager's client when looking up the resource and to accept the performance hit that comes with that, but I wanted to ask whether the framework could provide a solution. For example, by forcing a cache invalidation when an event comes from the generic channel.
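For reference, the workaround I have in mind looks roughly like the sketch below: on a conflict, re-read the object directly from the server instead of the cache and try again. This is again a stdlib-only simulation with hypothetical names (`apiServer`, `object`, `errConflict`, `updateStatus`), not controller-runtime code. In a real controller I would expect the uncached read to go through something like the manager's `GetAPIReader()` and the loop to be `retry.RetryOnConflict` from client-go, but I have not verified that this is the recommended approach.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict is a hypothetical stand-in for the update error.
var errConflict = errors.New("conflict: stale resourceVersion")

// object is a hypothetical stand-in for the resource.
type object struct {
	resourceVersion int
	status          string
}

// apiServer is a hypothetical stand-in for the API server.
type apiServer struct{ stored object }

// get models an uncached read that always returns the latest version.
func (s *apiServer) get() object { return s.stored }

// update rejects writes that carry a stale resourceVersion.
func (s *apiServer) update(o object) error {
	if o.resourceVersion != s.stored.resourceVersion {
		return errConflict
	}
	o.resourceVersion++
	s.stored = o
	return nil
}

// updateStatus starts from a possibly stale copy and, on conflict,
// re-reads from the server (bypassing the cache) before retrying.
// It returns the number of attempts made and the final error.
func updateStatus(s *apiServer, stale object, status string, maxAttempts int) (int, error) {
	obj := stale
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		obj.status = status
		err := s.update(obj)
		if !errors.Is(err, errConflict) {
			return attempt, err
		}
		obj = s.get()
	}
	return maxAttempts, errConflict
}

func main() {
	s := &apiServer{stored: object{resourceVersion: 2}}
	stale := object{resourceVersion: 1} // what the lagging cache returned

	attempts, err := updateStatus(s, stale, "ready", 3)
	fmt.Println(attempts, err)
}
```

This hides the error from users, at the cost of an extra round trip to the server whenever the cache is behind, which is the performance hit I mentioned.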