First off, thanks for getting this framework together. I've enjoyed hacking around :-)

This might be more of a request for the python kube client, as it appears to be lacking event aggregation functionality similar to that found in the go client.

Occasionally an operator may get stuck in a retry loop. If many handlers are failing with retryable errors, a large number of events will be generated, putting stress on etcd and making the output of kubectl get events very hard to work with.
Expected Behavior
Duplicate or "near duplicate" events are aggregated.
$ oc get events
LASTSEEN   FIRSTSEEN   COUNT   NAME                 KIND               SUBOBJECT   TYPE    REASON              SOURCE   MESSAGE
20m        20m         1234    my-custom-resource   MyCustomResource               Error   HandlerRetryError   kopf     Handler 'on_delete' failed. Will retry. ['errors']
Actual Behavior
Every event generated by kopf is a new event in Kubernetes, which, if I understand correctly, puts undue load on etcd.
Steps to Reproduce the Problem

1. Write any handler that gets stuck in a retry loop and observe kubectl get events.
2. Install any CRD into the cluster using a handler like the one below. (In this case you'd have to invoke the handler by creating a new "MyCustomResource".)
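A minimal sketch of such a handler, assuming a reasonably recent kopf; the group/version/plural names and the delay are made-up placeholders, not taken from this issue:

import kopf

# Hypothetical CRD coordinates -- substitute your own group/version/plural.
@kopf.on.create('example.com', 'v1', 'mycustomresources')
def on_create(spec, **kwargs):
    # Fail on every attempt so kopf keeps retrying the handler and
    # posts a fresh event for each retry attempt.
    raise kopf.TemporaryError("Simulated retryable failure.", delay=10)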
@mzizzi Do you mean the in-memory event accumulation, aggregation, and then posting only the aggregated events every few seconds/minutes/events?

Or is this also about the event patching with "lastTimestamp", "count", and some other field updates? That implies one API request per event anyway, just a PATCH rather than a POST, but it would make the kubectl get events output shorter.
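As a rough sketch of that second option (not code from kopf or this thread), bumping the count and lastTimestamp of an already-posted Event with the official kubernetes Python client could look like this; the event name and namespace are invented placeholders:

import datetime
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Read the previously posted Event, bump its counters, and PATCH it back.
# The event name and namespace below are made-up values.
name, namespace = "my-custom-resource.15f4a1b2c3d4e5f6", "default"
existing = v1.read_namespaced_event(name, namespace)
patch = {
    "count": (existing.count or 1) + 1,
    "lastTimestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
}
v1.patch_namespaced_event(name, namespace, patch)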
@nolar Good question. I hadn't made the distinction when I originally posted the question.

After reading more into how the go client works: it uses a combination of rate-limiting, in-memory caching, and event patching. That solves both potential issues that you highlighted:

- Load introduced by many POST/PATCH requests for events
- Load due to excessive amounts of events being stored in kube

Incorporating some (or all!) of these features will help us create well-behaved Operators.
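For a sense of what that combination could look like in Python (purely illustrative; this is not kopf's or client-go's actual code, and every name here is invented), an aggregator might key events by their content, count duplicates in memory, and flush them at a bounded rate:

import time
from collections import defaultdict

class EventAggregator:
    # Illustrative only: identical events are counted in memory and flushed
    # at most once per interval via a caller-supplied function, which can
    # POST a new Event or PATCH the count/lastTimestamp of an earlier one.

    def __init__(self, emit, flush_interval=5.0):
        self.emit = emit                    # callable(key, count) doing the API call
        self.flush_interval = flush_interval
        self.counts = defaultdict(int)      # (namespace, object, reason, message) -> count
        self.last_flush = time.monotonic()

    def record(self, namespace, obj, reason, message):
        self.counts[(namespace, obj, reason, message)] += 1
        if time.monotonic() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        for key, count in self.counts.items():
            self.emit(key, count)           # one request per distinct event
        self.counts.clear()
        self.last_flush = time.monotonic()

# Usage: a thousand identical failures collapse into a single emit() per flush.
agg = EventAggregator(emit=lambda key, count: print(key, count))
for _ in range(1000):
    agg.record("default", "my-custom-resource", "HandlerRetryError", "Handler 'on_delete' failed.")
agg.flush()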