Audit KCP codebase for re-entrancy & error handling of non-key space operations #11184
Labels
kind/bug
Categorizes issue or PR as related to a bug.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
There were a few interesting threads about error management for etcd's non-key space operations.
Serializable
etcd-io/etcd#18424 (comment)
As a first reaction, I think in KCP we are generally ok, because errors reported by etcd are usually handled by re-entrancy, which implies we re-assess the current state of the world before deciding the course of action.
But this is also a good chance to audit the code base for when we use non-key space operations, mostly remove member and forward leadership.
NOTE: add member/join is a slightly different case, because we rely on kubeadm for it.
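To make the re-entrancy argument concrete, here is a minimal sketch of the pattern for member removal. It is not the KCP implementation: `memberAPI`, `fakeCluster`, `reconcileRemoval` and `runUntilDone` are hypothetical names invented for illustration, and the fake assumes the worst case where a remove call is applied by the cluster but still returns an error to the caller.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// memberAPI is a hypothetical, minimal stand-in for the etcd membership
// calls; it is NOT the real etcd client or the KCP etcd wrapper.
type memberAPI interface {
	ListMembers(ctx context.Context) ([]uint64, error)
	RemoveMember(ctx context.Context, id uint64) error
}

var errAmbiguous = errors.New("remove member: outcome unknown")

// fakeCluster applies the first removal but reports an error anyway
// (assumption: the caller cannot tell from the error whether it was applied).
type fakeCluster struct {
	members        map[uint64]bool
	failNextRemove bool
	removeCalls    int
}

func newFakeCluster(ids ...uint64) *fakeCluster {
	c := &fakeCluster{members: map[uint64]bool{}, failNextRemove: true}
	for _, id := range ids {
		c.members[id] = true
	}
	return c
}

func (c *fakeCluster) ListMembers(_ context.Context) ([]uint64, error) {
	ids := make([]uint64, 0, len(c.members))
	for id := range c.members {
		ids = append(ids, id)
	}
	return ids, nil
}

func (c *fakeCluster) RemoveMember(_ context.Context, id uint64) error {
	c.removeCalls++
	delete(c.members, id) // the change is applied...
	if c.failNextRemove {
		c.failNextRemove = false
		return errAmbiguous // ...but the caller still sees an error
	}
	return nil
}

// reconcileRemoval is one re-entrant pass: it re-reads the membership first
// and only acts if the member is still present. It never tries to interpret
// a RemoveMember error; the next pass re-assesses the state instead.
func reconcileRemoval(ctx context.Context, c memberAPI, id uint64) (bool, error) {
	ids, err := c.ListMembers(ctx)
	if err != nil {
		return false, err
	}
	present := false
	for _, m := range ids {
		if m == id {
			present = true
			break
		}
	}
	if !present {
		return true, nil // nothing left to do
	}
	if err := c.RemoveMember(ctx, id); err != nil {
		return false, err // requeue; do not guess whether it was applied
	}
	return false, nil // confirm on the next pass
}

// runUntilDone mimics a controller requeueing until the pass reports done.
func runUntilDone(c memberAPI, id uint64, maxPasses int) (int, error) {
	var lastErr error
	for pass := 1; pass <= maxPasses; pass++ {
		done, err := reconcileRemoval(context.Background(), c, id)
		if done {
			return pass, nil
		}
		lastErr = err
	}
	return maxPasses, fmt.Errorf("member %d still present after %d passes (last error: %v)", id, maxPasses, lastErr)
}

func main() {
	c := newFakeCluster(1, 2)
	passes, err := runUntilDone(c, 2, 5)
	fmt.Println(passes, err, c.removeCalls)
}
```

In this sketch the failed call is not retried blindly: the second pass lists members, sees the member is already gone, and stops. The audit would be about checking that the remove member and forward leadership call sites actually behave like this.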
PS. I classified this as a bug because I didn't know exactly which kind to use 😅, but to be clear, we are not aware of bugs in this area; this issue is to double-check that our codebase is robust enough to handle the edge cases described in the comment above.