OCPBUGS-60098: podman-etcd: prevent last active member from leaving the etcd member list #2100
Conversation
Should we add a random 0-1 s delay to account for the case where the two agents leave at the same time?
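For reference, a minimal bash sketch of such a jitter (purely illustrative; the agent's actual stop path is not shown):

```sh
# Hypothetical: sleep a random 0-999 ms before "member remove", so two
# agents stopping together are unlikely to leave at the same instant.
# $RANDOM is a bash builtin returning 0-32767.
jitter_ms=$(( RANDOM % 1000 ))
sleep "$(printf '0.%03d' "$jitter_ms")"
```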
This would indeed be a different way to fix the same problem.

Option A (delayed member removal): during a simultaneous graceful shutdown, the nodes would introduce a delay, ensuring that exactly one node removes itself from the etcd cluster, which bumps the revision on the other etcd member. On restart, Pacemaker ensures both nodes are online before starting the agents (just like Option B). Since one of the agents will have the higher revision, this option effectively forces the next etcd cycle to create a new cluster.

Option B (no member removal): both nodes would gracefully stop their etcd processes without explicitly leaving the cluster membership. On restart, Pacemaker ensures both nodes are online before starting the agents, just like Option A; however, we can't predict which agent will have the higher revision, or whether the revisions will even be equal.

I think both options are good; I only have a slight preference for Option B (the current one). Option A slows down the stop procedure, requires extra logic to decide the delay, and, since it depends on the network connection, might still occasionally fail. Option B looks simpler, but it puts more stress on the "restart normally" branch (taken when the revisions are equal), which is the hardest to test right now and so, I believe, the least tested.
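To make the revision comparison concrete, here is a hedged sketch of how an agent could read its local revision on restart, assuming etcdctl v3 with JSON output and jq; the endpoint address is illustrative:

```sh
# Illustrative only: both options ultimately compare etcd revisions on
# restart to decide between "force new cluster" and "restart normally".
local_rev=$(etcdctl --endpoints="http://127.0.0.1:2379" endpoint status \
    --write-out=json | jq '.[0].Status.header.revision')
ocf_log info "local etcd revision: ${local_rev}"
```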
podman-etcd: prevent last active member from leaving the etcd member list

When stopping etcd instances, simultaneous member removal from both nodes can corrupt the etcd Write-Ahead Log (WAL). This change implements a two-part solution:

1. Concurrent stop protection: when multiple nodes are stopping, the alphabetically second node delays its member removal by 10 seconds. This prevents the simultaneous member-list updates that can corrupt the WAL.
2. Last member detection: check the active resource count after any delay. If this is the last active member, skip member removal to avoid leaving an empty cluster.

Additionally, podman_stop() is reordered to clear the member_id attribute after leaving the member list, ensuring the attribute reflects the actual cluster state during shutdown.
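A hedged sketch of part 1 (concurrent stop protection), assuming bash; get_stopping_nodes is a hypothetical placeholder, while ocf_local_nodename and ocf_log come from the resource-agents shell library:

```sh
# Sketch: when more than one node is stopping, serialize the member-list
# updates by making the alphabetically second node wait.
stopping_nodes=$(get_stopping_nodes)   # hypothetical helper, e.g. "node-a node-b"
if [ "$(echo "$stopping_nodes" | wc -w)" -gt 1 ]; then
    first=$(echo "$stopping_nodes" | tr ' ' '\n' | sort | head -n 1)
    if [ "$(ocf_local_nodename)" != "$first" ]; then
        ocf_log info "concurrent stop detected: delaying member removal by 10s"
        sleep 10
    fi
fi
```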
When stopping an etcd instance, the agent should not leave the member list if it's the last active agent in the cluster. Leaving the member list in this scenario can cause WAL corruption.
This change introduces a check for the number of active resources before attempting to leave the member list. If no other active resources are found, the agent will log a message and skip the leave operation.
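For illustration, a sketch of what that check could look like; the clone name podman-etcd, the crm_resource --locate query, and the $member_id variable (the attribute mentioned in the commit message) are assumptions rather than the agent's actual code:

```sh
# Illustrative: count the nodes where the resource is currently active;
# if we are the last one, skip leaving the etcd member list.
active=$(crm_resource --resource podman-etcd --locate 2>/dev/null | wc -l)
if [ "$active" -le 1 ]; then
    ocf_log info "last active member: skipping etcd member-list removal"
else
    etcdctl member remove "$member_id"
fi
```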
NOTE: the check on `standalone_node` might not be enough if both agents stop at roughly the same time, since neither of them has enough time to set the attribute.

Fixes: OCPBUGS-60098