@Vincent-lau
Contributor

This forces a failure on a host that is trying to perform a corosync upgrade. There are ways to recover. If the failure happens early, before the cluster is created in the DB, then a recreate ought to fix the problem; this is the case when the corosync upgrade fails on the coordinator.

If the failure happens on a pool member, after the cluster has been created, then a `pool-resync` should help retry the upgrade.

This should simulate some of the failure paths, but it is by no means exhaustive. Other, more complicated failures are not easily recoverable and are therefore not simulated for now.
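
For reviewers, the two recovery paths described above can be summarised as a small decision function. This is only an illustrative sketch in Python, not the code in this PR: `FailurePoint` and `recovery_action` are hypothetical names, and the returned strings are just labels for the recovery steps named in the description.

```python
from enum import Enum


class FailurePoint(Enum):
    # Hypothetical labels for where the forced failure fires.
    BEFORE_CLUSTER_IN_DB = "before the cluster is created in the DB (coordinator)"
    AFTER_CLUSTER_IN_DB = "after the cluster is created (pool member)"


def recovery_action(point: FailurePoint) -> str:
    """Map a simulated failure point to the recovery step named in the description."""
    if point is FailurePoint.BEFORE_CLUSTER_IN_DB:
        # Nothing has been recorded in the DB yet, so a recreate ought to work.
        return "recreate"
    # The cluster already exists, so a resync should retry the upgrade on the member.
    return "pool-resync"


if __name__ == "__main__":
    for point in FailurePoint:
        print(f"{point.value}: {recovery_action(point)}")
```

The more complicated failures that are not simulated are deliberately left out of the sketch as well.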

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
@Vincent-lau
Contributor Author

Manually tested, looks good. Merging it now.

@Vincent-lau Vincent-lau merged commit d127cdd into xapi-project:master May 22, 2024