Backport test fixes from main to v4.0.x
#13378
Merged
[Why] The `force_reset` command simply removes local files on disk for the local node. In the case of Ra, this can't work because the rest of the cluster does not know about the forced-reset node. Therefore the leader will continue to send `append_entry` commands to the reset node. If that forced-reset node restarts and receives these messages, it will either join the cluster again (because it's on an older Raft term) or it will hit an assertion and exit (because it's on the same Raft term). [How] Given we can't really support this scenario and it has little value, the command will now return an error if someone attempts a `force_reset` with a node running Khepri. This also deprecates the command: once Mnesia support is removed, the command will be removed at the same time. This is noted in the rabbitmqctl.8 manpage. (cherry picked from commit c78aec7)
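To illustrate the behaviour change described in this commit, here is a minimal Python sketch of such a guard. It is not the actual RabbitMQ implementation (which is written in Erlang); `MetadataStore`, `force_reset` as a Python function, and the `wipe_local_files` callback are all hypothetical names used only for illustration.

```python
from enum import Enum


class MetadataStore(Enum):
    """Hypothetical stand-in for the metadata store a node is running."""
    MNESIA = "mnesia"
    KHEPRI = "khepri"


def force_reset(store, wipe_local_files):
    """Hypothetical guard: refuse a forced reset when the node runs Khepri.

    `wipe_local_files` is an assumed callback that removes the node's
    local files; it is only invoked for the Mnesia case.
    """
    if store is MetadataStore.KHEPRI:
        # The other cluster members would not learn about the reset,
        # so the operation is rejected instead of wiping local state.
        return ("error", "force_reset is unsupported with Khepri")
    wipe_local_files()
    return ("ok", None)
```

In this sketch, the important property is that the Khepri branch returns before any local state is touched.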
[Why] We hit some transient errors with the previous order when doing mixed-version testing. Swapping the nodes seems to fix the problem. (cherry picked from commit 5cbda4c)
... are being used at the same time. [Why] Depending on which node clusters with which, a node running an older version of the Khepri Ra machine may not be able to apply Ra commands and could get stuck. There is no real solution and this is clearly an unsupported scenario. An old node won't always be able to join a newer cluster. [How] In the testsuites, we skip clustering tests if we detect that multiple Khepri Ra machine versions are being used. (cherry picked from commit 1f1a135)
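The skip condition described in this commit can be sketched roughly as follows. This is an illustrative Python sketch only; the real check lives in the Erlang testsuites, and `should_skip_clustering_tests` and its list-of-versions input are assumptions made for this example.

```python
def should_skip_clustering_tests(machine_versions):
    """Return True when the nodes report more than one distinct
    Khepri Ra machine version (hypothetical helper).

    `machine_versions` is assumed to be an iterable holding one
    version number per node taking part in the test.
    """
    return len(set(machine_versions)) > 1


# All nodes on the same machine version: clustering tests can run.
same_version = should_skip_clustering_tests([1, 1, 1])

# Mixed machine versions: clustering tests would be skipped.
mixed_versions = should_skip_clustering_tests([0, 1, 1])
```

Comparing the set of distinct versions keeps the check independent of how many nodes take part or of which node runs the older version.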
[Why] During mixed-version testing, the old node might not be able to join or rejoin a cluster if the other nodes run a newer Khepri machine version. [How] The old node is used as the cluster seed node and is never touched otherwise. Other nodes are restarted or join the cluster later. (cherry picked from commit e76233a)
… with Khepri [Why] This test plays with the Mnesia database explicitly. (cherry picked from commit f088c4f)
[Why] We see nodes trying to use busy ports in CI from time to time. (cherry picked from commit e76c227)
... in retry_if_coordinator_unavailable(). (cherry picked from commit ee0b5b5)
(cherry picked from commit b7c9e64)
(cherry picked from commit 64b68e5)
This may help debug nodes that try to open busy ports. (cherry picked from commit a5f30ea)
The fixes come from the following pull requests:
- `force_reset` command is unsupported with Khepri #13217
- `format` test #13234
- `parallel-ct-set-*` #13329

They are backported together to reduce the number of pull requests and the load on CI. Also, CI would likely fail a lot more with one of the fixes missing.
There is still work to do to fix all test flakes, but backporting these will already bring an improvement for the `v4.0.x` branch.