Fix recovery when terms are accidentally empty #3099
Merged
This fixes an issue that occurs when a node is shut down
(via SIGTERM) while the queues, and more specifically
the queue index, are recovering. At that point
rabbit_recovery_terms has already started, and when
it starts it calls dets:open_file/2, which creates an
empty recovery.dets file. When the node is then
restarted, it thinks the shutdown was clean
because the recovery file is there, but the file is empty
and therefore the queues have lost all their state.
This results in RabbitMQ thinking there are 0 messages
in all classic queues.
To avoid this issue, we consider a shutdown to be dirty
in the case where we have a recovery file BUT we do not
find our state in the recovery terms.
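The decision described above can be modelled in a few lines. This is only an illustrative sketch in Python, not the Erlang code changed by this PR: the function names, the dictionary standing in for the contents of recovery.dets, and the `queue-a` key are all hypothetical and chosen for the example.

```python
from typing import Mapping


def old_is_clean_shutdown(recovery_file_exists: bool) -> bool:
    """Model of the behaviour before the fix (assumed from the description):
    the mere presence of the recovery file is treated as a clean shutdown."""
    return recovery_file_exists


def new_is_clean_shutdown(
    recovery_file_exists: bool,
    terms: Mapping[str, object],
    queue_key: str,
) -> bool:
    """Model of the behaviour after the fix: the shutdown only counts as
    clean if the recovery file exists AND this queue's state is found in
    the recovery terms. `terms` is a hypothetical stand-in for the
    contents of recovery.dets."""
    return recovery_file_exists and queue_key in terms


if __name__ == "__main__":
    # The file exists but is empty, as after a SIGTERM during recovery.
    empty_terms: dict = {}
    print(old_is_clean_shutdown(True))                          # True
    print(new_is_clean_shutdown(True, empty_terms, "queue-a"))  # False
```

With an empty terms table the old check reports a clean shutdown while the new check reports a dirty one, which is what forces the queue to rebuild its state instead of trusting the empty file.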
To reliably reproduce the issue this fixes:
1. Start a node
2. Fill it with many messages (800k is more than enough)
3. Wait a little and then kill the node via Ctrl+C twice (to force dirty recovery next start)
4. Start the node again
5. While it says "Starting broker", after waiting about 5 seconds, send a SIGTERM (killall beam.smp) to shut down the node "cleanly"
6. Start the node again
7. Management will show 0 messages in all classic queues
Types of Changes
What types of changes does your code introduce to this project? Put an x in the boxes that apply
Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask on the mailing list. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
CONTRIBUTING.md document