Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add queue clearing logic to the Redis migrations #679

Merged
merged 1 commit into from
Oct 31, 2022

Conversation

svix-daniel
Copy link
Contributor

@svix-daniel svix-daniel commented Oct 26, 2022

As a followup to #671, this change adds logic like the Redis migration functions which clears the queue of acknowledged, stale tasks up to the lowest pending ID. This is run every thirty seconds until there has been a successful trimming of tasks.

The following is a brief explanation of how Redis streams work:

If you XREADGROUP it'll be added to the PEL (pending entries list) and thus show up in XPENDING until it has been XACKed. But it'll never show up in XREADGROUP with that specific consumer group again.

Once XACK ed it removes the reference in the PEL, and will show up in neither XPENDING or XREADGROUP.

It will however remain in the stream and show up via XREAD regardless of whether it is ACKed, but we don't do this anywhere. We instead use XAUTOCLAIM on items that have been read but not XACKed for more than 45 seconds and reinsert them into the queue.

While it doesn't show up in XREADGROUP, it's still there consuming memory until the item has been XDELed from the stream.

The bug was that we would XACK tasks, but not XDEL them at any point up until the changes of #671. However, #671 only fixes tasks that are acknowledged after updating the server instance. This PR is meant to clear stale queue tasks that were acknowledged before updating.

The approach here is to use XPENDING to get the lowest ID that has been read but not ACKed (knowing the server reads from the stream by order of entry), and to XTRIM up to that point exclusive. Should there be no items in the XPENDING return, then the logic will loop with a 30 second delay until there are entries in the PEL at the time of reading. This is an imperfect solution, but it is still likely that the PEL will have at least one entry within a reasonable amount of time given a high-load queue.

Given there isn't high load and XPENDING never returns a non-empty array, it is assumed that this bug has not consumed enough memory to make clearing the queue automatically a great necessity.

@svix-daniel svix-daniel force-pushed the daniel/clean-up-queue branch 5 times, most recently from b4fa3a0 to 055f369 Compare October 26, 2022 17:58
svix-james
svix-james previously approved these changes Oct 28, 2022
@svix-daniel svix-daniel merged commit 301c262 into main Oct 31, 2022
@svix-daniel svix-daniel deleted the daniel/clean-up-queue branch October 31, 2022 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants