-
Notifications
You must be signed in to change notification settings - Fork 650
Ensure the update_downloads job doesn't run concurrently #2266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure the update_downloads job doesn't run concurrently #2266
Conversation
match diesel::select(pg_advisory_unlock(self.key)).get_result(self.conn) { | ||
Ok(true) => (), | ||
Ok(false) => println!( | ||
"Error: job advisory lock for key {} was not locked", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be a panic, to forbid the callback from releasing the lock on its own.
"Error: job advisory lock for key {} was not locked", | ||
self.key | ||
), | ||
Err(err) => println!("Error unlocking advisory lock (key: {}): {}", self.key, err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here. We could do something (or just panic) to ensure the connection is closed and the lock unlocked.
☔ The latest upstream changes (presumably #2269) made this pull request unmergeable. Please resolve the merge conflicts. |
This is an improved implementation of rust-lang#2157. The previous design relied on a transaction based lock to manage the lifetime of the lock. The wrapper transaction caused the `update_downloads` job to interfere with incoming download requests, and the changes had to be reverted. This implementation uses a session lock which is automatically released even if the callback panics. If multiple instances of the `update_downloads` job are run concurrently then it is possible to over count downloads, at least temporarily. The first job selects all matching `version_downloads` and later uses those values to calculate how many downloads to add to `versions` and `crates`. If a second job is run, it would select some rows from `version_downloads` that were already queued for processing by the first task. If an over count were to occur, the next time the job is run it should calculate a negative adjustment and correct the situation. There's no point in doing extra work and if we eventually need concurrency we should built that out intentionally. Therefore, this commit wraps the entire job in a transaction and obtains an transaction level advisory lock from the database. If the lock has already been taken the job will fail and will be retried by swirl. If the duration of this job begins to approach the scheduling interval, then we will want to increase that interval to avoid triggering alerts.
0d62fe1
to
4be63c4
Compare
☔ The latest upstream changes (presumably #2328) made this pull request unmergeable. Please resolve the merge conflicts. |
Closing as superseded by #2539. We've now seen that fix work correctly several times in production. |
This is an improved implementation of #2157. The previous design
relied on a transaction based lock to manage the lifetime of the lock.
The wrapper transaction caused the
update_downloads
job to interferewith incoming download requests, and the changes had to be reverted.
This implementation uses a session lock which is automatically released
even if the callback panics.
r? @sgrif