Skip to content

Commit 86ff7ad

Browse files
committed
Auto merge of #3414 - pietroalbini:deadlock, r=jtgeibel
Prevent transaction deadlocks in DownloadsCounter While analyzing the behavior of `DownloadsCounter` (introduced in #3413) in production, we noticed that there were a few cases of transaction deadlocks tonight in the background thread persisting the download counts. I did some investigation on the cause of this and came up with this fix. As [recommended by the PostgreSQL docs](https://www.postgresql.org/docs/11/explicit-locking.html#LOCKING-DEADLOCKS), to avoid deadlocks when multiple transactions affect multiple rows in a table, the affected keys needs to be modified in a consistent order across the transactions. We implement that by sorting by `version_id` the items we're going to insert. <details><summary>Local reproduction instructions</summary> The exact error we saw in production can be reproduced by opening two instances of `psql` locally and running those two set of queries in each `psql` (without running `COMMIT`): ```sql -- First instance of psql BEGIN; INSERT INTO version_downloads (version_id) VALUES (1) ON CONFLICT (version_id, date) DO UPDATE SET downloads = excluded.downloads + 1; -- Second instance of psql BEGIN; INSERT INTO version_downloads (version_id) VALUES (2) ON CONFLICT (version_id, date) DO UPDATE SET downloads = excluded.downloads + 1; ``` Then, once the commands are sent to both instances of `psql`, sending these other commands will cause the deadlock: ```sql -- First instance of psql INSERT INTO version_downloads (version_id) VALUES (2) ON CONFLICT (version_id, date) DO UPDATE SET downloads = excluded.downloads + 1; -- Second instance of psql INSERT INTO version_downloads (version_id) VALUES (1) ON CONFLICT (version_id, date) DO UPDATE SET downloads = excluded.downloads + 1; ``` Instead, if both transactions try to insert first `1` and then `2` the first `psql` will work fine, and the second `psql` will wait for the first `psql` to `COMMIT` before applying the changes. That's the behavior we want and it's the one this PR implements. </details> r? `@jtgeibel`
2 parents 4019c27 + 426c635 commit 86ff7ad

File tree

1 file changed

+22
-1
lines changed

1 file changed

+22
-1
lines changed

src/downloads_counter.rs

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -126,10 +126,31 @@ impl DownloadsCounter {
126126
counted_downloads += count;
127127
counted_versions += 1;
128128

129-
to_insert.push((version_id.eq(*key), downloads.eq(count as i32)));
129+
to_insert.push((*key, count));
130130
}
131131

132132
if !to_insert.is_empty() {
133+
// The rows we're about to insert need to be sorted to avoid deadlocks when multiple
134+
// instances of crates.io are running at the same time.
135+
//
136+
// In PostgreSQL a transaction modifying a row locks that row until the transaction is
137+
// committed. Multiple transactions inserting rows into a table could end up
138+
// deadlocking each other though: PostgreSQL will detect that deadlock, abort one of
139+
// the transactions and allow the other one to continue. We don't want that to happen,
140+
// as we'd lose the downloads from the aborted transaction.
141+
//
142+
// Ensuring the rows are inserted in a consistent order (in our case by sorting them by
143+
// the version ID) will prevent deadlocks from occuring. For more information:
144+
//
145+
// https://www.postgresql.org/docs/11/explicit-locking.html#LOCKING-DEADLOCKS
146+
//
147+
to_insert.sort_by_key(|(key, _)| *key);
148+
149+
let to_insert = to_insert
150+
.into_iter()
151+
.map(|(key, count)| (version_id.eq(key), downloads.eq(count as i32)))
152+
.collect::<Vec<_>>();
153+
133154
diesel::insert_into(version_downloads)
134155
.values(&to_insert)
135156
.on_conflict((version_id, date))

0 commit comments

Comments
 (0)