
replies.file_counter unique constraint fails and crashes app #489

Closed
sssoleileraaa opened this issue Jul 23, 2019 · 4 comments · Fixed by #640
Labels
bug Something isn't working
Milestone

Comments

@sssoleileraaa
Contributor

sssoleileraaa commented Jul 23, 2019

Description

Client crashes after sending replies that time out and then pressing refresh when back online.

STR

  1. Log into the client in Qubes.
  2. Cut the connection to the server.
  3. Send a few replies and wait until one of them times out.
  4. Reconnect to the server while the next reply is being processed.
  5. Immediately send another reply and click refresh; the application crashes with `UNIQUE constraint failed: replies.source_id, replies.file_counter`.
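For context, here is a minimal sketch of the constraint that fails. The schema is simplified (the client's actual replies table has more columns and is defined via SQLAlchemy models), but plain sqlite3 is enough to show the composite unique constraint and the exact error:

```python
import sqlite3

# Simplified sketch of the replies table; only the columns involved in
# the failing constraint are shown.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE replies (
        id INTEGER PRIMARY KEY,
        source_id INTEGER NOT NULL,
        file_counter INTEGER NOT NULL,
        UNIQUE (source_id, file_counter)
    )"""
)

conn.execute("INSERT INTO replies (source_id, file_counter) VALUES (1, 7)")
try:
    # A second reply for the same source with the same counter trips
    # the composite constraint.
    conn.execute("INSERT INTO replies (source_id, file_counter) VALUES (1, 7)")
except sqlite3.IntegrityError as exc:
    print(exc)
```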
@sssoleileraaa
Contributor Author

The stack trace is on Qubes, so once I find a transfer device I will copy it over and paste it in this ticket. For now, I can see that this is happening during `update_replies` in `storage.py` when a sync succeeds.

@eloquence eloquence added the bug Something isn't working label Jul 24, 2019
@eloquence eloquence added this to the 0.2.0alpha milestone Jul 24, 2019
@redshiftzero
Contributor

redshiftzero commented Jul 24, 2019

This could be arising due to the following race:

  1. Application sends reply A (a reply send job is added to the queue).
  2. Reply A is saved server-side. The server sets the filename (which is used after a sync to set the file_counter).
  3. A timeout occurs before the 201 response. The reply job is saved as last_job.
  4. A sync starts and gets the list of new replies to add to the database.
  5. The queue starts running last_job.
  6. Job execution gets to https://github.com/freedomofpress/securedrop-client/blob/master/securedrop_client/api_jobs/uploads.py#L43 and the reply is added to the database with file_counter X.
  7. Meanwhile, the sync execution gets to update_replies and tries to add the same reply with file_counter X, hitting the unique constraint.

I think this can still occur in the priority queue case (though it is a bit hard to reproduce because the reply needs to be saved server-side), because metadata syncs will be prioritized higher than the sending of replies.
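The interleaving above can be sketched as follows. This is illustrative, not the client's actual code: the sync snapshots the server's reply list in step 4, the queued send job inserts the reply in step 6, and the sync's insert in step 7 then violates the (source_id, file_counter) constraint:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE replies (source_id INTEGER, file_counter INTEGER,"
    " UNIQUE (source_id, file_counter))"
)

# Step 4: sync fetches the remote reply list; reply A was already saved
# server-side despite the client-side timeout.
remote_replies = [{"source_id": 1, "file_counter": 4}]

# Steps 5-6: the queue runs last_job, which inserts reply A locally.
conn.execute("INSERT INTO replies VALUES (1, 4)")

# Step 7: the sync inserts from its earlier snapshot and hits the
# constraint, crashing the client.
crashed = False
for r in remote_replies:
    try:
        conn.execute(
            "INSERT INTO replies VALUES (?, ?)",
            (r["source_id"], r["file_counter"]),
        )
    except sqlite3.IntegrityError:
        crashed = True
```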

@redshiftzero
Contributor

redshiftzero commented Jul 24, 2019

Here's a proposed resolution that is relevant also for fixing this other bug:

  1. The reply send job first creates a reply row in the database with UUID A if it doesn't exist, setting pending=True for this row. The pending field would be a new field in the replies table, set to False for all items downloaded from the server (because if it's from the server, it's not pending). Once the priority queue is merged and the metadata sync job is implemented in Create a MetadataSyncJob #461, all metadata sync and reply fetch jobs will complete before the reply send job executes, which means no race will occur.
  2. If the reply already existed in the database, the job exits with success.
  3. Otherwise, the job sends the reply to the server, then updates pending=True to pending=False in the corresponding row.

At each of these stages, we should update the GUI based on the state of the database (much cleaner all around, and it also incidentally fixes #294).
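A minimal sketch of that pending-flag flow, using sqlite3 directly; the table layout and function names are illustrative, not the client's actual API. The row is created locally with pending=1 before the network call, and rows downloaded from the server would instead be inserted with pending=0:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE replies (uuid TEXT PRIMARY KEY, pending INTEGER NOT NULL)"
)

def send_reply_job(conn, uuid, send_to_server):
    # Step 1: create the reply row with pending=True if it doesn't exist.
    cur = conn.execute(
        "INSERT OR IGNORE INTO replies (uuid, pending) VALUES (?, 1)", (uuid,)
    )
    # Step 2: if the row already existed, exit with success (no resend).
    if cur.rowcount == 0:
        return "already-present"
    # Step 3: otherwise send to the server, then clear the pending flag.
    send_to_server(uuid)
    conn.execute("UPDATE replies SET pending = 0 WHERE uuid = ?", (uuid,))
    return "sent"
```

The point of the design is idempotence: re-running the job (after a timeout, a queue restart, or a racing sync that already created the row) sends at most once instead of raising an IntegrityError.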

So the proposed actions are:

@redshiftzero
Contributor

Reflecting on this, #461 will address the sync racing the queue actions. That said, we should also keep in mind the possibility that the source sends a message M at almost the same time as the reply gets saved, i.e. before the download job to fetch M is even added to the queue. This won't produce an exception like the one seen in this bug, but it does mean that the ordering of the source's conversation will not be unique. This means that we need to be sure to update the file_counter of the reply once saved, such that the ordering of the source's conversation is unique and consistent across all clients and the server. Once the behavior described here is implemented, we will handle this situation as well (case 4 in that comment).
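A hedged sketch of that follow-up fix: once the server confirms a reply, derive the canonical file_counter from the server-assigned filename (server filenames begin with the counter, e.g. "5-...-reply.gpg"; the exact format here is an assumption for illustration) and overwrite the locally guessed value, so every client orders the conversation identically. Function and schema names are illustrative:

```python
import sqlite3

def update_file_counter(conn, uuid, server_filename):
    # Server filenames are assumed to start with "<counter>-".
    counter = int(server_filename.split("-", 1)[0])
    conn.execute(
        "UPDATE replies SET file_counter = ? WHERE uuid = ?", (counter, uuid)
    )
    return counter

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE replies (uuid TEXT PRIMARY KEY, file_counter INTEGER)"
)
conn.execute("INSERT INTO replies VALUES ('reply-a', 3)")  # local guess
update_file_counter(conn, "reply-a", "5-illustrative-reply.gpg")
```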
