-
Notifications
You must be signed in to change notification settings - Fork 80
Remove race condition - Fixes 2143 #2203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2203 +/- ##
==========================================
- Coverage 92.04% 92.01% -0.03%
==========================================
Files 191 191
Lines 19515 19543 +28
==========================================
+ Hits 17963 17983 +20
- Misses 1552 1560 +8
Continue to review full report at Codecov.
|
@josenavas, just FYI, the PR actually has failed tests. |
Ready for review! |
AND processing_job_status != %s""" | ||
qdb.sql_connection.TRN.add(sql, [self.id, 'waiting']) | ||
AND processing_job_status NOT IN %s""" | ||
sql_args = [self.id, ('waiting', 'error')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on the description it sounds like here it should only be waiting, right? If this is incorrect, what happens with erred jobs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a comment explaining this, but basically here we want to wait until all validators are completed. Validator jobs can be in two states when completed: 'waiting' -> success and 'error' -> error
qiita_db/test/test_processing_job.py
Outdated
@@ -476,7 +483,8 @@ def test_complete_success(self): | |||
|
|||
obsjobs = set(self._get_all_job_ids()) | |||
|
|||
self.assertEqual(len(obsjobs), len(alljobs) + 1) | |||
# The complete submits the release validators job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add an explanation of why + 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments addressed - I would, however, hold on merging as I would like to deploy this in the test environment first to make sure that everything is working as expected - since I could only perform limited testing on my machine.
AND processing_job_status != %s""" | ||
qdb.sql_connection.TRN.add(sql, [self.id, 'waiting']) | ||
AND processing_job_status NOT IN %s""" | ||
sql_args = [self.id, ('waiting', 'error')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a comment explaining this, but basically here we want to wait until all validators are completed. Validator jobs can be in two states when completed: 'waiting' -> success and 'error' -> error
qiita_db/test/test_processing_job.py
Outdated
@@ -476,7 +483,8 @@ def test_complete_success(self): | |||
|
|||
obsjobs = set(self._get_all_job_ids()) | |||
|
|||
self.assertEqual(len(obsjobs), len(alljobs) + 1) | |||
# The complete submits the release validators job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I deployed this in the test environment and executed ~10 deblur jobs. All of them succeeded and I haven't seen the race condition occurring again. I'm pretty confident that the issue has been solved because conceptually the race condition has been removed. |
Fixes #2143
Adds a new internal command:
release_validators
. The validators themselves no longer try to free up the parent job. Therelease_validators
is a command that performs an active polling checking for the jobs to complete. It is automatically submitted to the queue after the validators have been submitted.