Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(core): resolve race in IAsyncResult.wait() #487

Merged
merged 9 commits into from
Nov 10, 2017

Conversation

jgh-ds
Copy link
Contributor

@jgh-ds jgh-ds commented Nov 6, 2017

closes #485

jeffwidman
jeffwidman previously approved these changes Nov 6, 2017
Copy link
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

Would you be willing to add a test for this? Looking at your bug.py, this looks very straightforward to test...

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 6, 2017

I thought about adding a test, but I didn't do it because a failure would cause the test code to hang forever. However, it would be very simple to do. Should I add one?

@jgh-ds jgh-ds closed this Nov 6, 2017
@jgh-ds jgh-ds reopened this Nov 6, 2017
@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 6, 2017

I didn't mean to close it.

def wait_for_val():
try:
# NB: should not sleep
val = async_result.wait(10)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local variable 'val' is assigned to but never used

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 6, 2017

Updated jgh-ds:master with a new test case for the issue: test_wait_race()

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 6, 2017

It should be ready to merge now, including a test case.

Thanks.

@@ -200,6 +200,28 @@ def wait_for_val():
eq_(lst, [True])
th.join()

def test_wait_race(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about adding a test, but I didn't do it because a failure would cause the test code to hang forever.

Oh yeah. I forgot about that. 😄

Your test looks good to me, however it's just complicated enough that it's not immediately apparent what you're trying to test. So do you mind adding a simple docstring that explains what this is trying to test/guard against?

Maybe something like:

"""Test that there is no race condition in `IAsyncResult.wait()`.
     
Guards against https://github.com/python-zk/kazoo/issues/485 reappearing.
""""

async_result.wait(10)
cv.set()
except ImportError:
pass
Copy link
Member

@jeffwidman jeffwidman Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed? what would cause an ImportError?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from test_wait(). I'll remove it.

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 6, 2017

Added the doc string and removed the unnecessary try/except from the test case. Also verified that failing the test case will not hang the testing process.

jeffwidman
jeffwidman previously approved these changes Nov 7, 2017
Copy link
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good to me, but I'd prefer to have one more pair of eyes before merging.

Hopefully @hannosch / @harlowja / @bbangert can take a look as well

@jeffwidman
Copy link
Member

I'm going to go ahead and merge. The test is the only part I'm slightly unsure of, but I am confident the race condition exists and your fix looks good.

@jeffwidman jeffwidman closed this Nov 7, 2017
@jeffwidman jeffwidman reopened this Nov 7, 2017
@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

I've determined that the part of the test which is:

        if not cv.is_set():
            async_result.set("kick")

is not needed (but is harmless). I was thinking that it would be necessary in the failure case, but the assertion failure from eq_(_) keeps it from being run in that case anyway. Since the thread has a timeout anyway, it cannot hang the testing process.

Would it be better to remove it?

@jeffwidman
Copy link
Member

jeffwidman commented Nov 8, 2017

Yes, please remove it. No one later will realize it can be removed.

Re: the merge, I accidentally clicked the wrong button, then had to wait til tests passed... once you make this change and tests pass, I will merge.

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

The dead code in the test is removed now.

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

Apparently the removed code is needed in Python 3.5 and 3.6 to avoid hanging.

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

The test failures are not on the new test case. The actual IAsyncResult.wait() code is the same as the first commit, where all the tests pass.

Could other test cases depend on the old (broken) behavior? The latest errors are on test_race_condition_new_partitioner_steals_the_lock

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

I've verified that the failing test case in test_race_condition_new_partitioner_steals_the_lock does not use the modified IAsyncResult.wait() code.
It appears that test_race_condition_new_partitioner_steals_the_lock is sensitive to execution performance (using 0.1 second and 0.2 second time constants). Perhaps the Travis environment is slower now.

@jgh-ds
Copy link
Contributor Author

jgh-ds commented Nov 8, 2017

It should be good to go now.

Copy link
Member

@jeffwidman jeffwidman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jeffwidman jeffwidman merged commit 4d268ad into python-zk:master Nov 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IAsyncResult wait() has a race.
3 participants