
Remove gen.coroutine usage in scheduler #3242

Merged: 2 commits merged into dask:master on Nov 15, 2019

Conversation

@jcrist (Member) commented Nov 15, 2019

Use `async`/`await` and `asyncio` idioms throughout.
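
A rough sketch of the kind of conversion this involves (the helper below and its `rpc.gather` call are illustrative placeholders, not actual scheduler code):

```python
from tornado import gen


# Before: a tornado-style coroutine
@gen.coroutine
def fetch_old(rpc, keys):
    result = yield rpc.gather(keys=keys)
    raise gen.Return({"status": "OK", "data": result})


# After: the same logic as a native coroutine
async def fetch_new(rpc, keys):
    result = await rpc.gather(keys=keys)
    return {"status": "OK", "data": result}
```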
```python
raise Return(
    {
        "status": "missing-data",
        "keys": sum([r["keys"] for r in result if "keys" in r], []),
```
Member:

Was this wrong before? I would have expected this to iterate over `result.values()`.

Member Author (@jcrist):

Yeah, this was wrong before.
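
A hedged illustration of the difference, assuming `result` is a dict mapping worker addresses to per-worker response dicts (an assumption; the surrounding code isn't shown in this thread):

```python
result = {
    "tcp://10.0.0.1:8786": {"status": "missing-data", "keys": ["x", "y"]},
    "tcp://10.0.0.2:8786": {"status": "OK"},
}

# Iterating the dict directly yields its keys (the worker addresses), so the
# membership test runs against strings and the sum silently comes out empty here.
sum([r["keys"] for r in result if "keys" in r], [])           # []

# Iterating the values gives the per-worker responses, which is what was intended.
sum([r["keys"] for r in result.values() if "keys" in r], [])  # ['x', 'y']
```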

Member:

It's really good to have a second pair of eyes here.

@mrocklin (Member):

In principle everything here looks good to me. Thanks for doing this @jcrist

I'm a little curious why I didn't change some of these over in the past. I wonder if there was some odd thing going on before.

@jcrist (Member Author) commented Nov 15, 2019

Looks like not all tests pass, so I'll have to dive deeper. I only ran a few checks locally since the test suite takes a while. In general, what's your process for testing locally? Many things seem related, so it's hard to know which tests are likely to fail.

@mrocklin (Member):

I usually run one of the files that I think is relevant, probably `test_core.py`, `test_scheduler.py`, or `test_client.py`. I find some simple failing test in that group, hopefully one that is async, and then run that one repeatedly while I try to find the issue.

As context, before we got very stringent about checking for thread/process/fd leaks before and after every test (the `cleanup` fixture), each of these files used to take around 20 seconds to run. We might look into profiling our testing harness at some point.
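
A rough sketch of that workflow using pytest's Python entry point (the test name below is a placeholder, not a specific test from this PR):

```python
import pytest

# Run one likely-relevant test file and stop at the first failure.
pytest.main(["distributed/tests/test_scheduler.py", "-x"])

# Then iterate on the single failing test (ideally an async one) while debugging;
# in practice this re-run usually happens from the command line between edits.
pytest.main(["distributed/tests/test_scheduler.py::test_some_failing_case"])
```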

@mrocklin (Member):

  File "/home/travis/build/dask/distributed/distributed/scheduler.py", line 2012, in cancel_key
    lambda _: self.cancel_key(key, client, retries - 1),
  File "/home/travis/miniconda/envs/test-environment/lib/python3.7/site-packages/tornado/ioloop.py", line 693, in add_future
    assert is_future(future)

It looks like this might be a relatively routine case of Tornado/Asyncio mismatch. This is probably why I left that one there. We could also try replacing loop.add_callback/loop.add_future with asyncio.ensure_future in places.
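
A minimal sketch of that mismatch and the suggested workaround, using a placeholder `do_work` coroutine: `IOLoop.add_future` expects a Future, so handing it a bare coroutine object trips `assert is_future(future)`, while wrapping the coroutine with `asyncio.ensure_future` first yields a Task (a Future subclass) that it accepts.

```python
import asyncio

from tornado.ioloop import IOLoop


async def do_work():
    await asyncio.sleep(0.2)
    return "done"


async def main():
    loop = IOLoop.current()

    # loop.add_future(do_work(), lambda fut: print(fut.result()))
    # ^ fails: a coroutine object is not a Future, so `assert is_future(future)` trips.

    task = asyncio.ensure_future(do_work())  # wrap the coroutine in a Task first
    loop.add_future(task, lambda fut: print(fut.result()))

    await task
    await asyncio.sleep(0.1)  # give the scheduled callback a chance to run


if __name__ == "__main__":
    asyncio.run(main())
```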

@jcrist (Member Author) commented Nov 15, 2019

Yeah, I'm going to keep pushing on this until we can remove this aspect of tornado from the codebase. Just need to go deeper.

```diff
-self.loop.add_future(
-    gen.sleep(0.2), lambda _: self.cancel_key(key, client, retries - 1)
-)
+self.loop.call_later(
+    0.2, lambda: self.cancel_key(key, client, retries - 1)
+)
```
Member Author (@jcrist):

Fixed. There's no reason to manually call `gen.sleep` here; using `IOLoop.call_later` works just as well.

@jcrist (Member Author) commented Nov 15, 2019

All tests pass. Good to merge?

@mrocklin merged commit 1a46657 into dask:master on Nov 15, 2019
@mrocklin (Member):

Woot. Thanks @jcrist!
