@gjoseph92 (Owner)
Run many tests from dask/dask in distributed's CI against the new P2P shuffle. Currently this runs all of `test_shuffle`, `test_groupby`, and `test_multi`.

I put the shuffle tests in their own CI job by just extending the `ci1`/not-`ci1` matrix. It looks like it takes 9–12 min to run them when reusing the same cluster for every test. If I removed `scope="module"`, I imagine it would be much, much slower.

This isn't how we'd want to implement it for real; I'm just trying it to get a sense of runtimes. If we actually did it, we'd want to collect every test that uses the `shuffle_method` fixture (the current approach both misses some tests and runs some extraneous tests that don't use the fixture). The pytest-within-pytest incantations did not seem worth figuring out yet, since I'm not sure we'll want to use this approach at all.

So this indicates running the dask/dask tests in distributed would be possible, though maybe not a good idea.

cc @fjetter

Just testing this out. This also runs unnecessary tests (ones that don't even use the `shuffle_method` fixture).
If we were to really do this, we'd want some way to make pytest search for all tests in dask that use the fixture/have a mark, and import/run them dynamically. Possibly using `pytest.main`?
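One possible shape for this (a sketch only, not a proposal — the fixture stub and file names here are made up for illustration, and the real selection would run against dask's actual test tree): a `conftest.py` hook that deselects every collected test not requesting `shuffle_method`, driven from `pytest.main`. This demo builds a throwaway test directory so the selection behavior is visible end to end.

```python
import pathlib
import tempfile
import textwrap

import pytest  # assumes pytest is installed

# Hypothetical conftest.py: stub out the shuffle_method fixture and
# deselect any collected test that doesn't request it (directly or
# transitively, via item.fixturenames).
CONFTEST = textwrap.dedent(
    """
    import pytest

    @pytest.fixture
    def shuffle_method():
        return "p2p"

    def pytest_collection_modifyitems(config, items):
        selected = [i for i in items if "shuffle_method" in i.fixturenames]
        deselected = [i for i in items if i not in selected]
        if deselected:
            config.hook.pytest_deselected(items=deselected)
            items[:] = selected
    """
)

# Two toy tests: only the first requests the fixture, so only it should run.
TESTS = textwrap.dedent(
    """
    def test_uses_shuffle(shuffle_method):
        assert shuffle_method == "p2p"

    def test_unrelated():
        assert True
    """
)

tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "conftest.py").write_text(CONFTEST)
(tmp / "test_demo.py").write_text(TESTS)

# Run the filtered suite in-process; 0 (ExitCode.OK) means the
# selected tests all passed.
exit_code = pytest.main(["-q", str(tmp)])
```

For the real thing, the `pytest.main` invocation would point at the dask/dask test paths instead of a temp dir, and the fixture stub would come from dask's own conftest.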
Just to see how long it takes.