
Conversation

@pentschev
Member

Rewrite the UCXX/UCX-Py proxy backends added in #116 to deprecate UCX-Py. We have already removed the UCX-Py backend from Distributed in dask/distributed#9105, but until there is a release we cannot rely on that, so the changes in this PR behave the same way we expect once the migration is completed and Distributed has a new release. If distributed-ucxx is not installed, a warning and an exception are raised telling the user to install it.

The state after this PR is as follows:

  1. If distributed-ucxx is installed:
    1. protocol="ucx" / protocol="ucxx" should continue working as before; both were already using distributed-ucxx.
    2. protocol="ucx-old" (exists only in rapids-dask-dependency, added in #116 "Add UCXX proxy backend") will raise the same FutureWarning as protocol="ucx" does without RAPIDS.
  2. If distributed-ucxx is NOT installed:
    1. protocol="ucx" / protocol="ucxx" / protocol="ucx-old" all raise a FutureWarning and don't let users create processes (see the sketch after this list).
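
As a rough illustration, the dispatch amounts to something like the sketch below. All names here (resolve_ucx_backend, the message text) are illustrative and not the actual rapids-dask-dependency implementation:

```python
# Hedged sketch of the protocol dispatch described above; the real logic
# lives in rapids-dask-dependency's __rdd_patch_ucx.py and differs in detail.
import importlib.util
import warnings


def resolve_ucx_backend(protocol: str) -> None:
    """Validate that `protocol` can be served, per the rules above."""
    have_ucxx = importlib.util.find_spec("distributed_ucxx") is not None

    if protocol == "ucx-old":
        # The UCX-Py path is deprecated regardless of what is installed.
        warnings.warn(
            "The UCX-Py backend is deprecated; install distributed-ucxx "
            "and use protocol='ucx' or protocol='ucxx' instead.",
            FutureWarning,
        )
        raise RuntimeError("The UCX-Py backend is no longer supported.")

    if protocol in ("ucx", "ucxx") and not have_ucxx:
        # Warn *and* raise so the message is visible in all cases.
        warnings.warn(
            f"protocol={protocol!r} requires distributed-ucxx; "
            "please install it.",
            FutureWarning,
        )
        raise RuntimeError("distributed-ucxx is not installed.")
    # Otherwise distributed-ucxx is present and handles "ucx"/"ucxx" natively.
```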

The final state is expected to be achieved when a new Distributed release is cut and the __rdd_patch_ucx.py file is removed from rapids-dask-dependency, at which point the behavior will be:

  1. If distributed-ucxx is installed:
    1. protocol="ucx" (now transparently uses distributed-ucxx) / protocol="ucxx" (always used distributed-ucxx) should continue working as before
  2. If distributed-ucxx is NOT installed:
    1. protocol="ucx" raises a FutureWarning and doesn’t let users create processes
    2. protocol="ucxx" never existed before distributed-ucxx, so it will simply fail as an unknown protocol name

Specifically, I've tested the following to confirm that the behavior matches what is described above as the state of this PR (using all ucx/ucxx/ucx-old combinations; a runnable version of item 3 follows the list):

  1. dask scheduler --protocol=ucx
  2. dask worker ucx://127.0.0.1:8786
  3. LocalCluster(protocol="ucx", silence_logs=False)
  4. LocalCUDACluster(protocol="ucx", silence_logs=False)
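
For reference, item 3 can be reproduced with a short script like the one below (assuming distributed-ucxx is installed so the cluster actually starts; without it, the FutureWarning and exception described above are raised instead):

```python
# Start a LocalCluster over the "ucx" protocol (item 3 above).
from distributed import Client, LocalCluster

if __name__ == "__main__":
    cluster = LocalCluster(protocol="ucx", silence_logs=False)
    client = Client(cluster)
    print(cluster.scheduler_address)  # e.g. ucx://127.0.0.1:...
    client.close()
    cluster.close()
```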

@pentschev self-assigned this on Sep 4, 2025
@pentschev added the breaking (Introduces a breaking change) and feature request (New feature or request) labels on Sep 4, 2025
@pentschev mentioned this pull request on Sep 4, 2025
@pentschev marked this pull request as ready for review on Sep 4, 2025, 19:49
@pentschev requested a review from a team as a code owner on Sep 4, 2025, 19:49
@pentschev
Member Author

I think this should be good to go, and we need it before rapidsai/ucxx#504 can be merged.

Contributor

@TomAugspurger left a comment


a warning and exception are raised telling the user to install it.

I suspect we don't need a warning and an exception; just an exception with a clear error message would be sufficient. But since we're matching what was done in Distributed, it's not a big deal.

@pentschev
Member Author

pentschev commented Sep 4, 2025

I suspect we don't need a warning and an exception; just an exception with a clear error message would be sufficient. But since we're matching what was done in Distributed, it's not a big deal.

I wanted to be sure this is visible in all cases (with subprocesses, without subprocesses, with/without logging visible, etc.). It always confuses me which is more useful/readable in Dask, so having both seemed like a good idea; roughly the pattern sketched below.
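
A minimal sketch of that warn-then-raise pattern (the function name and message text are illustrative, not the PR's actual code):

```python
# Hedged sketch of the warn-then-raise pattern discussed here.
import warnings


def fail_missing_ucxx(protocol: str) -> None:
    msg = f"distributed-ucxx is required for protocol={protocol!r}; please install it."
    # Warn first so the message surfaces even when the exception is swallowed
    # or obscured by subprocess/logging machinery ...
    warnings.warn(msg, FutureWarning)
    # ... then raise so process creation actually stops.
    raise RuntimeError(msg)
```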

@pentschev
Member Author

I'll go ahead and merge this so I can still test the other PR today; hopefully this won't break anything. If anybody has further comments, I'd be happy to address them in a follow-up PR. Thanks Tom for the review!

@pentschev
Member Author

/merge

@rapids-bot (bot) merged commit 6713c6a into rapidsai:branch-25.10 on Sep 4, 2025
10 checks passed