Skip to content

Conversation

@pentschev
Copy link
Member

The UCX-Py project is deprecated and going to be archived very soon. To continue using protocol="ucx" users will need to install the new distributed-ucxx package, that contains the communication plugin for UCXX. This PR removes the existing communicator implementation based on UCX-Py, registering a new backend using the distributed_ucxx module (when installed), otherwise raise a deprecation warning and exception back to the user when protocol="ucx" is used.

  • Tests added / passed
  • Passes pre-commit run --all-files

@pentschev pentschev requested a review from fjetter as a code owner September 3, 2025 21:21
@github-actions
Copy link
Contributor

github-actions bot commented Sep 3, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    27 files  ±    0      27 suites  ±0   11h 26m 34s ⏱️ + 22m 59s
 4 110 tests  -     6   4 001 ✅ +    1    104 💤  -  7  4 ❌ ±0  1 🔥 ±0 
51 491 runs  +1 142  49 319 ✅ +1 197  2 166 💤  - 56  5 ❌ +1  1 🔥 ±0 

For more details on these failures and errors, see this check.

Results for commit 52a17b3. ± Comparison against base commit 0f0adef.

This pull request removes 7 and adds 1 tests. Note that renamed tests count towards both.
distributed.comm.tests.test_comms ‑ test_ucx_client_server
distributed.comm.tests.test_ucx
distributed.comm.tests.test_ucx_config
distributed.tests.test_nanny ‑ test_nanny_closed_by_keyboard_interrupt[tcp]
distributed.tests.test_nanny ‑ test_nanny_closed_by_keyboard_interrupt[ucx]
distributed.tests.test_worker ‑ test_protocol_from_scheduler_address[Nanny]
distributed.tests.test_worker ‑ test_protocol_from_scheduler_address[Worker]
distributed.tests.test_nanny ‑ test_nanny_closed_by_keyboard_interrupt

♻️ This comment has been updated with latest results.

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pentschev. Looking forward to this landing.

@jacobtomlinson
Copy link
Member

Test failures appear unrelated. Merging

@jacobtomlinson jacobtomlinson merged commit 2d19d1c into dask:main Sep 4, 2025
27 of 34 checks passed
@pentschev
Copy link
Member Author

Thanks @jacobtomlinson ! 😄

rapids-bot bot pushed a commit to rapidsai/rapids-dask-dependency that referenced this pull request Sep 4, 2025
Rewrite the UCXX/UCX-Py proxy backends added in #116 to deprecate UCX-Py. We have already removed the UCX-Py backend from Distributed in dask/distributed#9105, but while there's no release we cannot rely on that, so the changes in the PR work in the same way we expect when the migration is completed and Distributed has a new release. If distributed-ucxx is not installed a warning and exception are raised telling the user to install it.

The state after this PR is as follows:

1. If distributed-ucxx is installed:
    1. `protocol="ucx"` / `protocol="ucxx"` should continue working as before, both were already using distributed-ucxx
    2. `protocol="ucx-old"` (only exists in rapids-dask-dependency added in #116) will raise the same `FutureWarning` as `protocol="ucx"` without RAPIDS
2. If distributed-ucxx is NOT installed:
    1. `protocol="ucx"` / `protocol="ucxx"` / `protocol="ucx-old"` all raise a `FutureWarning` and don’t let users create processes.

The final state is expected to be achieved when a new Distributed release is cut and the `__rdd_patch_ucx.py` file is removed from rapids-dask-dependency, which will be:

1. If distributed-ucxx is installed:
    1. `protocol="ucx"` (now transparently uses distributed-ucxx) / `protocol="ucxx"` (always used distributed-ucxx) should continue working as before
2. If distributed-ucxx is NOT installed:
    1. `protocol="ucx"` raises a `FutureWarning` and doesn’t let users create processes
    2. `protocol="ucxx"` never existed before distributed-ucxx so it will just fail with a wrong protocol name

Specifically I've tested the following to confirm what's behavior matches what is described above as the state of this PR (using all `ucx`/`ucxx`/`ucx-old` combinations):

1. `dask scheduler --protocol=ucx`
2. `dask worker ucx://127.0.0.1:8786`
3. `LocalCluster(protocol="ucx", silence_logs=False)`
4. `LocalCUDACluster(protocol="ucx", silence_logs=False)`

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Tom Augspurger (https://github.com/TomAugspurger)

URL: #120
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this pull request Sep 5, 2025
)

Now that Distributed deprecated `ucx://` in dask/distributed#9105, we should now ensure `distributed-ucxx` registers both backends. This is currently handled by rapids-dask-dependency, but it will ultimately be removed from there when everything is in place.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #504
rapids-bot bot pushed a commit to rapidsai/ucxx that referenced this pull request Sep 23, 2025
Some changes were needed to make rapids-dask-dependency patches enable distributed-ucxx as the handler for Distributed comms protocols. Once that is completely removed from rapids-dask-dependency and a new Distributed release containing dask/distributed#9105 is out, changes from this PR can be merged.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #505
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants