README.md: +9 −6 (9 additions, 6 deletions)
@@ -6,7 +6,7 @@ This repository holds PyTorch bindings maintained by Intel® for the Intel® one

 [PyTorch](https://github.com/pytorch/pytorch) is an open-source machine learning framework.

-[Intel® oneCCL](https://github.com/oneapi-src/oneCCL) (collective communications library) is a library for efficient distributed deep learning training, implementing collectives like `allreduce`, `allgather`, `alltoall`. For more information on oneCCL, please refer to the [oneCCL documentation](https://spec.oneapi.com/versions/latest/elements/oneCCL/source/index.html).
+[Intel® oneCCL](https://github.com/oneapi-src/oneCCL) (collective communications library) is a library for efficient distributed deep learning training, implementing collectives like `allreduce`, `allgather`, `alltoall`. For more information on oneCCL, please refer to the [oneCCL documentation](https://oneapi-spec.uxlfoundation.org/specifications/oneapi/latest/elements/oneccl/source/).

 The `oneccl_bindings_for_pytorch` module implements the PyTorch C10D ProcessGroup API; it can be dynamically loaded as an external ProcessGroup and currently works only on Linux.
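For orientation, this is roughly what "dynamically loaded as an external ProcessGroup" means in user code. A minimal sketch, assuming the standard `env://` rendezvous with `MASTER_ADDR`, `MASTER_PORT`, `RANK`, and `WORLD_SIZE` provided by the launcher:

```python
import torch
import torch.distributed as dist
import oneccl_bindings_for_pytorch  # importing the module registers the "ccl" backend with C10D

# Assumes MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE are set by the launcher.
dist.init_process_group(backend="ccl", init_method="env://")

x = torch.ones(4)   # move to "xpu" for Intel GPU tensors, e.g. x.to("xpu")
dist.all_reduce(x)  # default op is SUM; runs through oneCCL on every rank
print(dist.get_rank(), x)

dist.destroy_process_group()
```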
@@ -23,7 +23,7 @@ The table below shows which functions are available for use with CPU / Intel dGP
 |`reduce`| √ | √ |
 |`all_gather`| √ | √ |
 |`gather`| √ | √ |
-|`scatter`| × | × |
+|`scatter`| √ | √ |
 |`reduce_scatter`| √ | √ |
 |`all_to_all`| √ | √ |
 |`barrier`| √ | √ |
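Since the table now marks `scatter` as supported, a sketch of exercising it through the `ccl` backend may help; this assumes a process group has already been initialized as in the snippet above and uses only the standard `torch.distributed` scatter API:

```python
import torch
import torch.distributed as dist
# Assumes dist.init_process_group(backend="ccl", ...) has already run on every rank.

rank = dist.get_rank()
world_size = dist.get_world_size()

out = torch.empty(4)
# Only the source rank supplies the list of per-rank input tensors.
scatter_list = [torch.full((4,), float(r)) for r in range(world_size)] if rank == 0 else None
dist.scatter(out, scatter_list, src=0)
print(f"rank {rank} received {out}")  # rank r should now hold a tensor filled with r
```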
@@ -36,6 +36,7 @@ We recommend using Anaconda as Python package management system. The followings
 | ONECCL_BINDINGS_FOR_PYTORCH_ENV_VERBOSE | 0 | Set verbose level in ONECCL_BINDINGS_FOR_PYTORCH |
+| ONECCL_BINDINGS_FOR_PYTORCH_ENV_VERBOSE | 0 | Set verbose level in oneccl_bindings_for_pytorch |
 | ONECCL_BINDINGS_FOR_PYTORCH_ENV_WAIT_GDB | 0 | Set 1 to force oneccl_bindings_for_pytorch to wait for GDB to attach |
 | TORCH_LLM_ALLREDUCE | 0 | Set 1 to enable this prototype feature, which improves scale-up performance by enabling optimized collective algorithms in oneCCL and asynchronous execution in torch-ccl. Requires XeLink for cross-card communication. |
 | CCL_BLOCKING_WAIT | 0 | Set 1 to enable this prototype feature, which controls whether collective execution on XPU is host-blocking or non-blocking. |
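These variables are read from the process environment, so exporting them in the launch shell is the usual route. As an illustrative sketch, they can also be set from Python, assuming they are assigned before the bindings are imported (the table does not say whether each one is read at import time or at the first collective, so shell export is the safer option):

```python
import os

# Opt in to the non-default behaviors described in the table above.
os.environ["ONECCL_BINDINGS_FOR_PYTORCH_ENV_VERBOSE"] = "1"  # verbose logging
os.environ["TORCH_LLM_ALLREDUCE"] = "1"  # prototype scale-up path (requires XeLink)
os.environ["CCL_BLOCKING_WAIT"] = "1"    # host-blocking collective execution on XPU

import torch
import torch.distributed as dist
import oneccl_bindings_for_pytorch  # imported after the environment is configured
```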
@@ -91,6 +92,7 @@ The following launch options are supported in Intel® oneCCL Bindings for PyTorc