v1.2.0
This release improves support for low-precision data types.
- Support fp16 (IEEE half-precision) in all backends when support is available.
- Support bfloat16 in all backends when support is available.
- The NCCL/RCCL backend now supports averaging as a reduction operator (`avg`).
- Aluminum now requires at least CUDA 11 / ROCm 5 when GPU support is requested.