stage_1_and_2: optimize clip calculation to use clamp (deepspeedai#5632)
Use clamp instead of an "if", which causes host/device synchronization and introduces a bubble, whereas clamp happens on the device.
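
The difference can be illustrated with a minimal PyTorch sketch. This is not the actual DeepSpeed code: the function names, the argument names (`total_norm`, `loss_scale`, `clip_grad`), and the `1e-6` epsilon are assumptions chosen for illustration. The point is that a Python `if` on a tensor forces its value to be copied to the host, while `torch.clamp` keeps the whole computation on the device.

```python
import torch


def combined_scale_branch(total_norm, loss_scale, clip_grad):
    """Hypothetical 'before' variant: branches on a tensor value."""
    clip = (total_norm / loss_scale + 1e-6) / clip_grad
    # Evaluating a tensor in a Python `if` converts it to a bool on the
    # host, so the CPU must wait for the device to finish computing it.
    if clip > 1:
        return clip * loss_scale
    return torch.as_tensor(loss_scale, dtype=clip.dtype, device=clip.device)


def combined_scale_clamp(total_norm, loss_scale, clip_grad):
    """Hypothetical 'after' variant: no data-dependent Python branch."""
    clip = (total_norm / loss_scale + 1e-6) / clip_grad
    # clamp(min=1.0) maps every clip value <= 1 to 1, which matches the
    # "no clipping" case, and is executed on the tensor's own device.
    clip = torch.clamp(clip, min=1.0)
    return clip * loss_scale


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    for norm in (0.5, 8.0):
        total_norm = torch.tensor(norm, device=device)
        print(norm,
              combined_scale_branch(total_norm, 2.0, 1.0).item(),
              combined_scale_clamp(total_norm, 2.0, 1.0).item())
```

Both variants produce the same scale; the clamp version simply avoids stalling the host while the device computes the gradient norm.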