Closed
Description
Resolve the issues and make the modifications needed to support the CUDA target in the mixed precision pass (see #8069).
Initial issues, as described by @Lunderberg:
On the CUDA side, it fails a check that requires 16-bit floats to be used in pairs.
Check failed: lanes % 2 == 0 (1 vs. 0) : only support even lane for half type
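To make the message easier to read: the `(1 vs. 0)` part appears to report the value of `lanes % 2` against the expected `0`, which would mean a half-precision vector with an odd number of lanes reached the check. The snippet below is only a minimal sketch of that condition, not TVM's actual code; the function name `check_half_vector_lanes` is hypothetical and the error text is copied from the log above.

```python
def check_half_vector_lanes(lanes: int) -> None:
    """Hypothetical stand-in for the failing check (not TVM's implementation).

    Assumption: 16-bit floats are packed in pairs, so a vector of halves
    must have an even number of lanes.
    """
    remainder = lanes % 2
    if remainder != 0:
        # Mirrors the shape of the message in the log: "(actual vs. expected)".
        raise ValueError(
            f"Check failed: lanes % 2 == 0 ({remainder} vs. 0) : "
            "only support even lane for half type"
        )


if __name__ == "__main__":
    check_half_vector_lanes(4)  # even lane count: passes silently
    try:
        check_half_vector_lanes(3)  # odd lane count: raises
    except ValueError as err:
        print(err)
```

If this reading is right, a fix would need to ensure the pass never produces odd-width half vectors for CUDA, or that the code generator handles them some other way.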
This issue is complete when the unit tests pass for the CUDA target.