[AMP] CUDA support for mixed precision pass

Resolve the issues and make the changes needed to support CUDA for the mixed precision pass in https://github.com/apache/tvm/pull/8069

Initial issues, as described by @Lunderberg:

> On the cuda side, it's failing a check that requires 16-bit floats to be used in pairs.
>> Check failed: lanes % 2 == 0 (1 vs. 0) : only support even lane for half type
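
To illustrate the kind of constraint the quoted check expresses, here is a minimal sketch in Python. It is not TVM's code: `packed_half_words` is a hypothetical helper, and the assumption that two 16-bit floats share one 32-bit word (and that a scalar is exempt from pairing) is an illustration of the "used in pairs" description above, not a statement of how the CUDA codegen is implemented.

```python
def packed_half_words(lanes: int) -> int:
    """Hypothetical helper: number of storage slots for `lanes` float16 values.

    Assumption: a vector of halves is stored two per 32-bit word, so a
    vector with an odd lane count cannot be represented.
    """
    if lanes < 1:
        raise ValueError("lanes must be positive")
    if lanes == 1:
        # Assumed scalar case: a single half needs no pairing.
        return 1
    if lanes % 2 != 0:
        # Mirrors the quoted message: lanes % 2 == 0 fails (1 vs. 0).
        raise ValueError(f"only support even lane for half type (lanes={lanes})")
    return lanes // 2


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        try:
            print(n, "->", packed_half_words(n))
        except ValueError as err:
            print(n, "-> error:", err)
```

Under these assumptions, a 3-lane half vector is rejected while 2-lane and 4-lane vectors are accepted, which matches the shape of the failure reported above.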

This issue is complete when the unit tests pass for the CUDA target.


[AMP] CUDA support for mixed precision pass #8294
