[Perf] Explore more performant Fp8 Casting #559

Open
@vkuzo

from @drisspg

Summary

There are two components to this: non-saturated casting and saturated casting.

Non-Saturated casting

  • We currently use bit logic to cast from fp32 to fp8, whereas intrinsics exist that perform the same conversion; see Nikita's comment below.
  • Currently, for fp16 -> fp8 casting, we first convert fp16 to fp32 and then cast to fp8.
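
To make the saturated / non-saturated distinction concrete, here is a minimal pure-Python reference sketch of what such a cast computes. It is not the code discussed in this issue: the function name `round_to_e4m3` is hypothetical, it assumes the e4m3fn format (4 exponent bits, 3 mantissa bits, bias 7, max finite value 448, no infinities), and it returns the rounded value as a Python float rather than an 8-bit encoding. In non-saturated mode, overflow is assumed to produce NaN because e4m3fn has no infinity to overflow to.

```python
import math

E4M3_MAX = 448.0        # largest finite e4m3fn value (assumed format)
MIN_NORMAL_EXP = -6     # exponent of the smallest normal number, 2**-6
MANT_BITS = 3           # explicit mantissa bits


def round_to_e4m3(x: float, saturate: bool) -> float:
    """Round x to the nearest e4m3fn-representable value (ties to even).

    Hypothetical reference helper, not a library API.
    saturate=True clamps overflow to +/-448; saturate=False yields NaN.
    """
    if math.isnan(x):
        return math.nan
    sign = math.copysign(1.0, x)
    a = abs(x)
    if a == 0.0:
        return sign * 0.0
    if math.isinf(a):
        return sign * E4M3_MAX if saturate else math.nan

    # frexp gives a = m * 2**e with m in [0.5, 1), so floor(log2(a)) = e - 1.
    _, e = math.frexp(a)
    # Below the smallest normal, subnormals share the spacing of exponent -6.
    exp = max(e - 1, MIN_NORMAL_EXP)
    step = 2.0 ** (exp - MANT_BITS)   # spacing of representable values here

    # Python's round() rounds halves to the nearest even integer.
    q = round(a / step) * step
    if q > E4M3_MAX:
        return sign * E4M3_MAX if saturate else math.nan
    return sign * q
```

For example, `round_to_e4m3(1000.0, saturate=True)` returns `448.0`, while the same input with `saturate=False` returns NaN. Because every fp16 value is exactly representable in fp32, going through fp32 first does not change the rounded result; the cost of the two-step path is the extra conversion, not accuracy.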

Saturated Casting

copied from pytorch-labs/float8_experimental#83
