Open
Opened on May 30, 2024
Describe the issue
When I run gemm_float8 with input A as fp8 e5m2 and input B as fp8 e4m3, it fails to run. With both input A and input B as fp8 e4m3, it runs correctly.
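For context, the two input types named above use different bit layouts, so a kernel cannot treat an e5m2 operand as if it were e4m3. The sketch below is only an illustration of that difference in plain Python. It assumes the common definitions of the formats (E4M3FN: 4 exponent bits, 3 mantissa bits, bias 7, no infinity; E5M2: 5 exponent bits, 2 mantissa bits, bias 15, IEEE-style inf/NaN). The helper names `decode_fp8`, `decode_e4m3fn`, and `decode_e5m2` are hypothetical and are not part of ONNX Runtime.

```python
import math


def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one FP8 byte into a Python float (illustrative helper)."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    max_exp = (1 << exp_bits) - 1
    max_man = (1 << man_bits) - 1

    if ieee_specials and exp == max_exp:
        # E5M2 convention: all-ones exponent encodes inf (mantissa 0) or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == max_exp and man == max_man:
        # E4M3FN convention: no infinity, only S.1111.111 is NaN.
        return math.nan
    if exp == 0:
        # Subnormal values: no implicit leading 1.
        return sign * (man / (1 << man_bits)) * 2.0 ** (1 - bias)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte):
    # Assumed layout: 1 sign, 4 exponent, 3 mantissa bits, bias 7.
    return decode_fp8(byte, 4, 3, 7, ieee_specials=False)


def decode_e5m2(byte):
    # Assumed layout: 1 sign, 5 exponent, 2 mantissa bits, bias 15.
    return decode_fp8(byte, 5, 2, 15, ieee_specials=True)


# The same raw byte means different values in the two formats.
raw = 0x38
print(decode_e4m3fn(raw), decode_e5m2(raw))
```

Under these assumptions the byte 0x38 decodes to 1.0 as e4m3 and to 0.5 as e5m2, which is why a mixed A/B combination needs explicit support for each operand's type.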
To reproduce
Run gemm_float8 with input A as fp8 e5m2 and input B as fp8 e4m3.
Urgency
No response
Platform
Linux
OS Version
CentOS 7.6
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.17.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.1