Closed
Description
Recently came across this repo that does fp8 inference: https://github.com/aredden/flux-fp8-api/blob/main/float8_quantize.py
It's getting popular enough that we should consider making fp8 inference a setting for autoquant.