Closed
Description
Yes, it does. The limitation is in our attention kernel: it currently does not support some block sizes when FP32 is used. I will fix this in the future.
Originally posted by @WoosukKwon in #70 (comment)