Commit f520c91
Update supported dtypes for fp8 (#1573)
1 parent eea4d25

File tree: 1 file changed, +3 −3 lines


torchao/quantization/README.md

@@ -156,7 +156,7 @@ from torchao.quantization import quantize_, float8_weight_only
 quantize_(model, float8_weight_only())
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Supports all dtypes for original weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 #### A8W8 Float8 Dynamic Quantization with Tensorwise Scaling
 
@@ -166,7 +166,7 @@ from torchao.quantization import quantize_, float8_dynamic_activation_float8_wei
 quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerTensor()))
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Supports all dtypes for original weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 ### A8W8 Float8 Dynamic Quantization with Rowwise Scaling
 
@@ -176,7 +176,7 @@ from torchao.quantization import quantize_, PerRow, float8_dynamic_activation_fl
 quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerRow()))
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Per-row scaling is only supported for bfloat16 weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 #### A16W6 Floating Point WeightOnly Quantization

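The hunks above distinguish tensorwise scaling (one scale for the whole tensor) from rowwise scaling (one scale per row). As a rough standalone sketch of that difference — assuming the common float8 e4m3 format, whose largest finite value is 448; the function names are illustrative, not torchao APIs:

```python
# Conceptual sketch of fp8 scale computation, NOT torchao's implementation.
# Scales map the tensor's (or each row's) max magnitude onto the fp8 range.
F8_E4M3_MAX = 448.0  # largest finite value in float8 e4m3

def tensorwise_scale(matrix):
    """One scale for the whole tensor: a single outlier shrinks every value."""
    amax = max(abs(v) for row in matrix for v in row)
    return amax / F8_E4M3_MAX

def rowwise_scales(matrix):
    """One scale per row: an outlier only affects the row it lives in."""
    return [max(abs(v) for v in row) / F8_E4M3_MAX for row in matrix]

weights = [
    [0.5, -1.0, 0.25],
    [100.0, 2.0, -3.0],  # row containing an outlier
]
print(tensorwise_scale(weights))  # 100.0 / 448.0 applied to every row
print(rowwise_scales(weights))    # [1.0 / 448.0, 100.0 / 448.0]
```

This illustrates why the rowwise path has the tighter dtype requirement in the diff: it keeps per-row scale vectors that must be fused into the matmul epilogue, a kernel path torchao supports only for bfloat16 inputs per the updated README text.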