Update supported dtypes for fp8 #1573

Merged: 1 commit, merged on Jan 17, 2025

6 changes: 3 additions & 3 deletions torchao/quantization/README.md

````diff
@@ -156,7 +156,7 @@ from torchao.quantization import quantize_, float8_weight_only
 quantize_(model, float8_weight_only())
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Supports all dtypes for original weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 #### A8W8 Float8 Dynamic Quantization with Tensorwise Scaling
 
````
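
For readers checking the widened guarantee, a minimal sketch of what the updated weight-only note permits. It assumes torchao is installed and a CUDA device with compute capability 8.9+ is available; the toy model below is illustrative, not from the PR:

```python
import torch
from torchao.quantization import quantize_, float8_weight_only

# Hypothetical toy model kept in float16: per the updated note, the
# original weight dtype is no longer restricted to bfloat16.
model = torch.nn.Sequential(torch.nn.Linear(64, 64)).to(torch.float16).cuda()
quantize_(model, float8_weight_only())

# Inference runs in the original dtype; only the weights are stored as fp8.
x = torch.randn(8, 64, dtype=torch.float16, device="cuda")
y = model(x)
```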

````diff
@@ -166,7 +166,7 @@ from torchao.quantization import quantize_, float8_dynamic_activation_float8_weight
 quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerTensor()))
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Supports all dtypes for original weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 ### A8W8 Float8 Dynamic Quantization with Rowwise Scaling
 
````
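
The tensorwise note carries the same clarification, so a non-bf16 model should also work with dynamic activation quantization. A sketch under the same assumptions as above (toy model; `PerTensor` imported alongside the other names, as the README does for `PerRow`):

```python
import torch
from torchao.quantization import (
    PerTensor,
    float8_dynamic_activation_float8_weight,
    quantize_,
)

# Toy float16 model; tensorwise scaling now accepts all original dtypes.
model = torch.nn.Sequential(torch.nn.Linear(64, 64)).to(torch.float16).cuda()
quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerTensor()))

# Activations are quantized to fp8 on the fly with one scale per tensor.
x = torch.randn(8, 64, dtype=torch.float16, device="cuda")
y = model(x)
```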

````diff
@@ -176,7 +176,7 @@ from torchao.quantization import quantize_, PerRow, float8_dynamic_activation_float8_weight
 quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerRow()))
 ```
 
-This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
+Per-row scaling is only supported for bfloat16 weight and activation. This API is only tested on H100. Hardware with CUDA compute capability 8.9 or greater is required.
 
 #### A16W6 Floating Point WeightOnly Quantization
 
````
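
Rowwise scaling is the exception: the new note keeps the bfloat16 restriction. A sketch that respects it, under the same assumptions as the examples above:

```python
import torch
from torchao.quantization import (
    PerRow,
    float8_dynamic_activation_float8_weight,
    quantize_,
)

# Per the updated note, rowwise scaling still requires bfloat16 for both
# weights and activations, so the toy model stays in bf16.
model = torch.nn.Sequential(torch.nn.Linear(64, 64)).to(torch.bfloat16).cuda()
quantize_(model, float8_dynamic_activation_float8_weight(granularity=PerRow()))

x = torch.randn(8, 64, dtype=torch.bfloat16, device="cuda")
y = model(x)
```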