Commit 9c048eb

jerryzh168 authored and facebook-github-bot committed
Change choose_qparams_per_token to choose_qparams_per_token_asymmetric (#61)
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #61

This is needed for xnnpack, we can support other patterns later

Pull Request resolved: #61

Test Plan: CI, will be tested when xnnpack lowering is ready

Reviewed By: andrewor14

Differential Revision: D55031545

Pulled By: jerryzh168

fbshipit-source-id: 3908bf0e6e5638b611300b0a45cedabb3c0592b3

1 parent fbd88a7 commit 9c048eb

1 file changed, +2 -1 lines changed

torchao/quantization/quant_primitives.py

Lines changed: 2 additions & 1 deletion
@@ -1070,10 +1070,11 @@ def unpack_int4_to_int8(int8_data: torch.Tensor) -> torch.Tensor:
 
 def per_token_dynamic_quant(input: torch.Tensor) -> torch.Tensor:
     orig_dtype = input.dtype
+    # TODO: we may need to make the choose_qparams op configurable
     (
         scales,
         zero_points,
-    ) = torch.ops.quantized_decomposed.choose_qparams_per_token(input, torch.int8)
+    ) = torch.ops.quantized_decomposed.choose_qparams_per_token_asymmetric(input, torch.int8)
 
     # TODO: get these from torch.int8
     quant_min = -128
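
For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what "asymmetric per-token" parameter selection generally means: each row (token) gets its own scale and a zero point that may be nonzero, so the int8 range covers the row's actual [min, max] rather than a range centered on zero. The function names, the `1e-9` floor on the scale, and the rounding choices are illustrative assumptions, not the implementation of `torch.ops.quantized_decomposed.choose_qparams_per_token_asymmetric`, which may differ in details.

```python
from typing import List, Tuple


def choose_qparams_per_token_asymmetric_sketch(
    rows: List[List[float]], quant_min: int = -128, quant_max: int = 127
) -> Tuple[List[float], List[int]]:
    """Hypothetical illustration: one (scale, zero_point) pair per row/token."""
    scales: List[float] = []
    zero_points: List[int] = []
    for row in rows:
        # Widen the range to include 0.0 so that zero is exactly representable.
        lo = min(min(row), 0.0)
        hi = max(max(row), 0.0)
        # Assumed small floor to avoid a zero scale for all-zero rows.
        scale = max((hi - lo) / (quant_max - quant_min), 1e-9)
        # Zero point maps `lo` to `quant_min`; clamp it into the integer range.
        zero_point = round(quant_min - lo / scale)
        zero_point = max(quant_min, min(quant_max, zero_point))
        scales.append(scale)
        zero_points.append(zero_point)
    return scales, zero_points


def quantize_row(
    row: List[float], scale: float, zero_point: int,
    quant_min: int = -128, quant_max: int = 127,
) -> List[int]:
    """Affine quantization: q = clamp(round(x / scale) + zero_point)."""
    return [max(quant_min, min(quant_max, round(x / scale) + zero_point)) for x in row]


def dequantize_row(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Inverse mapping: x is approximately (q - zero_point) * scale."""
    return [(v - zero_point) * scale for v in q]


if __name__ == "__main__":
    tokens = [[0.0, 1.0, 2.0], [-1.0, 0.0, 3.0]]
    scales, zero_points = choose_qparams_per_token_asymmetric_sketch(tokens)
    for row, s, zp in zip(tokens, scales, zero_points):
        q = quantize_row(row, s, zp)
        print(q, dequantize_row(q, s, zp))
```

A symmetric scheme would instead fix the zero point at 0 and size the scale from the largest absolute value in the row, which wastes part of the integer range when a row is mostly one-signed.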
