Add cachemask variant for fake_quantize_affine #500
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/500
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit d70f92c with merge base aef7e09: This comment was automatically generated by Dr. CI and updates every 15 minutes.
)

Args:
    input (torch.Tensor): original float32, float16 or bfloat16 Tensor
Can we dedup the comments with a cross-reference, as in https://github.com/pytorch/pytorch/blob/06ebf87a1eca6c345f5e3e39b63c2ef487695043/torch/ao/quantization/observer.py#L582? E.g. :func:`~torchao.quantization.quant_primitives.fake_quantize_affine`
4e4dd12 to ab7f401
General fake quantize op for quantization-aware training (QAT).
This is equivalent to calling `quantize_affine` + `dequantize_affine`
but without the dtype casts.

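To make the "quantize, then dequantize, without dtype casts" description concrete, here is a minimal scalar sketch in plain Python. The function name `fake_quantize_scalar` and the integer-domain mapping `q = round(x / scale) + zero_point` are assumptions for illustration only; this is not the torchao implementation, which operates on tensors with block-wise parameters.

```python
def fake_quantize_scalar(x, scale, zero_point, quant_min, quant_max):
    """Hypothetical scalar illustration of fake quantization."""
    # Quantize: map the float onto the integer grid (value stays a Python number,
    # mirroring "no dtype casts": nothing is converted to an int8-like type).
    q = round(x / scale) + zero_point
    # Clamp to the representable quantized range.
    q = max(quant_min, min(quant_max, q))
    # Dequantize: map back to the float domain.
    return (q - zero_point) * scale


print(fake_quantize_scalar(1.3, 0.5, 0, -8, 7))   # 1.5 (rounded onto the grid)
print(fake_quantize_scalar(10.0, 0.5, 0, -8, 7))  # 3.5 (clamped to quant_max)
```

The result is still a float, but it only takes values that an actual quantized tensor could represent.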
Please add an Args section and link to `fake_quantize_affine`.
outlier mask for intermediate quantized values
)

Please refer to :func:`~torchao.quantization.quant_primitives.fake_quantize_affine`
oh, I think we can move this to before Returns
LG, had a comment on updating the docstring a bit
Summary: In QAT, we often wish to filter out the gradients corresponding to values outside the expected quantization range, for example:

```
q = _quantize_affine_no_dtype_cast(...)
dq = _dequantize_affine_no_dtype_check(...)
mask = torch.logical_and((q >= quant_min), (q <= quant_max))
grad = grad * mask
```

The existing `fake_quantize_affine` returns the dequantized values only, so callers do not have access to this mask. This commit adds a variant of this op that returns both the dequantized values and the mask, similar to `fake_quantize_per_tensor_affine_cachemask` in core.

Test Plan:
python test/quantization/test_quant_primitives.py -k test_fake_quantize_affine_cachemask
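As a rough illustration of why the mask matters, the sketch below wraps the same idea in a `torch.autograd.Function`. The class name `FakeQuantizeWithMask`, the per-tensor scalar parameters, and the choice to compute the mask from the unclamped quantized values are all assumptions made for this example; it is not the code added in this PR.

```python
import torch


class FakeQuantizeWithMask(torch.autograd.Function):
    """Hypothetical per-tensor sketch; not the torchao implementation."""

    @staticmethod
    def forward(ctx, x, scale, zero_point, quant_min, quant_max):
        # Quantize without clamping so out-of-range values remain visible.
        q = torch.round(x / scale) + zero_point
        # True where the quantized value falls inside [quant_min, quant_max].
        mask = torch.logical_and(q >= quant_min, q <= quant_max)
        # Clamp, then dequantize; the result keeps the input's float dtype.
        dq = (torch.clamp(q, quant_min, quant_max) - zero_point) * scale
        ctx.save_for_backward(mask)
        return dq

    @staticmethod
    def backward(ctx, grad_output):
        (mask,) = ctx.saved_tensors
        # Straight-through estimator: zero the gradient for out-of-range values.
        # Non-tensor arguments get no gradient.
        return grad_output * mask, None, None, None, None


x = torch.tensor([0.3, 1.3, 10.0, -9.0], requires_grad=True)
dq = FakeQuantizeWithMask.apply(x, 0.5, 0, -8, 7)
dq.sum().backward()
print(dq)      # values snapped to the grid and clamped to the range
print(x.grad)  # 1 for in-range inputs, 0 for the two outliers
```

Caching the mask in the forward pass avoids recomputing the quantized values during the backward pass, which is the motivation for returning it alongside the dequantized output.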
ab7f401 to d70f92c