Add module swap quantization API from Quanty #1886
Conversation
Review comment on the new test file (at `import copy`):

IMO these tests should be in `test/prototype/quantization/module_swap`
Review comment:

A short README.md for `prototype/module_swap_quantization` would be nice as well.
**Summary:** This commit adds a module-swap-based PTQ API from Quanty, including:

- Quantized linear and embedding modules
- `IntQuantizer` to specify how to quantize weights and activations
- `CodeBookQuantizer` as an alternative to `IntQuantizer`
- An implementation of k-means to be used for codebook quantization
- Range setting and data getter utilities

These new APIs will complement our existing `quantize_` API, which is primarily used for tensor-subclass-based quantization today (though it can also support module swaps). All APIs introduced in this commit are under prototype and highly subject to change. In particular, we plan to delete `quantize_module_swap` and `QuantizationRecipe`, and instead integrate this flow with the `quantize_` API by creating a new `AOBaseConfig`.

All code is migrated from Quanty and written by @TiRune.

**Test Plan:**

```
python test/quantization/module_swap/test_*
```
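To illustrate the general idea of module-swap quantization (not the actual torchao implementation — `FakeQuantLinear` and `swap_linear_modules` below are hypothetical names for a minimal sketch), a quantized module subclasses the original and is swapped into the model in place of each matching child:

```python
import torch
import torch.nn as nn

class FakeQuantLinear(nn.Linear):
    """Hypothetical linear that fake-quantizes its weight to int8 on the fly."""
    def forward(self, x):
        # Symmetric per-tensor scale mapping the max weight magnitude to 127
        scale = self.weight.abs().max() / 127.0
        w_q = torch.clamp((self.weight / scale).round(), -128, 127) * scale
        return nn.functional.linear(x, w_q, self.bias)

def swap_linear_modules(model: nn.Module) -> nn.Module:
    """Recursively replace every nn.Linear with FakeQuantLinear, copying weights."""
    for name, child in model.named_children():
        if isinstance(child, nn.Linear):
            new = FakeQuantLinear(child.in_features, child.out_features,
                                  bias=child.bias is not None)
            new.weight.data.copy_(child.weight.data)
            if child.bias is not None:
                new.bias.data.copy_(child.bias.data)
            setattr(model, name, new)
        else:
            swap_linear_modules(child)
    return model

model = swap_linear_modules(nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2)))
```

This contrasts with the tensor-subclass approach of `quantize_`, where the module class stays the same and only the weight tensor's type changes.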
Thanks, added the README. Merging this!
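The `CodeBookQuantizer` described in the summary relies on k-means clustering of weight values. A minimal, self-contained sketch of that idea (Lloyd's algorithm in 1-D; `codebook_quantize` is a hypothetical name, not the torchao API):

```python
import torch

def codebook_quantize(weight: torch.Tensor, k: int = 16, iters: int = 10):
    """Quantize a weight tensor to k codebook values via 1-D k-means."""
    flat = weight.flatten()
    # Initialize centroids from evenly spaced quantiles of the weight values
    codebook = torch.quantile(flat, torch.linspace(0, 1, k))
    for _ in range(iters):
        # Assignment step: map each weight to its nearest centroid
        idx = (flat.unsqueeze(1) - codebook.unsqueeze(0)).abs().argmin(dim=1)
        # Update step: move each centroid to the mean of its assigned weights
        for j in range(k):
            mask = idx == j
            if mask.any():
                codebook[j] = flat[mask].mean()
    idx = (flat.unsqueeze(1) - codebook.unsqueeze(0)).abs().argmin(dim=1)
    return codebook, idx.reshape(weight.shape)

w = torch.randn(64, 64)
codebook, idx = codebook_quantize(w, k=16)
w_q = codebook[idx]  # dequantized approximation of w
```

Storing a `k`-entry codebook plus per-weight indices is what makes this attractive at low bit widths: with `k=16`, each weight costs 4 bits of index storage.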