andrewor14
Contributor

Summary: `Int4WeightOnlyConfig` supports version 1 (targeting tinygemm) and version 2 (targeting fbgemm). However, version 2 requires a new dependency (fbgemm_gpu_genai >= 1.2.0), which is problematic for torchao integrations with other frameworks. For now, we should continue to support the version 1 path for backward compatibility (BC).
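To illustrate the dependency concern, here is a minimal sketch of how an integrating framework might choose between the two versions depending on whether the optional dependency is installed. This is not torchao code: the helper names (`installed_version`, `parse_release`, `pick_int4_version`) are hypothetical, and using `fbgemm_gpu_genai` as the distribution name is an assumption taken from the requirement stated above.

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def parse_release(version):
    """Parse the leading numeric release of a version string into a 3-tuple.

    Pre-release and local suffixes (e.g. "rc1", "+cu126") are ignored, which
    is a simplification compared to full PEP 440 ordering.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = tuple(int(p) for p in match.group(0).split(".")) if match else ()
    # Pad short versions such as "1.2" so they compare equal to "1.2.0".
    return parts + (0,) * max(0, 3 - len(parts))


def pick_int4_version(dist_name="fbgemm_gpu_genai", minimum=(1, 2, 0)):
    """Hypothetical helper: use version 2 only if the dependency is new enough."""
    found = installed_version(dist_name)  # dist_name is an assumption
    if found is not None and parse_release(found) >= minimum:
        return 2
    return 1  # fall back to the version 1 (tinygemm) path
```

A caller could then pass the result as `Int4WeightOnlyConfig(version=pick_int4_version())`, so that environments without the extra dependency keep working on the version 1 path.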

Test Plan:

python test/quantization/test_qat.py -k test_infer_int4_weight_only_config


pytorch-bot bot commented Aug 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2888

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 80ccdbc with merge base 6f035e8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the "CLA Signed" label Aug 27, 2025
andrewor14 requested review from jerryzh168 and vkuzo August 27, 2025 14:20
andrewor14 added the "topic: improvement" label Aug 27, 2025
Contributor

vkuzo commented Aug 27, 2025

I think this PR should also test the int4 workflow with version 1, can we add that?

@andrewor14
Contributor Author

> I think this PR should also test the int4 workflow with version 1, can we add that?

added a test

        """
        self._test_quantize_api_against_ptq(
            Int4WeightOnlyConfig(version=version),
            target_prepare_sqnr=12,
Contributor

Does this mean the prepare numerics do not match the convert numerics very well? Will this be an issue?
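For context on what a threshold of 12 implies, here is a small sketch of signal-to-quantization-noise ratio (SQNR) in decibels, assuming the common definition 20·log10(‖x‖ / ‖x − y‖). Whether `_test_quantize_api_against_ptq` computes exactly this formula is an assumption; the function names below are hypothetical.

```python
import math


def sqnr_db(reference, approx):
    """SQNR in dB between a reference vector and its approximation.

    Assumes the definition 20 * log10(||reference|| / ||reference - approx||).
    """
    signal = math.sqrt(sum(r * r for r in reference))
    noise = math.sqrt(sum((r - a) ** 2 for r, a in zip(reference, approx)))
    if noise == 0.0:
        return math.inf  # identical outputs: no quantization noise
    if signal == 0.0:
        return -math.inf  # zero reference with nonzero error
    return 20.0 * math.log10(signal / noise)


def noise_ratio_at(sqnr):
    """Relative error norm ||x - y|| / ||x|| implied by a given SQNR in dB."""
    return 10.0 ** (-sqnr / 20.0)
```

Under this definition, an SQNR of 12 dB corresponds to an error norm of roughly 25% of the signal norm, while 20 dB corresponds to 10%, which is why a target of 12 suggests a fairly loose match between the two sets of outputs.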

@andrewor14 andrewor14 merged commit 6e9bf26 into main Aug 28, 2025
18 checks passed
vkuzo pushed a commit that referenced this pull request Aug 28, 2025