
Conversation

@kylesayrs
Contributor

@kylesayrs kylesayrs commented Dec 15, 2025

Purpose

  • Support loading models with online transforms applied via Compressed Tensors (LLM Compressor)
  • Fix tests which referenced deleted models

Background

Transforms are extra weights added to a model that improve accuracy recovery from quantization. These extra weights must be shared across modules in order to keep the model's memory requirements down.
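
As a rough illustration of what sharing means here (a minimal sketch, not the CompressedTensors implementation; the rotation parameter is hypothetical):

import torch

# One Parameter object referenced from several layers: the tensor is
# stored (and serialized) once, no matter how many modules point at it.
shared_rotation = torch.nn.Parameter(torch.eye(64))

layers = [torch.nn.Linear(64, 64) for _ in range(4)]
for layer in layers:
    # Each layer references the same tensor rather than owning a copy.
    layer.register_parameter("rotation", shared_rotation)

# Sharing is by object identity, not by value equality.
assert all(layer.rotation is shared_rotation for layer in layers)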

Prerequisites

Changes

  • Require a minimum compressed-tensors version of 0.13.1 (to support transform features)
  • Apply transforms to the model before weight loading
    • apply_transform_config implements _update_transforms_tied_weights, which leverages @Cyrilvallez's refactored tie_weights functionality!

Implemented in CT's apply_transform_config

def _update_transforms_tied_weights(model: torch.nn.Module):
    """
    This function updates the `_tied_weights_keys` and `all_tied_weights_keys`
    attributes of the given model with transform weights.

    This function is needed because transformers only knows which weights are shared
    via the `_tied_weights_keys` attributes. These attributes are used to tie
    weights after the model has loaded.

    CompressedTensors does not enforce that any particular weight is the source weight.
    We rely on the correctness of the following mapping in PreTrainedModel.tie_weights():
    ```
    B -> A
    C -> A
    D -> A

    Where any of A,B,C,D might be the loaded source weight
    ```
    This property is tested by `test_modeling_utils::BaseModelWithMultipleTiedWeights`
    """

Example _tied_weights_keys:

"model.layers.1.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
"model.layers.2.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
"model.layers.3.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
...
"model.layers.1.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",
"model.layers.2.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",
"model.layers.3.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",

Testing

  • Regression tested using CompressedTensorsTest; added an online QuIP-style transformed model for testing
    • Perplexity results match expectation

Suggested Reviewers

@SunMarc @Cyrilvallez @Rocketknight1

@Rocketknight1
Member

cc @MekkCyber for quantization

@kylesayrs
Contributor Author

make fix-copies does not fix the CI 🥲

@kylesayrs kylesayrs force-pushed the kylesayrs/transforms branch from a65d99e to d56f657 on December 16, 2025 at 15:46
Member

@SunMarc SunMarc left a comment


Thanks, overall LGTM! Just a few nits

@SunMarc SunMarc requested a review from MekkCyber December 17, 2025 13:23
Contributor

@MekkCyber MekkCyber left a comment


Thank you for the fix @kylesayrs

@kylesayrs kylesayrs force-pushed the kylesayrs/transforms branch from d56f657 to 77a0036 on December 17, 2025 at 17:58
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: compressed_tensors_integration

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=42887&sha=3a71ff

@kylesayrs
Contributor Author

@SunMarc @MekkCyber

Recent changes from #42882 ended up breaking this PR as implemented. This is because calling preprocess within the init_empty_weights context creates meta tensors which are no longer identical. I actually think that this is fine and cleaner behavior, but it does mean that this PR needs to be refactored.
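
For illustration, a minimal sketch of the identity issue (assuming accelerate's init_empty_weights; the modules here are placeholders, not the actual code path):

import torch
from accelerate import init_empty_weights

with init_empty_weights():
    a = torch.nn.Linear(4, 4)
    b = torch.nn.Linear(4, 4)
    # Tying by assignment: under init_empty_weights, register_parameter
    # is patched to re-create parameters on the meta device, so this
    # stores a new meta tensor instead of sharing b's parameter.
    a.weight = b.weight

# Object identity is lost, so ties can no longer be discovered via `is`
# checks and must be recorded explicitly (e.g. in _tied_weights_keys).
print(a.weight is b.weight)  # False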

I'm moving _update_transforms_tied_weights into apply_transform_config and refactoring the function to work around init_empty_weights (that's done here). Once CT releases, I'll reopen this PR so we can start supporting transforms in transformers :)

Thanks

@kylesayrs kylesayrs marked this pull request as draft December 17, 2025 20:38
@Cyrilvallez
Member

Thanks @kylesayrs! As discussed offline, it indeed won't matter which weight is actually present in the checkpoint when tying them! Leaving you with @SunMarc and @MekkCyber for the quantization part!

@SunMarc
Member

SunMarc commented Dec 18, 2025

> Recent changes from #42882 ended up breaking this PR as implemented. This is because calling preprocess within the init_empty_weights context creates meta tensors which are no longer identical. I actually think that this is fine and cleaner behavior, but it does mean that this PR needs to be refactored.

Oh indeed, since for most quant methods calling preprocess actually calls init_empty_weights again, I didn't predict that it would create this kind of issue. Let us know if you manage to fix it! If there is really no way around it and it's too breaking, we can revert this. But as you said, I think this behavior is probably better.
