@kylesayrs (Contributor) commented Sep 3, 2025

Purpose

  • Support loading models with online transforms applied via Compressed Tensors (LLM Compressor)
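
As a rough illustration of what an online transform involves, the sketch below assumes the transform is an orthogonal (normalized Hadamard) rotation, which is common in QuIP-style schemes. That assumption, the helper names, and the toy shapes are illustrative and are not taken from this PR or from the compressed-tensors API. Under that assumption, rotating the weights ahead of time and rotating the activations at runtime leaves the layer output unchanged.

```python
import math


def hadamard(n):
    """Normalized Sylvester Hadamard matrix (n a power of two), so H @ H.T == I."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1.0]]
    while len(h) < n:
        # Sylvester step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Toy linear layer: 2 outputs, 4 inputs (hypothetical values).
W = [[0.5, -1.0, 2.0, 0.25],
     [1.5, 0.0, -0.75, 3.0]]
x = [1.0, 2.0, -1.0, 0.5]

H = hadamard(4)
W_rot = matmul(W, transpose(H))  # done once, ahead of time (stored weights)
x_rot = matvec(H, x)             # done at runtime on activations ("online")

y_ref = matvec(W, x)
y_rot = matvec(W_rot, x_rot)     # equals W @ H.T @ H @ x == W @ x
```

Because the stored weights are already rotated, a loader would need to have the matching runtime transform in place for the checkpoint to produce correct outputs.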

Prerequisites

Changes

  • Require a minimum compressed-tensors version of 0.11.0 (needed for transform features)
  • Load transform configs, if available, and apply them to the model before weight loading
  • (misc) Refactor compressed tensors tests to check perplexity rather than exact output matches
  • (misc) Remove update_dtype to reduce complexity and give users more control over, and predictability of, model data types
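
The minimum-version requirement can be sketched as a small gate. This is a minimal stdlib-only sketch, not the check used in the PR: `release_tuple` and `meets_minimum` are hypothetical helpers, and pre-release suffixes such as `.dev3` are simply ignored.

```python
from __future__ import annotations

import re

MIN_COMPRESSED_TENSORS_VERSION = "0.11.0"  # minimum stated in this PR


def release_tuple(version: str) -> tuple[int, ...]:
    """Leading numeric release components, e.g. '0.11.0.dev3' -> (0, 11, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = MIN_COMPRESSED_TENSORS_VERSION) -> bool:
    """Hypothetical helper: compare release components numerically."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so '0.11' and '0.11.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

Comparing numeric components matters here: a plain string comparison would rank "0.9.4" above "0.11.0".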

Testing

  • Regression tested using CompressedTensorsTest; added an online QuIP-style transformed model for testing
    • Perplexity results match expectations
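
For context on why a perplexity check is more robust than exact output matching, here is a minimal sketch of the metric computed from per-token log-probabilities. The function names and the 5% relative tolerance are assumptions for illustration; they are not the values or helpers used in CompressedTensorsTest.

```python
import math


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood over the evaluated tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def within_tolerance(measured, expected, rel_tol=0.05):
    """Hypothetical pass criterion: measured perplexity within rel_tol of expected."""
    return abs(measured - expected) <= rel_tol * expected


# A model that assigns probability 0.25 to every token has perplexity 4.
uniform_ppl = perplexity([math.log(0.25)] * 4)
```

A tolerance-based comparison like this accepts small numerical differences between kernels or dtypes that would otherwise change the exact generated text.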

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@Rocketknight1 (Member) commented

cc @MekkCyber

@kylesayrs marked this pull request as draft September 4, 2025 16:06
@kylesayrs (Contributor, Author) commented

Putting this in draft for now; I need to do some more testing.

github-actions bot (Contributor) commented Oct 1, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: compressed_tensors_integration

@kylesayrs (Contributor, Author) commented

#42887

@kylesayrs closed this Dec 15, 2025