HFQuantizer implementation for compressed-tensors library #31704

Merged: 42 commits, merged Sep 25, 2024

Changes shown from 1 commit (of 42 total)
d695ec3
Add compressed-tensors HFQuantizer implementation
Jun 5, 2024
f468964
flag serializable as False
Jun 5, 2024
41224d3
run
horheynm Jun 10, 2024
b61bfb9
revive lines deleted by ruff
horheynm Jun 10, 2024
ff8f1c5
fixes to load+save from sparseml, edit config to quantization_config,…
horheynm Jun 11, 2024
c1cb55d
address satrat comment
horheynm Jun 11, 2024
ef9d3f1
compressed_tensors to compressed-tensors and revert back is_serializable
horheynm Jun 12, 2024
117d050
rename quant_method from sparseml to compressed-tensors
horheynm Jun 12, 2024
1901c3e
tests
horheynm Jun 12, 2024
3ca270d
edit tests
horheynm Jun 13, 2024
9a14b09
clean up tests
Jun 28, 2024
ec59052
make style
Jun 28, 2024
520ded8
cleanup
Jun 28, 2024
7dec8fc
cleanup
Jun 28, 2024
afb550d
Merge branch 'main' into compressed-tensors-quantizer
bfineran Jul 25, 2024
d9b3660
add test skip for when compressed tensors is not installed
Jul 25, 2024
e51ac59
remove pydantic import + style
Jul 25, 2024
ccb5442
delay torch import in test
Jul 25, 2024
bfd9220
initial docs
Jul 30, 2024
71a80f9
update main init for compressed tensors config
Jul 30, 2024
547f9cc
make fix-copies
Jul 30, 2024
8acbc09
docstring
Jul 31, 2024
eaa5f20
remove fill_docstring
Jul 31, 2024
4ba75fb
Apply suggestions from code review
bfineran Aug 6, 2024
94ea0d3
review comments
Aug 6, 2024
c48840d
review comments
Aug 6, 2024
ab74d26
Merge branch 'main' into compressed-tensors-quantizer
bfineran Aug 19, 2024
2ecf711
comments - suppress warnings on state dict load, tests, fixes
Aug 20, 2024
e1ae504
bug-fix - remove unnecessary call to apply quant lifecycle
Aug 22, 2024
ea9e927
run_compressed compatibility
Aug 30, 2024
1c3ad5c
revert changes not needed for compression
Sep 3, 2024
aa1a4f9
no longer need unexpected keys fn
Sep 3, 2024
81a13dd
unexpected keys not needed either
Sep 3, 2024
f53d7b9
Apply suggestions from code review
Satrat Sep 9, 2024
d8f7073
add to_diff_dict
Sep 9, 2024
c4fbf70
update docs and expand testing
Sep 11, 2024
1992a88
Merge remote-tracking branch 'upstream/main' into compressed-tensors-…
Sep 17, 2024
298a638
Update _toctree.yml with compressed-tensors
Satrat Sep 18, 2024
3cb4415
Update src/transformers/utils/quantization_config.py
Satrat Sep 23, 2024
a943157
Merge branch 'main' into compressed-tensors-quantizer
dsikka Sep 24, 2024
64f475a
update doc
dsikka Sep 24, 2024
fabe8a3
add note about saving a loaded model
dsikka Sep 24, 2024
Merge branch 'main' into compressed-tensors-quantizer
bfineran authored Aug 19, 2024
commit ab74d26cfcdddadb4e6a8e1305b0d53084f43cff
docs/source/en/main_classes/quantization.md (4 additions, 0 deletions)

@@ -64,3 +64,7 @@ Learn how to quantize models in the [Quantization](../quantization) guide.
 ## CompressedTensorsConfig
 
 [[autodoc]] CompressedTensorsConfig
+
+## TorchAoConfig
+
+[[autodoc]] TorchAoConfig
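
For context, a minimal usage sketch of what this PR enables (not part of the diff above): loading a checkpoint quantized with the compressed-tensors library through the new HFQuantizer path. The model identifier below is a placeholder assumption; any checkpoint whose config.json carries a compressed-tensors quantization_config should follow the same flow.

```python
# Minimal sketch (assumption, not taken from this PR's diff): loading a
# compressed-tensors quantized checkpoint. The quantization_config embedded in
# the checkpoint's config.json is detected automatically, so no extra arguments
# are needed beyond the usual from_pretrained call.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/tinyllama-w8a8-compressed"  # placeholder model id (assumption)

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Quantized model test:", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0]))
```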