[Misc] Update to comply with the new `compressed-tensors` config #5350

dsikka · 2024-06-07T19:20:53Z

Summary

Update the `compressed-tensors` quantization method to comply with the new config file structure.
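As a rough illustration of what reading such a config can look like, here is a minimal sketch. The layout is an assumption rather than something stated in this PR: it supposes a Hugging Face style config dict with a `quantization_config` section containing a `quant_method` name and a `config_groups` mapping whose entries list `targets`. The helper names `detect_quant_method` and `list_targets` are hypothetical and are not vLLM APIs.

```python
from typing import Any, Dict, List, Optional


def detect_quant_method(hf_config: Dict[str, Any]) -> Optional[str]:
    """Return the quantization method named in the config, or None.

    Assumes (hypothetically) that the method lives under
    hf_config["quantization_config"]["quant_method"].
    """
    quant_cfg = hf_config.get("quantization_config") or {}
    return quant_cfg.get("quant_method")


def list_targets(hf_config: Dict[str, Any]) -> List[str]:
    """Collect the layer targets named across all config groups.

    Assumes (hypothetically) a "config_groups" mapping in which each
    group may carry a "targets" list.
    """
    quant_cfg = hf_config.get("quantization_config") or {}
    groups = quant_cfg.get("config_groups") or {}
    targets: List[str] = []
    for group in groups.values():
        targets.extend(group.get("targets", []))
    return targets


# Hypothetical example config, made up for illustration only.
example_config = {
    "quantization_config": {
        "quant_method": "compressed-tensors",
        "config_groups": {
            "group_0": {"targets": ["Linear"]},
        },
    }
}

method = detect_quant_method(example_config)
targets = list_targets(example_config)
```

Reading the method name from the checkpoint's own config is one way a loader could pick the quantization path without an explicit argument; whether vLLM does exactly this is not confirmed by the text here.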

tests/quantization/test_compressed_tensors.py

comment

update comment

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

* upstream/main: (126 commits)
  [Bugfix][Frontend] Cleanup "fix chat logprobs" (vllm-project#5026)
  [Bugfix] OpenAI entrypoint limits logprobs while ignoring server defined --max-logprobs (vllm-project#5312)
  [Misc] Various simplifications and typing fixes (vllm-project#5368)
  [ci] Fix Buildkite agent path (vllm-project#5392)
  [Doc] Add documentation for FP8 W8A8 (vllm-project#5388)
  Bump version to v0.5.0 (vllm-project#5384)
  [Docs] Alphabetically sort sponsors (vllm-project#5386)
  [Docs] Add Docs on Limitations of VLM Support (vllm-project#5383)
  [ci] Mount buildkite agent on Docker container to upload benchmark results (vllm-project#5330)
  [ci] Use small_cpu_queue for doc build (vllm-project#5331)
  [Bugfix] Fix LLaVA-NeXT (vllm-project#5380)
  [Feature][Frontend]: Continued `stream_options` implementation also in CompletionRequest (vllm-project#5319)
  [Model] Initial support for LLaVA-NeXT (vllm-project#4199)
  [Misc] Improve error message when LoRA parsing fails (vllm-project#5194)
  [misc][typo] fix typo (vllm-project#5372)
  [Frontend][Misc] Enforce Pixel Values as Input Type for VLMs in API Server (vllm-project#5374)
  [Misc] Update to comply with the new `compressed-tensors` config (vllm-project#5350)
  [Bugfix] Fix KeyError: 1 When Using LoRA adapters (vllm-project#5164)
  [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (vllm-project#5047)
  [mis][ci/test] fix flaky test in test_sharded_state_loader.py (vllm-project#5361)
  ...

dsikka added 2 commits June 7, 2024 20:44

update to comply with the new compressed-tensors config

fb1b863

update model names

aec7c82

dsikka force-pushed the update-ct branch from 64157cb to aec7c82 Compare June 7, 2024 20:46

dsikka changed the title ~~update to comply with the new compressed-tensors config~~ [Misc] Update to comply with the new compressed-tensors config Jun 7, 2024

dsikka marked this pull request as ready for review June 7, 2024 20:52

update commit

02a28a9

mgoin approved these changes Jun 7, 2024

tests/quantization/test_compressed_tensors.py (outdated, resolved)

remove quantization arg to auto-detect

dbaaa3f

dsikka requested a review from robertgshaw2-redhat June 8, 2024 02:51

mgoin enabled auto-merge (squash) June 8, 2024 13:06

mgoin and others added 3 commits June 8, 2024 10:09

Merge branch 'main' into update-ct

35abb01

Format

a85aa24

run with default settings

41e4ccc

auto-merge was automatically disabled June 8, 2024 16:07
Head branch was pushed to by a user without write access

mgoin enabled auto-merge (squash) June 8, 2024 16:10

robertgshaw2-redhat added 3 commits June 8, 2024 17:12

Merge branch 'vllm-project:main' into update-ct

9d651f8

Update weight_utils.py

b947ef6

comment

Update config.py

e6a61f3

update comment

mgoin merged commit 5884c2b into vllm-project:main Jun 10, 2024

dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request Jun 10, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

6b49414

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

robertgshaw2-redhat pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 11, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

8f865f6

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

fbcd007

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

9a06e44

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

b0bf9da

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[Misc] Update to comply with the new compressed-tensors config (vll…

29a366b

…m-project#5350) Co-authored-by: Michael Goin <michael@neuralmagic.com>
