
[Misc] Update to comply with the new compressed-tensors config #5350

Merged 10 commits into vllm-project:main on Jun 10, 2024

Conversation

@dsikka (Contributor) commented on Jun 7, 2024

Summary

  • Update the compressed-tensors quantization method to comply with the new config file structure
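For context, the newer compressed-tensors layout nests per-layer quantization schemes under a `config_groups` section inside the model's `quantization_config`. A minimal sketch of reading such a block follows; the group name, scheme fields, and example values here are illustrative assumptions about the general shape, not the exact schema this PR targets:

```python
# Hedged sketch: field names below mimic the general shape of a
# compressed-tensors "quantization_config" block in a model's config.json.
# Exact keys in any given checkpoint may differ.

def get_quant_method(hf_config: dict) -> "str | None":
    """Return the quantization method declared in a HF model config, if any."""
    quant_cfg = hf_config.get("quantization_config")
    if quant_cfg is None:
        return None
    return quant_cfg.get("quant_method")

# Hypothetical config fragment in the nested "config_groups" layout:
example_config = {
    "quantization_config": {
        "quant_method": "compressed-tensors",
        "config_groups": {
            "group_0": {
                "targets": ["Linear"],
                "weights": {"num_bits": 8, "type": "int", "symmetric": True},
                "input_activations": {"num_bits": 8, "type": "int"},
            }
        },
    }
}

print(get_quant_method(example_config))  # compressed-tensors
```

A loader that previously read scheme fields from the top level of `quantization_config` would need to iterate `config_groups` instead, which is the kind of structural change the PR title describes.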

@dsikka dsikka changed the title update to comply with the new compressed-tensors config [Misc] Update to comply with the new compressed-tensors config Jun 7, 2024
@dsikka dsikka marked this pull request as ready for review June 7, 2024 20:52
Review thread on tests/quantization/test_compressed_tensors.py (outdated, resolved)
@mgoin mgoin enabled auto-merge (squash) June 8, 2024 13:06
auto-merge was automatically disabled June 8, 2024 16:07 (head branch was pushed to by a user without write access)

@mgoin mgoin enabled auto-merge (squash) June 8, 2024 16:10
@mgoin mgoin merged commit 5884c2b into vllm-project:main Jun 10, 2024
101 of 103 checks passed
dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request Jun 10, 2024
robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 11, 2024
tjohnson31415 added a commit to tjohnson31415/vllm that referenced this pull request Jun 11, 2024
* upstream/main: (126 commits)
  [Bugfix][Frontend] Cleanup "fix chat logprobs" (vllm-project#5026)
  [Bugfix] OpenAI entrypoint limits logprobs while ignoring server defined --max-logprobs (vllm-project#5312)
  [Misc] Various simplifications and typing fixes (vllm-project#5368)
  [ci] Fix Buildkite agent path (vllm-project#5392)
  [Doc] Add documentation for FP8 W8A8 (vllm-project#5388)
  Bump version to v0.5.0 (vllm-project#5384)
  [Docs] Alphabetically sort sponsors (vllm-project#5386)
  [Docs] Add Docs on Limitations of VLM Support (vllm-project#5383)
  [ci] Mount buildkite agent on Docker container to upload benchmark results (vllm-project#5330)
  [ci] Use small_cpu_queue for doc build (vllm-project#5331)
  [Bugfix] Fix LLaVA-NeXT (vllm-project#5380)
  [Feature][Frontend]:  Continued `stream_options` implementation also in CompletionRequest (vllm-project#5319)
  [Model] Initial support for LLaVA-NeXT (vllm-project#4199)
  [Misc] Improve error message when LoRA parsing fails (vllm-project#5194)
  [misc][typo] fix typo (vllm-project#5372)
  [Frontend][Misc] Enforce Pixel Values as Input Type for VLMs in API Server (vllm-project#5374)
  [Misc] Update to comply with the new `compressed-tensors` config (vllm-project#5350)
  [Bugfix] Fix KeyError: 1 When Using LoRA adapters (vllm-project#5164)
  [Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (vllm-project#5047)
  [mis][ci/test] fix flaky test in test_sharded_state_loader.py (vllm-project#5361)
  ...
joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024
…m-project#5350)

Co-authored-by: Michael Goin <michael@neuralmagic.com>
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024
xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024
Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024
Labels: none yet
Projects: none yet

3 participants