Fix the KeyError when loading bloom-based models #441

Merged (6 commits) on Jul 14, 2023

Conversation

Contributor

@HermitSun HermitSun commented Jul 12, 2023

As mentioned in issue #413, loading some BLOOM-based models raises a KeyError. These models ship an lm_head in their pretrained weights, which differs slightly from the original BLOOM models.
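To illustrate the failure mode, here is a minimal sketch of a weight loader that tolerates an optional `lm_head.weight` entry. It is not the code from this patch: the parameter names (`lm_head_weight`, the `transformer.` prefix), the helper names, and the fallback of tying the output head to the word embeddings are all assumptions made for the example, and plain lists stand in for tensors.

```python
from typing import Dict, List, Optional

Tensor = List[float]  # stand-in for a real tensor type (assumption)


def resolve_param_name(checkpoint_name: str) -> str:
    """Map a checkpoint key to a (hypothetical) model parameter key."""
    if checkpoint_name == "lm_head.weight":
        # Some BLOOM derivatives ship an explicit output head.
        # Route it to its own parameter instead of prefixing it.
        return "lm_head_weight"
    if checkpoint_name.startswith("transformer."):
        return checkpoint_name
    return "transformer." + checkpoint_name


def load_weights(
    params: Dict[str, Optional[Tensor]],
    checkpoint: Dict[str, Tensor],
) -> Dict[str, Optional[Tensor]]:
    """Copy checkpoint tensors into params, with or without an lm_head."""
    for name, tensor in checkpoint.items():
        target = resolve_param_name(name)
        if target not in params:
            raise KeyError(f"unexpected checkpoint key: {name}")
        params[target] = tensor
    if "lm_head.weight" not in checkpoint:
        # Assumed fallback: reuse the word embeddings as the output head.
        params["lm_head_weight"] = params["transformer.word_embeddings.weight"]
    return params
```

Without the special case in `resolve_param_name`, the key would be rewritten to `transformer.lm_head.weight`, which has no matching parameter, so the lookup would fail with a KeyError.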

With this patch applied, the following BLOOM-based models can be accelerated by vLLM (tested on an A100 with CUDA 11.8):

  1. TigerResearch/tigerbot-7b-sft
  2. sambanovasystems/BLOOMChat-176B-v1

I believe TigerResearch/tigerbot-180b-research can also be supported once some minor problems are resolved (issue #440).

Collaborator

@WoosukKwon WoosukKwon left a comment

Hi @HermitSun, thanks for submitting the PR! It looks good to me. I didn't consider those types of BLOOM models. TIL.

@WoosukKwon WoosukKwon merged commit dbed690 into vllm-project:main Jul 14, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Oct 31, 2024
Implementation of multi-step scheduling. To use the feature, pass
--num_scheduler_steps=[n] as a server parameter. In my tests, best
results were achieved with n==64, but this will vary depending on the
model.

---------

Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: jmaksymczuk <jmaksymczuk@habana.ai>