Add support for MPT #334
Hi, great work! When do you think this PR will be merged?
@rmihaylov I have this and Bloom (#331) already merged if you want to give it a try:
@WoosukKwon awesome! Many thanks for this! While playing with it I stumbled upon strange behavior that might indicate an issue when beam search is used. When I request:
I get a more or less expected answer:
However, when I use beam_search:
I get:
I'm not sure, but it looks like the answers are corrupted or intermingled after a certain number of tokens (as if coming from different answers?). Interestingly enough, the problem manifests only with
@emsi Thanks for reporting it! Your beam search output looks very weird. We'll investigate it, but I believe that if it really is a bug, it is in our beam search logic, not in the MPT model. So we will investigate it in parallel with this PR.
LGTM! Left some small comments.
…n wheel uploading (vllm-project#334) 1. Added/updated publish docker workflow into nightly/release workflow. 2. Fixed minor bugs in wheel uploading to GCP due to one wheel changes. 3. Removed duplicate upload code. --------- Co-authored-by: dhuangnm <dhuang@MacBook-Pro-2.local>
fix vllm-project/vllm-ascend#321 This PR is a temporary solution for the long-sequence precision issue and will be reverted when the root cause is fixed. cc @rjg-lyh @wangxiyuan Co-authored-by: rjg-lyh <1318825571@qq.com> --------- Signed-off-by: MengqingCao <cmq0113@163.com> Co-authored-by: wangxiyuan <wangxiyuan@huawei.com>
Closes #218 and #332
Should be merged after #61