[Model] Add PLaMo2 #14323
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com> Co-Authored-By: Kento Nozawa <nzw0301@preferred.jp> Co-Authored-By: Hiroaki Mikami <mhiroaki@preferred.jp>
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com> Co-authored-by: Calvin Metzger <metzger@preferred.jp>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
514cff5 to 68d3bed
This is my first time submitting a PR to this repository, and I just joined the Slack. I believe I do not have permission to unblock additional CIs, so I would appreciate it if you could add me to the Buildkite org. If I have missed anything or if there are areas for improvement, please do let me know.
Signed-off-by: shemmi <shemmi@preferred.jp>
@tlrmchlsmth can you do a quick pass to check that the implementation of this model fits our architecture? No need to test for correctness since they are the model vendor.
@DarkLight1337 Thanks for your review. I've updated the tests.
Sorry for forgetting about this, should be good to go now!
Head branch was pushed to by a user without write access
Thanks for your approval!
It seems that unrelated parts of the tests are still failing. I'll wait for the main branch to be fixed.
Can you update models tests in vllm/.buildkite/test-pipeline.yaml Line 396 in 70e7ed8
Sure. It passed in my environment, so I thought it was not relevant either. Let me first check whether the other tests pass with the latest main branch.
@DarkLight1337
The problem doesn't appear in V1, so let's just merge this.
Thanks for the approval! Next, I will submit a follow-up PR to add support for TP/PP, etc.
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com> Signed-off-by: shemmi <shemmi@preferred.jp> Co-authored-by: Kento Nozawa <nzw0301@preferred.jp> Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp> Co-authored-by: Calvin Metzger <metzger@preferred.jp>
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com> Signed-off-by: shemmi <shemmi@preferred.jp> Co-authored-by: Kento Nozawa <nzw0301@preferred.jp> Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp> Co-authored-by: Calvin Metzger <metzger@preferred.jp> Signed-off-by: Yang Wang <elainewy@meta.com>
Signed-off-by: Shinichi Hemmi <50256998+Alnusjaponica@users.noreply.github.com> Signed-off-by: shemmi <shemmi@preferred.jp> Co-authored-by: Kento Nozawa <nzw0301@preferred.jp> Co-authored-by: Hiroaki Mikami <mhiroaki@preferred.jp> Co-authored-by: Calvin Metzger <metzger@preferred.jp> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
This PR adds support for PLaMo2, specifically PLaMo2 1B and PLaMo2 8B. PLaMo2 is a hybrid Mamba2 architecture featuring sliding window attention.
In this PR, we have implemented the inference architecture for the PLaMo2 model.
Please note that some functionality, such as pipeline and tensor parallelism (PP/TP), quantization, and LoRA, is not yet supported in this PR. We plan to address these features in follow-up PRs.
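
For readers unfamiliar with the sliding window attention mentioned in the description, here is a minimal sketch of a causal sliding-window mask in plain Python. It is a generic illustration, not the code added in this PR: the function name `sliding_window_causal_mask` and the `seq_len` and `window` parameters are hypothetical, and the window size PLaMo2 actually uses is not stated here.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key position j.

    Hypothetical helper for illustration only. A position attends to itself
    and to at most `window - 1` positions immediately before it.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        # Causal (j <= i) and within the window (i - j < window).
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    # Print one row per query position: "x" = attended, "." = masked out.
    print("".join("x" if allowed else "." for allowed in row))
```

Each row has at most `window` attended positions, so the per-token attention cost stays bounded as the sequence grows, which is the usual motivation for pairing this kind of attention with state-space layers such as Mamba2.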