-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
[TPU] Enable gemma3-27b with TP>1 on multi-chips. #17335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TPU] Enable gemma3-27b with TP>1 on multi-chips. #17335
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
tests/v1/tpu/test_basic.py
Outdated
|
||
@pytest.mark.skipif(not current_platform.is_tpu(), | ||
reason="This is a basic test for TPU only") | ||
def test_gemma3_with_mm_on_multichip( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test name indicates there's MM, could you help point out where's it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gemma-3-27b-it, unlike gemma-3-1b-it, is a multimodal model (is_multimodal_model=True). Even if the input is text, "always use embeddings (rather than token ids) as input to the multimodal model, even when the input is text" per the comment. That's where the mm comes from.
I can see your concern. Let me make the test name clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
d4e2ce2
to
81fbdc9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Xiongfei for supporting such an important model!
81fbdc9
to
282517c
Compare
c638a28
to
7bce06f
Compare
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
5bdec58
to
22c0481
Compare
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
The failing CI seems to have the symptom of timeout. I don't see why my PR would cause that. |
I retried the failing tests, but I think we can merge ignoring those timeouts |
Thanks @mgoin . I also did some check on my a100 VM. For the 2 failing tests:
Could you help merge the PR? Thanks! |
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks cleaner, thanks!
Nice improvement and TPU V1 test is green! |
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com> Signed-off-by: minpeter <kali2005611@gmail.com>
This PR enables gemma3-27b with TP>1 on multi-chips. Without the change, it fails with an error:
Test plan: pytest -s -vv tests/v1/tpu/test_basic.py -k test_gemma3_27b_with_text_input_and_tp 2>&1 | tee ~/out.txt