Support Pangu Pro MoE model #1204

Merged: 9 commits from pangu into vllm-project:main on Jun 20, 2025

Conversation

Angazenn (Contributor) commented Jun 13, 2025

What this PR does / why we need it?

Support Pangu Pro MoE model (https://arxiv.org/abs/2505.21411)

Does this PR introduce any user-facing change?

Yes, a new model is supported.

How was this patch tested?

Tested locally.

@Yikun Yikun changed the title from "[draft]support pangu" to "[draft]support new moe model" Jun 13, 2025
@shen-shanshan shen-shanshan self-assigned this Jun 13, 2025
shen-shanshan (Collaborator) commented Jun 13, 2025

@Angazenn Thanks for your contribution, please also update the model support doc: https://github.com/vllm-project/vllm-ascend/blob/main/docs/source/user_guide/supported_models.md.
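Such an update typically just adds one row for the new model; a hypothetical entry, assuming the doc keeps a table of model names and support status (the exact columns should follow whatever supported_models.md already uses):

| Pangu Pro MoE | ✅ |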

@Angazenn Angazenn force-pushed the pangu branch 2 times, most recently from d28cd5f to 2ad8d0b on June 20, 2025 at 10:09

shen-shanshan (Collaborator) commented:
@Angazenn Please modify the model name:

PanGuMoEModel -> PanguProMoEModel
PanGuMoEForCausalLM -> PanguProMoEForCausalLM

The class names must match the open-source config, which uses these names.
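For context, vLLM selects the model implementation by matching the architectures field in the checkpoint's config.json against registered names, so the class names have to match the open-source config exactly. A minimal sketch of what such an out-of-tree registration typically looks like (the module path below is hypothetical, not the actual layout of this PR):

# Hypothetical sketch: registering an out-of-tree model class with vLLM.
# The checkpoint's config.json must list the same architecture name:
#   "architectures": ["PanguProMoEForCausalLM"]
from vllm import ModelRegistry

def register_model():
    # Illustrative import path; the real PR defines its own module layout.
    from vllm_ascend.models.pangu_moe import PanguProMoEForCausalLM
    ModelRegistry.register_model("PanguProMoEForCausalLM",
                                 PanguProMoEForCausalLM)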

@Angazenn Angazenn force-pushed the pangu branch 2 times, most recently from ac299f2 to 99df622 on June 20, 2025 at 11:37
@Yikun Yikun mentioned this pull request Jun 20, 2025
@Angazenn Angazenn mentioned this pull request Jun 20, 2025
@Yikun Yikun changed the title from "[draft]support new moe model" to "Support Pangu Pro MoE model" Jun 20, 2025
Yikun (Collaborator) commented Jun 20, 2025

# Launch an OpenAI-compatible server for the local checkpoint (V1 engine,
# tensor parallelism across 4 devices, eager mode, remote code trusted).
VLLM_USE_V1=1 vllm serve /root/.cache/pangu-pro-moe-model \
    --tensor-parallel-size 4 \
    --swap-space 16 \
    --disable-log-stats --disable-log-requests \
    --trust-remote-code --enforce-eager

# Send a short greedy (temperature 0) completion request to the server.
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "/root/.cache/pangu-pro-moe-model",
        "prompt": "The future of AI is",
        "max_tokens": 128,
        "temperature": 0
    }'

I ran an E2E test in my local env, and it works as expected.
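The same check can also be scripted with the OpenAI Python client (a minimal sketch, assuming the server was started with the command above; vllm serve does not require an API key by default, so "EMPTY" is just a placeholder):

# Minimal sketch: query the OpenAI-compatible server started above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
completion = client.completions.create(
    model="/root/.cache/pangu-pro-moe-model",
    prompt="The future of AI is",
    max_tokens=128,
    temperature=0,
)
print(completion.choices[0].text)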

cc @ganyi1996ppo @wangxiyuan

@Yikun Yikun merged commit 2f1266d into vllm-project:main Jun 20, 2025
20 checks passed
3 participants