[LLM Inference] Support Qwen2_Moe Inference with MultiGPU #9121

CJ77Qi · 2024-09-11T02:59:31Z

PR types

New features

PR changes

Models

Description

Support multi-GPU inference for Qwen/Qwen2-57B-A14B.

paddle-bot · 2024-09-11T02:59:36Z

Thanks for your contribution!

codecov · 2024-09-11T03:35:42Z

Codecov Report

Attention: Patch coverage is 0% with 39 lines in your changes missing coverage. Please review.

Project coverage is 53.32%. Comparing base (51b54d2) to head (4b10081).
Report is 1 commit behind head on develop.

Files with missing lines	Patch %	Lines
...lp/experimental/transformers/qwen2_moe/modeling.py	0.00%	30 Missing ⚠️
...erimental/transformers/fused_transformer_layers.py	0.00%	9 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9121      +/-   ##
===========================================
- Coverage    53.33%   53.32%   -0.02%     
===========================================
  Files          652      652              
  Lines       105404   105436      +32     
===========================================
+ Hits         56222    56225       +3     
- Misses       49182    49211      +29

yuanlehome · 2024-09-12T05:04:17Z

paddlenlp/experimental/transformers/qwen2_moe/modeling.py

@@ -832,26 +866,34 @@ def get_tensor_parallel_split_mappings(num_layers):
                # Row Linear
                "embed_tokens.weight": partial(fn, is_column=False),
                "layers.0.self_attn.o_proj.weight": partial(fn, is_column=False),
-                "layers.0.mlp.down_proj.weight": partial(fn, is_column=False),
+                # "layers.0.mlp.down_proj.weight": partial(fn, is_column=False),


Please delete this commented-out line.
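For readers unfamiliar with the `is_column` flag in the mapping above, here is a minimal sketch of the difference between a column split and a row split of a 2-D weight. The helper `split_weight`, its `rank` and `world_size` parameters, and the toy 4x4 weight are hypothetical illustrations, not the `fn` that PaddleNLP actually passes to `partial`. The sketch assumes an `[in_features, out_features]` weight layout.

```python
from functools import partial


def split_weight(weight, rank, world_size, is_column=True):
    """Return the shard of a 2-D weight (a list of rows) owned by `rank`.

    Hypothetical helper for illustration only; not PaddleNLP's actual `fn`.
    """
    rows, cols = len(weight), len(weight[0])
    if is_column:
        # Column split: slice the last axis (out_features in this layout).
        assert cols % world_size == 0
        step = cols // world_size
        return [row[rank * step:(rank + 1) * step] for row in weight]
    # Row split: slice the first axis (in_features in this layout).
    assert rows % world_size == 0
    step = rows // world_size
    return weight[rank * step:(rank + 1) * step]


# Toy 4x4 weight whose entries encode their position (row * 4 + col).
weight = [[r * 4 + c for c in range(4)] for r in range(4)]

# Bind the device placement once, then pick the split axis per parameter,
# mirroring the partial(fn, is_column=...) pattern in the diff.
fn = partial(split_weight, rank=1, world_size=2)
mappings = {
    "layers.0.self_attn.o_proj.weight": partial(fn, is_column=False),
}

row_shard = mappings["layers.0.self_attn.o_proj.weight"](weight)
col_shard = fn(weight, is_column=True)
```

With two devices, rank 1 receives the bottom two rows under a row split and the right two columns under a column split.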

Support Qwen2-57B-A14B

c8eae64

Fix codestyle

d50c5f4

fix

0eb1eae

yuanlehome reviewed Sep 12, 2024

CJ77Qi added 2 commits September 12, 2024 05:09

merge

3629fca

fix review

4b10081

qingqing01 approved these changes Sep 12, 2024

qingqing01 merged commit d3302c5 into PaddlePaddle:develop Sep 12, 2024
5 of 10 checks passed
