【Inference PIR】add fused_rotary_position_embedding_pass #65265
Conversation
Your PR was submitted successfully. Thank you for your contribution to this open source project!
```diff
@@ -623,6 +623,7 @@ const std::vector<std::string> kPirGpuPasses{
     "transpose_flatten_concat_fuse_pass",
     "remove_redundant_transpose_pass",
     "transfer_layout_pass",
+    "fused_rotary_position_embedding_pass",
```
Put this pass in the slot right before embedding_eltwise_layernorm_fuse_pass.
OK, will do.
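For reference, a minimal sketch of what the registration would look like after the suggested reordering; the neighbors are elided, and the placement is taken from the review comment above rather than verified against the final commit:

```cpp
#include <string>
#include <vector>

// Hypothetical excerpt: the new pass is moved so it runs immediately
// before embedding_eltwise_layernorm_fuse_pass, per the review.
const std::vector<std::string> kPirGpuPasses{
    // ... earlier passes elided ...
    "fused_rotary_position_embedding_pass",
    "embedding_eltwise_layernorm_fuse_pass",
    // ... later passes elided ...
};
```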
```cpp
auto axis = match_ctx.Attr<std::vector<int64_t>>("full_13_value");
auto axis_2 = match_ctx.Attr<std::vector<int64_t>>("full_12_value");
return check_axes(axis) && check_axes(axis_2);
```
You return here, so the checks after this point will never be executed, right?
```cpp
return check_unsqueeze_axes(unsqueeze_axis) &&
       check_unsqueeze_axes(unsqueeze_axis_1) &&
       check_unsqueeze_axes(unsqueeze_axis_2) &&
       check_unsqueeze_axes(unsqueeze_axis_3);
```
Same issue here.
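Both comments point at the same pitfall, which the follow-up commits fix: an early `return true` ends the constraint function, so any checks written after it become dead code. A minimal self-contained sketch of the before/after shape (helper and variable names mirror the snippets above, but the surrounding code is illustrative, not the actual pass source):

```cpp
#include <cstdint>
#include <vector>

// Illustrative stand-in for the pass's axis validation helpers.
static bool check_axes(const std::vector<int64_t>& axes) {
  return axes.size() == 1;  // placeholder predicate
}

// Buggy shape: returning true after the first check means the second
// check is skipped whenever the first one passes, so the pattern can
// match graphs it should reject.
static bool ConstraintBuggy(const std::vector<int64_t>& axis,
                            const std::vector<int64_t>& axis_2) {
  if (check_axes(axis)) return true;
  return check_axes(axis_2);  // only reached when the first check fails
}

// Fixed shape, as in the snippets above: chain every check with &&
// so the constraint holds only when all of them pass.
static bool ConstraintFixed(const std::vector<int64_t>& axis,
                            const std::vector<int64_t>& axis_2) {
  return check_axes(axis) && check_axes(axis_2);
}
```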
…#65265) * fused * an early `return true` meant the later checks never ran * fix the checks that were unreachable after `return true` * remove redundant comments
PR Category
Inference
PR Types
New features
Description
Pcard-71500
Add fused_rotary_position_embedding_pass.
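For context, the standard rotary position embedding that the fused kernel computes (formula from the RoFormer paper, not from this PR): each even/odd channel pair of the query/key at token position $m$ is rotated by a position-dependent angle, and the pass replaces the unfolded subgraph of this computation (sin/cos, multiplies, unsqueezes, concat) with a single fused op:

$$
\begin{pmatrix} q'_{2i} \\ q'_{2i+1} \end{pmatrix}
=
\begin{pmatrix} \cos m\theta_i & -\sin m\theta_i \\ \sin m\theta_i & \cos m\theta_i \end{pmatrix}
\begin{pmatrix} q_{2i} \\ q_{2i+1} \end{pmatrix},
\qquad
\theta_i = 10000^{-2i/d}
$$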
Test results on an A30 GPU:
| Model | Batch size | fused_rotary_position_embedding_pass | Latency (ms) |
| --- | --- | --- | --- |
| Llama-7b | 1 | off | 32857.339 |
| Llama-7b | 1 | on | 31191.201 |
| Llama-7b | 4 | off | 52057.086 |
| Llama-7b | 4 | on | 50541.888 |
Under PIR, the llama-7b model gains 5.071% and 2.911% in average latency at batch_size=1 and batch_size=4 respectively; other models show no noticeable impact, and GPU memory usage is essentially unchanged.
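As a sanity check, the quoted percentages follow directly from the latencies in the table above:

$$
\frac{32857.339 - 31191.201}{32857.339} \approx 5.071\%,
\qquad
\frac{52057.086 - 50541.888}{52057.086} \approx 2.911\%
$$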