Optimizing the performance of fused_layer_norm and top_p_sampling operators #65711

yuanlehome · 2024-07-04T08:26:37Z

PR Category

Inference

PR Types

Performance

Description

pcard-71500
Optimizing the performance of fused_layer_norm and top_p_sampling operators.

paddle-bot · 2024-07-04T08:26:41Z

Your PR has been submitted. Thanks for your contribution to this open-source project!
Please wait for the results of the automated CI tests first. See the Paddle CI Manual for details.

YanhuiDua · 2024-07-04T12:47:22Z

paddle/phi/kernels/gpu/top_p_sampling_kernel.cu

-      float random_ratio = exponential_transform(curand_uniform(&state), 1.0f);
+      float random_ratio =
+          exponential_transform(curand_uniform(states + bid), 1.0f);
 #endif


This call could also be wrapped with GPU(rand_uniform).
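
The wrapping suggested in this comment relies on preprocessor token pasting. Below is a minimal host-side sketch, assuming a `GPU(str)` macro that prepends `hip` or `cu` to its argument depending on `PADDLE_WITH_HIP`. `FakeState`, `draw_uniform`, and the two `*rand_uniform` functions are hypothetical stand-ins so the snippet compiles without a GPU toolchain; they are not the real cuRAND/hipRAND device functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the per-batch random state type.
struct FakeState {
  std::string backend;  // records which stand-in function was called
};

// Host-side stand-ins named like the device functions, so the macro
// expansion can be observed without CUDA or HIP headers.
inline float curand_uniform(FakeState* s) { s->backend = "cuda"; return 0.5f; }
inline float hiprand_uniform(FakeState* s) { s->backend = "hip"; return 0.5f; }

// Assumed shape of the macro: paste a backend prefix onto the name.
#ifdef PADDLE_WITH_HIP
#define GPU(str) hip##str
#else
#define GPU(str) cu##str
#endif

// One call site serves both backends; no #ifdef is needed here.
inline float draw_uniform(FakeState* states, int bid) {
  assert(bid >= 0);
  return GPU(rand_uniform)(states + bid);
}
```

With a macro of this shape, `GPU(rand_uniform)` expands to `curand_uniform` or `hiprand_uniform`, so the `#ifdef PADDLE_WITH_HIP` branch around the call could be dropped.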

YanhuiDua · 2024-07-04T12:48:15Z

paddle/phi/kernels/gpu/top_p_sampling_kernel.cu

+  for (int i = idx; i < bs; i += gridDim.x * blockDim.x) {
+    if (need_batch_random) {
+#ifdef PADDLE_WITH_HIP
+      hiprand_init(seed, i, offset, &state[i]);


This function could also use the macro: GPU(rand_init)
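
The loop in this hunk appears to be a grid-stride loop: each thread starts at its own index and advances by the total number of launched threads, so all `bs` states are initialized even when `bs` exceeds the launch size. Below is a host-side sketch of that index pattern, assuming `idx` is the global thread index (the hunk does not show its definition). `visits_per_batch`, `grid_dim`, and `block_dim` are hypothetical names, and no random-number library is called.

```cpp
#include <cassert>
#include <vector>

// Simulates `for (i = idx; i < bs; i += gridDim.x * blockDim.x)` on the host
// and counts how many times each batch index is visited.
std::vector<int> visits_per_batch(int bs, int grid_dim, int block_dim) {
  assert(bs >= 0 && grid_dim > 0 && block_dim > 0);
  std::vector<int> visits(bs, 0);
  const int stride = grid_dim * block_dim;  // total threads in the launch
  for (int block = 0; block < grid_dim; ++block) {
    for (int thread = 0; thread < block_dim; ++thread) {
      const int idx = block * block_dim + thread;  // assumed global index
      for (int i = idx; i < bs; i += stride) {
        ++visits[i];  // on the device, state[i] would be initialized here
      }
    }
  }
  return visits;
}
```

Every index in `[0, bs)` is visited exactly once, whether the launch has fewer or more threads than `bs`.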

YanhuiDua · 2024-07-31T03:33:53Z

paddle/phi/kernels/gpu/top_p_sampling_kernel.cu

-          PD_THROW(
-              "the input data shape has error in the topp_beam_topk kernel.");
-      }
+  if (mode == "truncate") {


The default mode of the Python interface in search.py needs to be changed to "truncate" accordingly.

optim fused_layer_norm and top_p_sampling

875fb27

yuanlehome added 4 commits July 4, 2024 08:43

update

c36e047

update

145e7f4

update

a1b5277

support hip

f96ba27

yuanlehome changed the title ~~optim fused_layer_norm and top_p_sampling~~ Optimizing the performance of fused_layer_norm and top_p_sampling operators Jul 4, 2024

YanhuiDua reviewed Jul 4, 2024

yuanlehome added 2 commits July 5, 2024 02:50

fix comment

ddf44dc

update

824c1cb

carryyu approved these changes Jul 5, 2024

yuanlehome merged commit a14bb2f into PaddlePaddle:develop Jul 5, 2024

yuanlehome mentioned this pull request Jul 8, 2024

Revert "Optimizing the performance of fused_layer_norm and top_p_sampling operators" #65800

Closed

YanhuiDua reviewed Jul 31, 2024
