[DCU] new features #63721
Conversation
Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Sorry to inform you that 3d36c6f's CIs passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.
Sorry to inform you that 5326497's CIs passed more than 7 days ago. To prevent PR conflicts, you need to re-run all CIs manually.
LGTM
@@ -106,7 +106,12 @@ void per_channel_quant(int8_t* output,
          static_cast<float>(current_weight_row[input_idx]);
      const float scaled_weight = round(weight_elt / col_scale);
      int int_weight = static_cast<int>(scaled_weight);
#ifdef PADDLE_WITH_HIP
      const int8_t clipped_weight =
          std::max(-7, std::min(7, int_weight)) + 8;
What is the reason for the +8 here?
Because int4 dequantization parses the high and low 4 bits as a uint8, shifting the values into the 1-15 range during quantization and subtracting the offset back during dequantization simplifies the int4 dequantization kernel. Otherwise, operations such as checking the sign bit would cause warp divergence. Since weight-only on DCU has no optimized fast cast, the weight-only quantization/dequantization flow differs considerably from the NVIDIA one.
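To illustrate that offset scheme, here is a minimal host-side C++ sketch, not code from this PR; the names quant_nibble, pack2, and unpack2 are hypothetical. Each signed int4 value is clipped to [-7, 7] and shifted by +8 into [1, 15], two values are packed into one uint8, and dequantization simply extracts each nibble and subtracts 8, so no sign-bit branch (and hence no warp divergence) is needed.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdio>

// Quantize: clip to [-7, 7], then add 8 so the value fits in an unsigned nibble (1..15).
static uint8_t quant_nibble(int v) {
  return static_cast<uint8_t>(std::max(-7, std::min(7, v)) + 8);
}

// Pack two shifted nibbles into one byte (low nibble first).
static uint8_t pack2(int lo, int hi) {
  return static_cast<uint8_t>(quant_nibble(lo) | (quant_nibble(hi) << 4));
}

// Dequantize: extract each nibble as an unsigned value and subtract the +8 offset.
// No sign-bit test is needed, so all lanes of a warp follow the same code path.
static void unpack2(uint8_t packed, int* lo, int* hi) {
  *lo = static_cast<int>(packed & 0x0F) - 8;
  *hi = static_cast<int>(packed >> 4) - 8;
}

int main() {
  uint8_t byte = pack2(-5, 7);
  int lo = 0, hi = 0;
  unpack2(byte, &lo, &hi);
  std::printf("packed=0x%02X lo=%d hi=%d\n", byte, lo, hi);  // expects lo=-5 hi=7
  return 0;
}
```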
@@ -255,25 +255,42 @@ void FlashAttnUnpaddedGradBaseKernel(
    kdq = &dq_tmp;
  }

#ifdef PADDLE_WITH_HIP
  std::initializer_list<int64_t> dk_dv_input_shape = {
What is the difference between dk_dv_input_shape and dk_dv_shape here, and why do the shapes differ?
Could the same variable name be used?
Fixed as suggested.
LGTM
LGTM
PR Category
Performance Optimization
PR Types
New features
Description
Support multiclass_nms3 op for DCU (unit tests pass)
Support MIOpen batch norm for DCU when FLAGS_cudnn_batchnorm_spatial_persistent is 1 (test_batch_norm_op/test_batch_norm_op_v2 unit tests pass)
Support GEMM fp16 compute type for DCU when FLAGS_gemm_use_half_precision_compute_type is 1
Support flash attention (MHA and GQA, forward and backward, unit tests pass)
Support block attention related ops (including prefix precache, unit tests pass)
Support a8w8 related ops (unit tests pass)
Support quant_linear related ops (unit tests pass)
Support KV cache int8 related ops (unit tests pass)
Support weight-only quantization/dequantization related ops (unit tests pass)
Support fused RoPE related ops (unit tests pass)