
block_multihead_attention support V100 GQA #68104


Merged
merged 9 commits into PaddlePaddle:develop on Sep 11, 2024

Conversation

@zhink (Contributor) commented Sep 9, 2024

PR Category

Inference

PR Types

New features

Description

pcard-71500
Building on PR #67485, block_multihead_attention now supports running GQA and MQA models (i.e., non-MHA models) on Volta and Turing GPUs. Combined with the previous PR, block_multihead_attention can run all stages on Volta and Turing GPUs.
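
For context (an illustrative sketch, not code from this PR): in GQA/MQA, several query heads share a single key/value head, whereas MHA gives every query head its own KV head, so the attention kernel must map each query head onto its KV group. A minimal NumPy sketch of that head-grouping logic follows; the function name, shapes, and arguments are assumptions for illustration and are not the block_multihead_attention API.

```python
import numpy as np

def gqa_attention(q, k, v, num_heads, kv_num_heads):
    """Toy grouped-query attention (GQA) over a single sequence.

    q:    [seq_len, num_heads, head_dim]
    k, v: [seq_len, kv_num_heads, head_dim]
    Each group of (num_heads // kv_num_heads) query heads shares one KV head:
    kv_num_heads == num_heads is plain MHA, kv_num_heads == 1 is MQA.
    """
    assert num_heads % kv_num_heads == 0
    group_size = num_heads // kv_num_heads
    head_dim = q.shape[-1]
    out = np.empty_like(q)
    for h in range(num_heads):
        kv_h = h // group_size  # map this query head onto its shared KV head
        scores = q[:, h, :] @ k[:, kv_h, :].T / np.sqrt(head_dim)
        # numerically stable softmax over the key dimension
        probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
        probs /= probs.sum(axis=-1, keepdims=True)
        out[:, h, :] = probs @ v[:, kv_h, :]
    return out

# Example: 8 query heads sharing 2 KV heads (GQA group size 4).
q = np.random.rand(16, 8, 64).astype("float32")
k = np.random.rand(16, 2, 64).astype("float32")
v = np.random.rand(16, 2, 64).astype("float32")
print(gqa_attention(q, k, v, num_heads=8, kv_num_heads=2).shape)  # (16, 8, 64)
```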


paddle-bot bot commented Sep 9, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhink changed the title from "V100" to "block_multihead_attention supports GQA models on V100 GPUs" on Sep 11, 2024
@zhink changed the title from "block_multihead_attention supports GQA models on V100 GPUs" to "block_multihead_attention support V100 GQA" on Sep 11, 2024
@zhoutianzi666 merged commit a601f00 into PaddlePaddle:develop on Sep 11, 2024
29 of 30 checks passed
icpcccpc added a commit to icpcccpc/Paddle that referenced this pull request Oct 12, 2024