[Dy2St] Optimize range_block_do performance #69834

@SigureMo SigureMo commented Nov 29, 2024

PR Category

Execute Infrastructure

PR Types

Performance

Description

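The forward/backward split in dynamic-to-static (Dy2St) currently takes a long time on some models with large Programs. With composite operators enabled, the test model (17686 forward OPs, 11558 backward OPs, 29244 OPs in total) needs 34s.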

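Currently, range_block_do evaluates it != list_offset(block, range[1]) on every iteration when checking the loop exit condition, which makes this step $O(N^2)$. As a result, the larger the model, the slower the split becomes.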

After this optimization, the forward/backward split completes within 100ms, which is essentially unnoticeable.

This PR also changes the range type from std::vector<int> to std::pair<size_t, size_t>, which is semantically clearer.
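The loop-condition issue described above can be illustrated with a minimal sketch. This is not the code from paddle/fluid/pybind/pir.cc: `Block` is a hypothetical stand-in (a `std::list<int>`), and the `list_offset`, `range_block_do_slow`, and `range_block_do_fast` signatures here are assumptions made for illustration. It assumes `list_offset` walks the list from its head, so each call costs time proportional to the offset.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <utility>

// Hypothetical stand-in for a block: a linked list of op ids.
using Block = std::list<int>;
using Range = std::pair<std::size_t, std::size_t>;

// Assumed helper: walks `offset` nodes from the head, so one call is O(offset).
Block::iterator list_offset(Block& block, std::size_t offset) {
  assert(offset <= block.size());
  return std::next(block.begin(), static_cast<std::ptrdiff_t>(offset));
}

// Slow variant: the end iterator is recomputed in the loop condition,
// so every iteration pays for another walk from the head (quadratic overall).
template <typename F>
void range_block_do_slow(Block& block, const Range& range, F fn) {
  for (auto it = list_offset(block, range.first);
       it != list_offset(block, range.second); ++it) {
    fn(*it);
  }
}

// Fast variant: the end iterator is computed once before the loop,
// so the whole traversal is linear in the block size.
template <typename F>
void range_block_do_fast(Block& block, const Range& range, F fn) {
  const auto end = list_offset(block, range.second);
  for (auto it = list_offset(block, range.first); it != end; ++it) {
    fn(*it);
  }
}
```

Both variants visit the same half-open range `[range.first, range.second)`. Caching the end iterator is only safe if `fn` does not erase the element that `end` points to; for a `std::list`, erasing other elements leaves it valid.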

PCard-66972

paddle-bot bot commented Nov 29, 2024

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@Copilot Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.

Files not reviewed (1)
  • paddle/fluid/pybind/pir.cc: Language not supported

@SigureMo SigureMo closed this Nov 29, 2024
@SigureMo SigureMo reopened this Nov 29, 2024
@SigureMo SigureMo merged commit 239715a into PaddlePaddle:develop Dec 2, 2024
28 of 29 checks passed
@SigureMo SigureMo deleted the dy2st/optimize-range-block-do-performance branch December 2, 2024 14:34