[Auto Parallel] parallel model and parallel optimizer #69443

FeixLiu · 2024-11-18T01:00:53Z

PR Category

Auto Parallel

PR Types

New features

Description

Pcard-73145
parallel model and parallel optimizer

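In PaddleNLP, the model must be initialized no later than the AMP decorate step, that is, before the Trainer.__init__ method, because AMP decorate is what actually performs operations such as casting the parameters.
In PaddleNLP, initializing the optimizer requires passing in an lr scheduler, and some branches of the lr scheduler configuration need the resume step stored in the checkpoint. As a result, the optimizer can only be initialized after the checkpoint is loaded, that is, inside trainer.train.
Consequently, under PaddleNLP's current framework the model and the optimizer cannot be initialized at the same place, so the intermediate-level API needs two separate interfaces, one to parallelize the model and one to parallelize the optimizer.
If PaddleNLP is updated in the future, for example by decoupling the optimizer from the lr scheduler and moving optimizer initialization into Trainer.__init__, the intermediate-level API will no longer need two separate interfaces for the model and the optimizer.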
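The ordering constraint described above can be illustrated with a minimal, framework-free sketch. Everything in it is a hypothetical stand-in: `parallelize_model`, `parallelize_optimizer`, `amp_decorate`, `load_checkpoint`, `build_lr_scheduler`, `build_optimizer`, and this toy `Trainer` are not the real Paddle or PaddleNLP APIs, and placing `amp_decorate` inside `Trainer.__init__` is an assumption made for illustration. The sketch only records the order in which the steps run.

```python
# Hypothetical stand-ins only; none of these are real Paddle/PaddleNLP calls.
events = []  # records the order in which each step runs


def parallelize_model(model):
    events.append("parallelize_model")
    return model


def parallelize_optimizer(optimizer):
    events.append("parallelize_optimizer")
    return optimizer


def amp_decorate(model):
    # Stand-in for the step that casts parameters.
    events.append("amp_decorate")
    return model


def load_checkpoint():
    events.append("load_checkpoint")
    return {"resume_step": 100}  # made-up value for illustration


def build_lr_scheduler(resume_step):
    events.append("build_lr_scheduler")
    return {"last_step": resume_step}


def build_optimizer(model, lr_scheduler):
    events.append("build_optimizer")
    return {"params": model["params"], "lr_scheduler": lr_scheduler}


class Trainer:
    def __init__(self, model):
        # Assumption: AMP decoration happens here, so the model must
        # already be initialized (and parallelized) before this point.
        self.model = amp_decorate(model)
        self.optimizer = None

    def train(self):
        # The lr scheduler needs the resume step, so the optimizer can
        # only be built, and then parallelized, after the checkpoint loads.
        state = load_checkpoint()
        scheduler = build_lr_scheduler(state["resume_step"])
        optimizer = build_optimizer(self.model, scheduler)
        self.optimizer = parallelize_optimizer(optimizer)


model = parallelize_model({"params": ["w", "b"]})
trainer = Trainer(model)
trainer.train()
print(events)
```

In this sketch the model is parallelized before the trainer is constructed, while the optimizer is parallelized only inside `train`, which is why a single combined call cannot cover both under the current layout.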

paddle-bot · 2024-11-18T01:00:57Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

jeff41404

OK, we can provide a demo in this format, and then see how to optimize it later

Single api

92aa5ef

FeixLiu force-pushed the paralllelize_model_and_opt branch from b560449 to 92aa5ef Compare November 18, 2024 01:40

jeff41404 approved these changes Nov 18, 2024

jeff41404 merged commit 4bdc30a into PaddlePaddle:develop Nov 18, 2024
27 of 28 checks passed
