zhangbo9674
Contributor

PR types

Performance optimization

PR changes

APIs

Describe

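The AmpScaler class scales the loss during mixed-precision training. Its member attribute `_found_inf` marks whether any parameter gradient contains inf in a given training iteration.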

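Previously, the framework code allocated two bool tensors via `to_variable` each time it called the check_finite_and_unscale op. As a result, every training iteration incurred a cudaMemcpy at that point, which hurt GPU performance:
[Image]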

After the optimization, the two bool tensors are declared and defined once during AmpScaler initialization, which eliminates the cudaMemcpy during training:
[Image]
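
To illustrate the idea in the description, here is a minimal pure-Python sketch of hoisting the two flag allocations out of the per-iteration path and into the constructor. It is not Paddle's code: `FakeDevice`, `PerStepScaler`, `PreallocatedScaler`, `run`, and the buffer names are hypothetical, and `FakeDevice.to_variable` only counts calls as a stand-in for the host-to-device copy that the real `to_variable` would trigger.

```python
import math


class FakeDevice:
    """Hypothetical stand-in that counts simulated host-to-device copies."""

    def __init__(self):
        self.h2d_copies = 0

    def to_variable(self, value):
        # Each call models creating a device tensor from a host value.
        self.h2d_copies += 1
        return [value]  # one-element mutable buffer


def _has_non_finite(grads):
    """Return True if any gradient value is inf or NaN."""
    return any(math.isinf(g) or math.isnan(g) for g in grads)


class PerStepScaler:
    """Before: two bool buffers are allocated on every unscale call."""

    def __init__(self, device):
        self._device = device
        self._found_inf = False

    def unscale(self, grads_a, grads_b):
        flag_a = self._device.to_variable(False)  # copy on every step
        flag_b = self._device.to_variable(False)  # copy on every step
        flag_a[0] = _has_non_finite(grads_a)
        flag_b[0] = _has_non_finite(grads_b)
        self._found_inf = flag_a[0] or flag_b[0]
        return self._found_inf


class PreallocatedScaler:
    """After: the two bool buffers are allocated once at initialization."""

    def __init__(self, device):
        self._device = device
        self._found_inf = False
        self._flag_a = device.to_variable(False)  # one-time copy
        self._flag_b = device.to_variable(False)  # one-time copy

    def unscale(self, grads_a, grads_b):
        # Reuse the existing buffers; no new allocation in the hot path.
        self._flag_a[0] = _has_non_finite(grads_a)
        self._flag_b[0] = _has_non_finite(grads_b)
        self._found_inf = self._flag_a[0] or self._flag_b[0]
        return self._found_inf


def run(scaler_cls, steps):
    """Run `steps` iterations; only the last one contains an inf gradient."""
    device = FakeDevice()
    scaler = scaler_cls(device)
    results = []
    for step in range(steps):
        grads_b = [float("inf")] if step == steps - 1 else [2.0]
        results.append(scaler.unscale([0.5, 1.0], grads_b))
    return device.h2d_copies, results
```

In this sketch both variants report the same `_found_inf` results, but the per-step variant's copy count grows with the number of iterations while the preallocated variant's stays fixed at two.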

@paddle-bot-old

paddle-bot-old bot commented Dec 1, 2021

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

Contributor

@zhiqiu zhiqiu left a comment


LGTM

@zhiqiu zhiqiu merged commit cc2b466 into PaddlePaddle:develop Dec 2, 2021
Zjq9409 pushed a commit to Zjq9409/Paddle that referenced this pull request Dec 10, 2021
@zhangbo9674 zhangbo9674 deleted the dev/loss_scaler_found_inf branch March 2, 2023 02:57
2 participants