
Fix synchronized memcpy in GPT #7008

Merged
merged 2 commits into PaddlePaddle:develop on Sep 19, 2023

Conversation

Wong4j
Contributor

@Wong4j Wong4j commented Sep 12, 2023

PR types

Performance optimization

PR changes

Others

Description

Avoid synchronized memcpy in GPT pretraining

@paddle-bot

paddle-bot bot commented Sep 12, 2023

Thanks for your contribution!

if loss_mask is None:
    # Build the mask on-device from the per-token loss (1.0 where loss > 0).
    loss_mask = (masked_lm_loss > 0).astype("float32")
loss_mask = loss_mask.reshape([-1])
# Mask-and-sum instead of boolean indexing, so no device-to-host copy is needed.
masked_lm_loss = paddle.sum(masked_lm_loss.reshape([-1]) * loss_mask)
Collaborator

Support for using custom loss mask?

Contributor Author

The main reason for this change is that the masked_lm_loss[masked_lm_loss > 0] form triggers a D2H copy. Computing masked_lm_loss by multiplying lm_loss with loss_mask is equivalent, but avoids the D2H copy.
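
For illustration, a minimal sketch with toy tensors (not the PR's actual code) showing that selecting the positive losses and masking-then-summing give the same value:

import paddle

# Toy per-token loss values; zeros mark padded/ignored positions.
masked_lm_loss = paddle.to_tensor([0.0, 1.5, 2.0, 0.0])

# Old form: boolean indexing, then sum. The data-dependent output shape
# forces a device-to-host copy.
old = paddle.sum(masked_lm_loss[masked_lm_loss > 0].astype("float32"))

# New form: multiply by a float mask, then sum. The output shape is static,
# so no device-to-host copy is needed.
loss_mask = (masked_lm_loss > 0).astype("float32")
new = paddle.sum(masked_lm_loss * loss_mask)

print(float(old), float(new))  # both 3.5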

Collaborator

Is it because of the slice op?

Collaborator

Note that we may need to pay attention to the data type of the final masked_lm_loss here; it should be float32.

Contributor

@Xreki Xreki Sep 13, 2023

In the original implementation, masked_lm_loss = masked_lm_loss[masked_lm_loss > 0].astype("float32"), the shape of the returned masked_lm_loss depends on the number of True values in the masked_lm_loss > 0 comparison. That count therefore has to be transferred back to the CPU, which requires a DtoH copy; the masked_lm_loss[masked_lm_loss > 0] form cannot avoid it.

The change in this PR avoids the getitem operation, implements the same functionality, and eliminates the DtoH copy.
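
As a hypothetical illustration of the shape dependence described above (toy tensors, not code from the PR):

import paddle

x = paddle.to_tensor([0.0, 1.5, 2.0, 0.0])

# Boolean getitem: the output shape equals the number of True elements, which
# is only known on the device, so it must be copied back to the host (DtoH).
print(x[x > 0].shape)  # [2], data-dependent

# Mask multiply: the output shape is fixed regardless of the values, so no
# DtoH copy is required.
print((x * (x > 0).astype("float32")).shape)  # [4], static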

Contributor Author

Note that we may need to pay attention to the data type of the final masked_lm_loss here; it should be float32.

The current form should guarantee float32, right?
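
A quick dtype check, assuming masked_lm_loss itself is float32 (toy example, not the PR's code):

import paddle

masked_lm_loss = paddle.to_tensor([0.0, 1.5, 2.0])   # float32 by default
loss_mask = (masked_lm_loss > 0).astype("float32")    # bool -> float32
print(paddle.sum(masked_lm_loss * loss_mask).dtype)   # paddle.float32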

Collaborator

@ZHUI ZHUI left a comment

LGTM

@codecov

codecov bot commented Sep 19, 2023

Codecov Report

Merging #7008 (2bd5a46) into develop (e49842c) will decrease coverage by 0.17%.
Report is 35 commits behind head on develop.
The diff coverage is 2.62%.

@@             Coverage Diff             @@
##           develop    #7008      +/-   ##
===========================================
- Coverage    60.06%   59.90%   -0.17%     
===========================================
  Files          552      554       +2     
  Lines        81755    81976     +221     
===========================================
  Hits         49105    49105              
- Misses       32650    32871     +221     
Files Changed Coverage Δ
paddlenlp/experimental/transformers/__init__.py 0.00% <0.00%> (ø)
...dlenlp/experimental/transformers/bloom/__init__.py 0.00% <0.00%> (ø)
...dlenlp/experimental/transformers/bloom/modeling.py 0.00% <0.00%> (ø)
...erimental/transformers/fused_transformer_layers.py 0.00% <0.00%> (ø)
...enlp/experimental/transformers/generation_utils.py 0.00% <0.00%> (ø)
paddlenlp/trainer/trainer.py 55.37% <100.00%> (+<0.01%) ⬆️
paddlenlp/transformers/gpt/modeling.py 66.51% <100.00%> (+0.15%) ⬆️

... and 1 file with indirect coverage changes

@ZHUI ZHUI merged commit 5748b69 into PaddlePaddle:develop Sep 19, 2023
9 checks passed
3 participants