[LLM] fix bug when masked_lm_loss is None in llama modeling.py #8348

Closed
wants to merge 1 commit

Conversation

@cqulilujia (Contributor) commented Apr 29, 2024

Change-Id: I7b6ba9248e61bee24eb463698af26727394f023a

PR types

Bug fixes

PR changes

llama modeling

Description

As described in issue #8299, when masked_lm_loss in the llama model's LlamaPretrainingCriterion class is empty, loss = paddle.mean(masked_lm_loss) cannot be computed, because mean does not support an input of shape == [0]. This fix restores the previous version's implementation.
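A minimal sketch of the failure mode described above, assuming a batch where every per-token loss gets filtered out (the shapes and values here are illustrative, not taken from the issue):

```python
import paddle

# All per-token losses are 0 (e.g. every position was ignored), so the
# positive-value filter leaves an empty tensor of shape [0].
masked_lm_loss = paddle.zeros([4], dtype="float32")
masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
print(masked_lm_loss.shape)  # [0]

# mean does not support an input of shape [0]; depending on the Paddle
# version this errors out or produces an invalid (nan) loss.
loss = paddle.mean(masked_lm_loss)
```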


paddle-bot bot commented Apr 29, 2024

Thanks for your contribution!

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


lilujia does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Have you already signed the CLA but the status is still pending? Let us recheck it.

@cqulilujia (Contributor, Author) commented Apr 30, 2024

Another possible fix is:

```python
masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
if masked_lm_loss.shape[0] == 0:
    loss = paddle.zeros([], dtype=masked_lm_loss.dtype)
    loss.stop_gradient = False
else:
    loss = paddle.mean(masked_lm_loss)
```
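Setting stop_gradient = False keeps the zero loss usable in a backward pass. For illustration, a quick check of this guarded version on both the empty and the non-empty case (a sketch; safe_mean_loss is a hypothetical wrapper around the snippet above, not code from modeling.py):

```python
import paddle

def safe_mean_loss(masked_lm_loss):
    # Hypothetical wrapper around the guarded fix above, for illustration only.
    masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
    if masked_lm_loss.shape[0] == 0:
        # Empty selection: return a 0-D zero so training can proceed;
        # stop_gradient = False keeps the loss usable in backward().
        loss = paddle.zeros([], dtype=masked_lm_loss.dtype)
        loss.stop_gradient = False
    else:
        loss = paddle.mean(masked_lm_loss)
    return loss

print(safe_mean_loss(paddle.to_tensor([0.0, 0.0])))       # empty case -> 0.0
print(safe_mean_loss(paddle.to_tensor([1.0, 3.0, 0.0])))  # normal case -> 2.0
```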

@Xreki (Contributor) commented May 6, 2024

> Another possible fix is: (the guarded snippet quoted above)

Are these two fixes equivalent? Could you add some logging to track down the problematic data?
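Such logging could look something like the following (a sketch, reusing the variable names from the snippets above; log_if_empty is a hypothetical helper, not code from modeling.py):

```python
import paddle

def log_if_empty(masked_lm_loss):
    # Hypothetical debug helper: report any batch whose loss tensor is
    # emptied by the > 0 filter, so the problematic data can be located.
    filtered = masked_lm_loss[masked_lm_loss > 0]
    if filtered.shape[0] == 0:
        print(
            "[debug] masked_lm_loss emptied by the > 0 filter: "
            f"shape={masked_lm_loss.shape}, "
            f"min={float(masked_lm_loss.min())}, "
            f"max={float(masked_lm_loss.max())}"
        )
    return filtered
```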

@cqulilujia (Contributor, Author) commented

> Are these two fixes equivalent? Could you add some logging to track down the problematic data?

Judging only from the cases I have run into so far, the two give identical results; functionally, the second one is closer to the original implementation. Also, I see that PR #8342 has already fixed this issue, using the second approach.
