[LLM] fix bug when masked_lm_loss is None in llama modeling.py #8348

Closed
wants to merge 1 commit

Conversation

@cqulilujia (Contributor) commented Apr 29, 2024

Change-Id: I7b6ba9248e61bee24eb463698af26727394f023a

PR types

Bug fixes

PR changes

llama modeling

Description

As described in issue #8299, when masked_lm_loss in the llama model's LlamaPretrainingCriterion class is empty, loss = paddle.mean(masked_lm_loss) cannot be computed, because mean does not support an input of shape == [0]. This fix restores the previous version's implementation.
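A minimal sketch of the failure mode described above, assuming a batch where every per-token loss gets filtered out (the shapes and values here are illustrative, not taken from the issue):

```python
import paddle

# All per-token losses are 0 (e.g. every position was ignored), so the
# positive-value filter leaves an empty tensor of shape [0].
masked_lm_loss = paddle.zeros([4], dtype="float32")
masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
print(masked_lm_loss.shape)  # [0]

# mean does not support an input of shape [0]; depending on the Paddle
# version this errors out or produces an invalid (nan) loss.
loss = paddle.mean(masked_lm_loss)
```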


paddle-bot bot commented Apr 29, 2024

Thanks for your contribution!

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


lilujia does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
Have you already signed the CLA but the status is still pending? Let us recheck it.

@cqulilujia (Contributor, Author) commented Apr 30, 2024

Another possible fix is:

```python
masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
if masked_lm_loss.shape[0] == 0:
    loss = paddle.zeros([], dtype=masked_lm_loss.dtype)
    loss.stop_gradient = False
else:
    loss = paddle.mean(masked_lm_loss)
```
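Setting stop_gradient = False keeps the zero loss usable in a backward pass. For illustration, a quick check of this guarded version on both the empty and the non-empty case (a sketch; safe_mean_loss is a hypothetical wrapper around the snippet above, not code from modeling.py):

```python
import paddle

def safe_mean_loss(masked_lm_loss):
    # Hypothetical wrapper around the guarded fix above, for illustration only.
    masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
    if masked_lm_loss.shape[0] == 0:
        # Empty selection: return a 0-D zero so training can proceed;
        # stop_gradient = False keeps the loss usable in backward().
        loss = paddle.zeros([], dtype=masked_lm_loss.dtype)
        loss.stop_gradient = False
    else:
        loss = paddle.mean(masked_lm_loss)
    return loss

print(safe_mean_loss(paddle.to_tensor([0.0, 0.0])))       # empty case -> 0.0
print(safe_mean_loss(paddle.to_tensor([1.0, 3.0, 0.0])))  # normal case -> 2.0
```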

@Xreki (Contributor) commented May 6, 2024

> Another possible fix is: (the guarded snippet quoted above)

Are these two fixes equivalent? Could you add some logging to track down the problematic data?
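Such logging could look something like the following (a sketch, reusing the variable names from the snippets above; log_if_empty is a hypothetical helper, not code from modeling.py):

```python
import paddle

def log_if_empty(masked_lm_loss):
    # Hypothetical debug helper: report any batch whose loss tensor is
    # emptied by the > 0 filter, so the problematic data can be located.
    filtered = masked_lm_loss[masked_lm_loss > 0]
    if filtered.shape[0] == 0:
        print(
            "[debug] masked_lm_loss emptied by the > 0 filter: "
            f"shape={masked_lm_loss.shape}, "
            f"min={float(masked_lm_loss.min())}, "
            f"max={float(masked_lm_loss.max())}"
        )
    return filtered
```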

@cqulilujia (Contributor, Author) commented

> Are these two fixes equivalent? Could you add some logging to track down the problematic data?

Judging only from the cases I have run into so far, the two give identical results; functionally, the second one is closer to the original implementation. Also, I see that PR #8342 has already fixed this issue, using the second approach.
