Question about data concatenation in the SFT stage #23

Open
SymbolZH opened this issue May 12, 2024 · 1 comment

Comments

@SymbolZH

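Hello, Section 3.2 of the paper mentions that the training data is randomly concatenated to a length of 4k tokens. Does this mean the SFT data is concatenated into the form (q0, a0, q1, a1, ...) and the loss is then computed only on the answer parts?
Thank you very much for your work~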

@qianxianyang

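This is probably similar to what is done in pretraining: every sample in the batch is concatenated up to the context length, which improves training efficiency.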
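
To make the idea concrete, here is a minimal sketch of this kind of packing, written under assumptions that this thread does not confirm. It assumes the loss is masked on question tokens, as the original question supposes, and that examples are shuffled and packed greedily. The function name `pack_examples`, the `(question_ids, answer_ids)` input format, and the greedy strategy are all hypothetical and are not taken from the project's code. The value `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, used here only as a familiar convention.

```python
import random

# Label value for positions excluded from the loss. -100 is the default
# ignore_index of torch.nn.CrossEntropyLoss; using it here is an assumption.
IGNORE_INDEX = -100


def pack_examples(examples, max_len=4096, seed=0):
    """Shuffle (question_ids, answer_ids) pairs and greedily pack them.

    Hypothetical helper: returns a list of (input_ids, labels) pairs, each at
    most max_len tokens long. Question positions get IGNORE_INDEX as label,
    so only answer tokens would contribute to the loss.
    """
    rng = random.Random(seed)
    order = list(examples)
    rng.shuffle(order)  # "random" concatenation: shuffle before packing

    packed = []
    input_ids, labels = [], []
    for question, answer in order:
        ids = list(question) + list(answer)
        lab = [IGNORE_INDEX] * len(question) + list(answer)
        # Truncate a single example that is longer than the context length.
        ids, lab = ids[:max_len], lab[:max_len]

        # Start a new packed sequence if this example would not fit.
        if input_ids and len(input_ids) + len(ids) > max_len:
            packed.append((input_ids, labels))
            input_ids, labels = [], []

        input_ids = input_ids + ids
        labels = labels + lab

    if input_ids:
        packed.append((input_ids, labels))
    return packed
```

In this sketch the labels are not shifted; a causal language-model loss would normally shift them by one position. Tokens from different examples in the same packed sequence can also attend to each other unless an additional attention mask is applied, and whether the paper does that is not stated here.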
