interleave_datasets does not use the same seed across processes #1218

Closed
gauss-clb opened this issue Oct 18, 2023 · 1 comment
Labels
solved This problem has been already solved

Comments

gauss-clb commented Oct 18, 2023

https://github.com/hiyouga/LLaMA-Factory/blob/main/src/llmtuner/dsets/loader.py#L92C16-L92C35

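This causes the number of samples to differ across GPUs. Could that be why training hangs when it is about to finish?
Also, because the seeds differ, many samples are used repeatedly while many others may never be used, which conflicts with how distributed training shards the data.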
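
To make the concern concrete, here is a minimal stdlib-only sketch. It is not the LLaMA-Factory loader or the `datasets` implementation: `interleave` and `shard` are hypothetical helpers that only assume probability-based sampling that stops once a source runs out, followed by a per-rank slice. If every rank builds the mixture with the same seed, the slices are disjoint and cover the whole mixture. If each rank uses its own seed, the mixtures (and possibly their lengths) can differ, so the slices may overlap or skip samples.

```python
import random


def interleave(sources, probabilities, seed):
    """Hypothetical helper: draw from `sources` by probability and stop
    as soon as the chosen source has no items left."""
    rng = random.Random(seed)
    cursors = [0] * len(sources)
    mixed = []
    while True:
        idx = rng.choices(range(len(sources)), weights=probabilities)[0]
        if cursors[idx] >= len(sources[idx]):
            break
        mixed.append(sources[idx][cursors[idx]])
        cursors[idx] += 1
    return mixed


def shard(data, rank, world_size):
    """Hypothetical helper: each rank keeps every world_size-th sample."""
    return data[rank::world_size]


sources = [[f"a{i}" for i in range(50)], [f"b{i}" for i in range(50)]]
probs = [0.5, 0.5]
world_size = 2

# Same seed on every rank: all ranks see one identical mixture.
same = [interleave(sources, probs, seed=42) for _ in range(world_size)]
same_shards = [shard(same[r], r, world_size) for r in range(world_size)]

# Different seed per rank: each rank builds its own mixture.
diff = [interleave(sources, probs, seed=42 + r) for r in range(world_size)]
diff_shards = [shard(diff[r], r, world_size) for r in range(world_size)]

print("same seed, mixture lengths:", [len(m) for m in same])
print("same seed, shard overlap:", len(set(same_shards[0]) & set(same_shards[1])))
print("diff seed, mixture lengths:", [len(m) for m in diff])
print("diff seed, shard overlap:", len(set(diff_shards[0]) & set(diff_shards[1])))
```

In the same-seed case the overlap is always zero by construction. In the different-seed case the printed lengths and overlap depend on the seeds chosen.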

hiyouga added the bug and pending labels on Oct 19, 2023
hiyouga added the solved label and removed the bug and pending labels on Oct 19, 2023
hiyouga (Owner) commented Oct 19, 2023

Fixed.
