support DataLoader with multi-process mode on MacOs and Windows basically #35854
❌ The PR is not created using PR's template. You can refer to this Demo.
Thanks for your contribution!
Sorry to inform you that 109ca9d's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
Sorry for noticing this PR so late. Could you resolve the conflicting files first so that CI can pass? @wwqgtxx
The DataLoader unit tests are currently disabled on Mac/Windows. Could you please remove these lines to re-enable the DataLoader unit tests on Mac/Windows so that they run in the unit-test CI?
Do you try |
In fact, when trying to use |
@wwqgtxx Discussed with @heavengate:
or sys.platform == 'win32'):
    warnings.warn(
        "DataLoader with multi-process mode is not fully supported on MacOs and Windows currently." \
        " Please use multi-process mode with use_shared_memory = False instead")
@wwqgtxx
Several unit tests are still failing in CI. Could you keep working on fixing them? For example, this snippet causes test_dataloader_autotune to fail:
https://xly.bce.baidu.com/paddlepaddle/paddle/newipipe/detail/6548861/job/18199166
2022-09-06 16:54:46 C:\home\workspace\Paddle\build\python\paddle\fluid\reader.py:492: UserWarning: DataLoader with multi-process mode is not fully supported on MacOs and Windows currently. Please use multi-process mode with use_shared_memory = False instead
2022-09-06 16:54:46 "DataLoader with multi-process mode is not fully supported on MacOs and Windows currently." \
2022-09-06 16:54:46 test_dataloader_autotune failed
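I have looked at the CI logs as well. Most of the unit-test failures happen because some class cannot be pickled, just like the Tensor case mentioned earlier. This is probably because some of Paddle's basic types are implemented in C/C++ and simply cannot be pickled.
Personally, I feel that as "basically" support, the current implementation is just about sufficient (I use it to run PaddleDetection training and have not hit any errors).
Fully resolving the problem above would mean adding the missing pickle-related interfaces to those native types, which is a relatively large amount of work.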
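As background for the pickling point: on macOS and Windows, Python's `multiprocessing` uses the spawn start method by default, so objects handed to worker processes are pickled. The sketch below shows the general shape of "adding pickle interfaces" to a type that holds an unpicklable member. `NativeHandle` and `PicklableHandle` are hypothetical stand-ins, not Paddle classes, and a real native type would need its own serialization logic.

```python
import pickle
import threading


class NativeHandle:
    """Hypothetical stand-in for a native-backed type that cannot be pickled."""

    def __init__(self, values):
        self.values = list(values)
        self._lock = threading.Lock()  # lock objects cannot be pickled


class PicklableHandle(NativeHandle):
    """Same data, plus explicit pickle hooks that skip the lock."""

    def __getstate__(self):
        # Serialize only the plain-Python payload.
        return {"values": self.values}

    def __setstate__(self, state):
        # Rebuild the unpicklable member on the receiving side.
        self.values = state["values"]
        self._lock = threading.Lock()


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError):
        return False
```

Here `can_pickle(NativeHandle([1, 2]))` fails because of the lock member, while the subclass round-trips through `pickle.dumps` and `pickle.loads` with its `values` intact.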
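"Fully resolving the problem above would mean adding the missing pickle-related interfaces to those native types, which is a relatively large amount of work."
- That is indeed a fairly large amount of work. If you are interested, you are very welcome to help us fix it.
- We run an organization called the Paddle Framework Contributor Club (PFCC), which contributes to the PaddlePaddle framework on an ongoing basis through regular technical sharing sessions and developer-led tasks. For details, see the description on the https://github.com/luotao1 profile page.
- How to join: submit a PR to Paddle. Once your code is merged, we will invite you to join PFCC. We look forward to having you!
From the PaddlePaddle PFCC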
https://github.com/PaddlePaddle/Paddle/pull/47025/files https://github.com/PaddlePaddle/Paddle/pull/48179/files
This PR changes multiprocessing to be imported by default.
#37302 is the PR that added multiprocessing support. You could adapt some of its changes to support the Mac or Windows platforms.
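We are sorry, but after repeated discussion, your PR does not yet meet the merge criteria. Please read the PaddlePaddle native operator development specification (飞桨原生算子开发规范). You are welcome to submit a new PR. We are closing this PR for now. Thank you for your contribution.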
PR types
New features
PR changes
APIs
Describe
After reading the code, I found that we can simply comment out some platform-specific code to make DataLoader's multi-process mode work on macOS and Windows.