Logical slice ops support all nd_sbp #8313
…nto feat-slice_ops_support_2d_sbp
…c/oneflow into feat-slice_ops_support_2d_sbp
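If Slice update also restricts value to broadcast whenever the slice is not a full slice, wouldn't logical slice update and slice update become the same thing, except that the kernel has to run the derivation once more?
The SliceUpdate kernel does not implement any complicated logic. It only guarantees that all the required data is on the local rank, then copies value over directly. It does not support S+B->S computation, and the kernel only works correctly for a full slice, which is why LogicalSliceAssign was added later.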
…nto feat-logical_slice_ops_support_all_sbp
CI failed when running job: cpu-module. PR label automerge has been removed
test_case = <test_consistent_argmin.TestArgmin testMethod=test_argmin> produced a wrong result. It does not reproduce locally and is unrelated to this PR, so I am rerunning CI.
CI failed when running job: cpu-module. PR label automerge has been removed
…ops_support_all_sbp
28e1124 to 44be540
argmin is still failing, rerunning.
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8313/
LogicalSliceAssign and LogicalSlice now support all nd_sbp inputs.
LogicalSliceAssign
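The sbp combinations supported by LogicalSliceAssign are (y = logical_slice_assign(ref, slice, value)):
That is, value is guaranteed to hold all of the data on every rank. The kernel then derives which data blocks need to be copied (for Broadcast and PartialSum the copy is a full copy, for Split the region has to be derived).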
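The Split case can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions, not the kernel in this PR: it is 1-D, it assumes ref is split along axis 0 in balanced contiguous shards, value is broadcast (every rank sees all of it), and the step is positive. The helper names `shard_ranges` and `slice_assign_split_ref` are hypothetical.

```python
def shard_ranges(n, num_ranks):
    """Balanced contiguous split of n elements: one (begin, end) pair per rank."""
    base, rem = divmod(n, num_ranks)
    ranges, begin = [], 0
    for rank in range(num_ranks):
        end = begin + base + (1 if rank < rem else 0)
        ranges.append((begin, end))
        begin = end
    return ranges


def slice_assign_split_ref(ref, value, start, stop, step, num_ranks):
    """Simulate ref[start:stop:step] = value with ref split and value broadcast.

    Returns the updated local shard of every rank.
    """
    shards = []
    for begin, end in shard_ranges(len(ref), num_ranks):
        local = list(ref[begin:end])  # this rank's piece of ref
        # Walk the logical slice and keep only positions that land in this shard.
        for value_idx, dst_idx in enumerate(range(start, stop, step)):
            if begin <= dst_idx < end:
                local[dst_idx - begin] = value[value_idx]
        shards.append(local)
    return shards


ref = [0] * 10
value = [1, 2, 3, 4]
shards = slice_assign_split_ref(ref, value, 2, 10, 2, num_ranks=3)
merged = [v for shard in shards for v in shard]  # concatenate the shards
```

Each rank only touches the part of the slice that overlaps its own shard, which is why value must be fully available on every rank.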
LogicalSlice
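The sbp combinations supported by LogicalSlice are (y = logical_slice(x, slice)):
B->B and P->P are straightforward: the sliced data block is copied in full. S->P works as follows: allocate the full memory for y and initialize every value to 0, then have each rank derive the SliceView that its part of x occupies in that memory and copy the corresponding elements. Diagram: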
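The S->P path described above can be sketched as a small pure-Python simulation. It is an illustration under assumptions rather than the actual kernel: 1-D only, x split along axis 0 in balanced contiguous shards, positive step. The helper names `shard_ranges`, `logical_slice_s_to_p` and `reduce_partial_sum` are hypothetical.

```python
def shard_ranges(n, num_ranks):
    """Balanced contiguous split of n elements: one (begin, end) pair per rank."""
    base, rem = divmod(n, num_ranks)
    ranges, begin = [], 0
    for rank in range(num_ranks):
        end = begin + base + (1 if rank < rem else 0)
        ranges.append((begin, end))
        begin = end
    return ranges


def logical_slice_s_to_p(x, start, stop, step, num_ranks):
    """Simulate y = x[start:stop:step] with x split and y partial-sum.

    Returns one full-size y buffer per rank.
    """
    out_len = len(range(start, stop, step))
    partials = []
    for begin, end in shard_ranges(len(x), num_ranks):
        local_x = x[begin:end]  # this rank's piece of x
        y = [0] * out_len       # full output buffer, zero-initialized
        for out_idx, src_idx in enumerate(range(start, stop, step)):
            if begin <= src_idx < end:  # element lives on this rank
                y[out_idx] = local_x[src_idx - begin]
        partials.append(y)
    return partials


def reduce_partial_sum(partials):
    """Elementwise sum over ranks, i.e. what resolving the P sbp would yield."""
    return [sum(column) for column in zip(*partials)]


x = list(range(10))
partials = logical_slice_s_to_p(x, 1, 9, 2, num_ranks=3)
y = reduce_partial_sum(partials)
```

Because every output position is written by exactly one rank and is 0 everywhere else, summing the per-rank buffers reproduces the logical slice.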
Differences from SliceUpdate/SliceOp
TODO