stateful local kernel supports consistent #5789
Signed-off-by: daquexian <daquexian566@gmail.com>
oneflow/core/framework/op_interpreter/eager_consistent_op_interpreter.cpp
…t_consistent Signed-off-by: daquexian <daquexian566@gmail.com>
Force-pushed: 0961cde to 8e1cd8a
@@ -17,16 +17,15 @@
import os
import unittest

import oneflow.compatible.single_client.unittest
Moved python/oneflow/compatible/single_client/test/ops/test_stateful_local_kernel.py here, since it should be a multi client test. Also added tests that cover the newly implemented interfaces.
namespace oneflow {
namespace one {

template<class T>
class InputAndOutputListScope {
This class is not general-purpose enough to be worth keeping, so I deleted it outright.
Force-pushed: 8e1cd8a to eaad668
oneflow/core/job/parallel_desc.cpp
@@ -183,6 +198,11 @@ Maybe<Symbol<Device>> GetDevice4CurrentProcessCtx(Symbol<ParallelDesc> parallel_
  return device_iter->second;
}

std::shared_ptr<ParallelContext> GetParallelContext4CurrentProcessCtx(
    Symbol<ParallelDesc> parallel_desc) {
  return DECORATE(&RawGetParallelContext4CurrentProcessCtx, ThreadLocalCopiable)(parallel_desc);
Use ThreadLocal.
ThreadLocal should be the default, instinctive choice: it checks that every argument is a scalar, which rules out potential problems.
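To illustrate the point about the scalar check, here is a minimal sketch of a thread-local caching decorator that rejects non-scalar arguments at compile time. It is not OneFlow's implementation: `ThreadLocalCached`, `AllScalar`, `SlowAdd`, and `CachedAdd` are hypothetical names, and the real `DECORATE(..., ThreadLocal)` machinery may work quite differently.

```cpp
#include <cassert>
#include <map>
#include <tuple>
#include <type_traits>

// Compile-time check that every type in the pack is scalar.
template<typename... Ts>
struct AllScalar : std::true_type {};
template<typename T, typename... Ts>
struct AllScalar<T, Ts...>
    : std::integral_constant<bool, std::is_scalar<T>::value && AllScalar<Ts...>::value> {};

// Hypothetical stand-in for a ThreadLocal-style decorator (not OneFlow's code).
template<typename Ret, typename... Args>
struct ThreadLocalCached {
  // Scalar keys are cheap to copy and compare and cannot alias mutable state,
  // so anything else is rejected before the program even compiles.
  static_assert(AllScalar<Args...>::value, "every argument must be a scalar type");

  template<Ret (*func)(Args...)>
  static Ret Call(Args... args) {
    // One cache per thread and per decorated function.
    thread_local std::map<std::tuple<Args...>, Ret> cache;
    const std::tuple<Args...> key(args...);
    auto iter = cache.find(key);
    if (iter == cache.end()) { iter = cache.emplace(key, func(args...)).first; }
    return iter->second;
  }
};

// Hypothetical expensive function; the counter only exists to observe caching.
int g_call_count = 0;
int SlowAdd(int a, int b) {
  ++g_call_count;
  return a + b;
}

// Decorated entry point: repeated calls with the same arguments hit the cache.
int CachedAdd(int a, int b) { return ThreadLocalCached<int, int, int>::Call<&SlowAdd>(a, b); }
```

In this sketch, calling `CachedAdd` twice with the same arguments on one thread runs `SlowAdd` only once, while instantiating `ThreadLocalCached` with a non-scalar argument type such as `std::string` fails the `static_assert`.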
When the eager consistent op interpreter calls a stateful local opkernel, it now passes the consistent tensor meta, which supplies the sbp, logical shape, and so on. This fixes ops such as logical slice crashing on consistent tensors.
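As a rough illustration of why a local kernel needs the logical shape and sbp, the sketch below maps a logical slice onto one rank's local shard. It assumes the tensor is split along the sliced axis, that the split is balanced, and that the slice step is 1. `Range`, `LocalRangeOfSplitAxis`, and `LogicalSliceToLocal` are hypothetical helpers, not code from this PR or from OneFlow's logical slice kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Half-open interval [begin, end).
struct Range {
  int64_t begin;
  int64_t end;
};

// Assumed balanced split of [0, logical_dim) across parallel_num ranks:
// the first (logical_dim % parallel_num) ranks hold one extra element.
Range LocalRangeOfSplitAxis(int64_t logical_dim, int64_t parallel_num, int64_t parallel_id) {
  const int64_t base = logical_dim / parallel_num;
  const int64_t remainder = logical_dim % parallel_num;
  const int64_t begin = parallel_id * base + std::min(parallel_id, remainder);
  const int64_t size = base + (parallel_id < remainder ? 1 : 0);
  return {begin, begin + size};
}

// Intersect a logical slice with this rank's shard and express the result
// in local coordinates. An empty result is returned as {0, 0}.
Range LogicalSliceToLocal(Range logical_slice, Range local_range) {
  const int64_t begin = std::max(logical_slice.begin, local_range.begin);
  const int64_t end = std::min(logical_slice.end, local_range.end);
  if (begin >= end) { return {0, 0}; }
  return {begin - local_range.begin, end - local_range.begin};
}
```

Neither function can be evaluated from the local tensor alone: the logical dimension and the rank layout have to come from somewhere, which is the role the consistent tensor meta plays in the description above.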