-
Notifications
You must be signed in to change notification settings - Fork 825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add NaiveB2PSubTskGphBuilder #3942
Conversation
piece_id_ += 1; | ||
} | ||
|
||
int64_t piece_id_ = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
去掉 = 0,因为这个特性其实没有得到足够的支持,这么写了会让人误以为不用再初始化。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
去掉 = 0,因为这个特性其实没有得到足够的支持,这么写了会让人误以为不用再初始化。
已修改
|
||
LogicalBlobId lbi_; | ||
Shape shape_; | ||
DataType data_type_ = DataType::kInvalidDataType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同理
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同理
已修改
int64_t piece_id_; | ||
}; | ||
|
||
REGISTER_ACTOR(TaskType::kBoxingZeros, BoxingZerosActor); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是不是直接注册成NaiveActor就行
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是不是直接注册成NaiveActor就行
这个是source actor,用naive有个piece_id的问题
CompTaskNode* dst_node = sorted_dst_comp_tasks.at(dst_node_idx); | ||
const int64_t nearest_src_node_idx = | ||
SubTskGphBuilderUtil::FindNearestNodeIndex(sorted_src_comp_tasks, dst_node); | ||
CHECK_NE(nearest_src_node_idx, -1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议CHECK_NE_OR_RETURN,能不崩溃就不崩溃。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
建议CHECK_NE_OR_RETURN,能不崩溃就不崩溃。
已修改
SubTskGphBuilderUtil::FindNearestNodeIndex(sorted_src_comp_tasks, dst_node); | ||
CHECK_NE(nearest_src_node_idx, -1); | ||
CompTaskNode* nearest_src_node = sorted_src_comp_tasks.at(nearest_src_node_idx); | ||
dst_node2nearest_src_node.emplace(dst_node, nearest_src_node); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里可以随手CHECK_OR_RETURN(.second)吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里可以随手CHECK_OR_RETURN(.second)吗?
已修改
return Maybe<void>::Ok(); | ||
} | ||
|
||
REGISTER_OP(OperatorConf::kBoxingZerosConf, BoxingZerosOp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
需要像constant_op一样标记内存不能复用。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
需要像constant_op一样标记内存不能复用。
这个op是由BoxingZerosTaskNode new出来的,BoxingZerosTaskNode已经标记了内存不参与复用,标记内存复用是NormalForward的逻辑吧
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这样有点依赖这个上下文,op/kernel应该自洽才对。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这样有点依赖这个上下文,op/kernel应该自洽才对。
只写一次的逻辑由kernel迁移到了actor控制,这样task/actor是对应的,op/kernel无需关系内存是否复用
const KernelCtx& ctx, std::function<Blob*(const std::string&)> BnInOp2Blob) const { | ||
if (!inited_) { | ||
Blob* out = BnInOp2Blob("out"); | ||
Memset<device_type>(ctx.device_ctx, out->mut_dptr(), 0, out->ByteSizeOfBlobBody()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果memset性能很高,是不是把它当普通kernel了,这样也能它和别人内存共享。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果memset性能很高,是不是把它当普通kernel了,这样也能它和别人内存共享。
目前的场景中,少一次kernel luanch对性能很关键,这里未来可以根据size做优化策略
|
||
namespace oneflow { | ||
|
||
class BoxingZerosOp : public Operator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不直接用zero_like是因为考虑到这种boxing不一定是121吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不直接用zero_like是因为考虑到这种boxing不一定是121吗?
是的
* Add NaiveB2PSubTskGphBuilder * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Former-commit-id: 9f504df
添加B->P的boxing支持,方法是找到上下游之间距离最近的一对节点,下游节点消费上游的数据;其他下游节点以同样形状的0作为输入,并通过ctrl连接到上游。