add_eager_boxing_and_op_interpreter_dispatch_error_info #5819

Merged

Conversation

Contributor

@clackhan clackhan commented Aug 10, 2021

Add error messages for eager boxing and op_interpreter dispatch.

<< "Eager boxing type \'" << ParallelDistributionToString(in_parallel_distribution)
<< " -> " << ParallelDistributionToString(out_parallel_distribution) << "\'"
<< "not support yet\n"
<< "============ Supported eager boxing type============\n"
Contributor

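Could this be extracted into a shared function that returns the message? The set of supported boxing types will probably grow, and then you would not have to change three places.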

Contributor Author

A shared function is now provided.
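
For readers following along, the kind of shared helper discussed here might look roughly like the sketch below. This is not the code from the PR: `SupportedBoxingTypes`, `UnsupportedBoxingErrorString`, and the table entries are hypothetical placeholders, and the message wording is tidied slightly. The point is only that the list of supported types lives in one place, so adding a type means editing one table instead of three call sites.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical table of supported conversions. The entries are placeholders,
// not the real list from the PR.
const std::vector<std::pair<std::string, std::string>>& SupportedBoxingTypes() {
  static const std::vector<std::pair<std::string, std::string>> types = {
      {"<in_type_1>", "<out_type_1>"},
      {"<in_type_2>", "<out_type_2>"},
  };
  return types;
}

// Hypothetical helper: builds the whole "not supported" message in one place.
// `in` and `out` are the already-stringified parallel distributions.
std::string UnsupportedBoxingErrorString(const std::string& in, const std::string& out) {
  std::ostringstream ss;
  ss << "Eager boxing type '" << in << " -> " << out << "' is not supported yet\n"
     << "============ Supported eager boxing types ============\n";
  for (const auto& type : SupportedBoxingTypes()) {
    ss << "'" << type.first << " -> " << type.second << "'\n";
  }
  return ss.str();
}
```

Each call site would then stream the result of `UnsupportedBoxingErrorString(...)` instead of repeating the literal text.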

<< "Got tensors with inconsistent attributes!\n"
<< "op_type_name: " << op_expr.op_type_name() << "\n"
<< "first input tensor: local\n"
<< "secind input tensor: consistent"; // unroll loop for efficiency
Contributor

secind -> second? Please check the comments for grammar and spelling errors.

Contributor Author

@clackhan clackhan Aug 10, 2021

😂 Slip of the hand; the spelling error is fixed.

<< "Got tensors with inconsistent attributes!\n"
<< "op_type_name: " << op_expr.op_type_name() << "\n"
<< "first input tensor: consistent\n"
<< "seciod input tensor: local"; // unroll loop for efficiency
Contributor

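secind, seciod: you really have every possible spelling here 😂 There is actually no need to repeat this much. Turn it into a function of the form op_expr, inputs -> ErrorString. There is also no need to distinguish first from second: iterate over inputs, print whether each tensor is local or consistent, and say that they are inconsistent, since they must be either all local or all consistent.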
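
A minimal sketch of the suggested op_expr, inputs -> ErrorString function follows. It is an illustration under assumptions, not the PR's implementation: `TensorSketch` is a hypothetical stand-in for the real tensor type, the op name is passed as a plain string rather than taken from `op_expr`, and `InconsistentInputsErrorString` is an invented name.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a tensor; only the local/consistent flag matters here.
struct TensorSketch {
  bool is_local;
};

// Hypothetical helper: one message builder for any number of inputs,
// so no call site has to spell out "first" or "second" by hand.
std::string InconsistentInputsErrorString(const std::string& op_type_name,
                                          const std::vector<TensorSketch>& inputs) {
  std::ostringstream ss;
  ss << "Got tensors with inconsistent attributes!\n"
     << "op_type_name: " << op_type_name << "\n"
     << "Inputs must be either all local or all consistent, but got:\n";
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    ss << "input " << i << ": " << (inputs[i].is_local ? "local" : "consistent") << "\n";
  }
  return ss.str();
}
```

Because the loop indexes the inputs, the hand-typed ordinal words that attracted the spelling errors disappear entirely.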

Contributor Author

@clackhan clackhan Aug 10, 2021

When the size of inputs is 3 or less, checking directly is a bit more efficient; inputs is only traversed when the size is greater than 3.

Contributor

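It actually does not affect efficiency. If you put the function call in the output stream after the CHECK, it is only triggered when the CHECK fails, not on every call. So there is zero overhead when nothing goes wrong, and when something does go wrong, the overhead no longer matters.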
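
The lazy-evaluation argument can be demonstrated with a toy macro. The sketch below is not glog's or OneFlow's CHECK; `SKETCH_CHECK`, `FailureMessage`, and `DescribeInputs` are hypothetical names, and unlike a real CHECK the macro records the message instead of aborting. It does use the same `if (cond) {} else stream << ...` shape that stream-style check macros commonly rely on, which is why the operands streamed after the check are evaluated only on failure.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Collects the streamed message and stores it in *sink when the
// temporary is destroyed at the end of the full expression.
class FailureMessage {
 public:
  explicit FailureMessage(std::string* sink) : sink_(sink) {}
  ~FailureMessage() { *sink_ = ss_.str(); }
  std::ostream& stream() { return ss_; }

 private:
  std::string* sink_;
  std::ostringstream ss_;
};

// Toy check macro: the else branch, including everything streamed into it
// by the caller, only runs when cond is false.
#define SKETCH_CHECK(cond, sink) \
  if (cond) {                    \
  } else                         \
    FailureMessage{sink}.stream() << "Check failed: " #cond " "

// Counts how often the (potentially expensive) message builder runs.
int g_describe_calls = 0;

std::string DescribeInputs() {
  ++g_describe_calls;
  return "input 0: local, input 1: consistent";
}
```

With this macro, `SKETCH_CHECK(ok, &msg) << DescribeInputs();` never calls `DescribeInputs()` while `ok` is true, so the success path pays only for evaluating the condition.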

@clackhan clackhan requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 10, 2021 06:37
oneflow-ci-bot and others added 2 commits August 10, 2021 06:51
…add_eager_boxing_and_op_interpreter_dispatch_error_info

Conflicts:
	oneflow/core/framework/op_interpreter/boxing/eager_boxing_interpreter_mgr.cpp
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 136.4ms (= 6818.8ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 127.7ms (= 6385.4ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.07 (= 136.4ms / 127.7ms)

PyTorch resnet50 time: 83.9ms (= 4195.9ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.4ms (= 3720.3ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.13 (= 83.9ms / 74.4ms)

PyTorch resnet50 time: 56.7ms (= 2836.0ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.1ms (= 2407.2ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.18 (= 56.7ms / 48.1ms)

PyTorch resnet50 time: 47.7ms (= 2384.0ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 44.1ms (= 2203.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.08 (= 47.7ms / 44.1ms)

PyTorch resnet50 time: 43.5ms (= 2176.7ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 39.2ms (= 1959.0ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 43.5ms / 39.2ms)

@oneflow-ci-bot oneflow-ci-bot merged commit a9fbbbd into master Aug 13, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the add_eager_boxing_and_op_interpreter_dispatch_error_info branch August 13, 2021 02:50
3 participants