Quick fix Lazy nn.Graph input/output OpConf.BlobConf.is_dynamic #5767

Merged: 5 commits merged into master from dev_cc_input_is_dynamic on Aug 7, 2021

chengtbf (Contributor) commented Aug 6, 2021

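In single-device mode, a Lazy graph is also a Consistent graph, so marking its input and output with is_dynamic = true causes some checks in the system to fail. The root cause is that the is_dynamic flag is obsolete: a kernel does not need to read a flag on the blob to decide whether it should infer shape; it can decide based on parallel num > 1 instead. The is_dynamic flag will be removed in the future. For now, this is a quick fix so that the input and output of a job built by nn.Graph have is_dynamic set to False.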
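The idea can be illustrated with a minimal Python sketch. This is not OneFlow code: `BlobConf`, `make_graph_io_blob_conf`, `check_consistent_blob`, and `kernel_should_infer_shape` are hypothetical stand-ins, and the check shown is only an assumed example of the kind of check that rejects a dynamic blob in a Consistent graph.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class BlobConf:
    """Hypothetical stand-in for OpConf.BlobConf; only the fields used here."""
    shape: Tuple[int, ...]
    is_dynamic: bool = False


def make_graph_io_blob_conf(shape: Sequence[int]) -> BlobConf:
    # Quick fix from the description: input/output blobs of a job built by
    # nn.Graph are always marked is_dynamic = False.
    return BlobConf(shape=tuple(shape), is_dynamic=False)


def check_consistent_blob(conf: BlobConf) -> None:
    # Assumed example of a check that fails for a dynamic blob in a
    # Consistent graph; the real checks are not shown in this PR.
    if conf.is_dynamic:
        raise ValueError("is_dynamic must be False in a Consistent graph")


def kernel_should_infer_shape(parallel_num: int) -> bool:
    # Direction described in the PR: decide from parallel num > 1
    # rather than from a flag stored on the blob.
    return parallel_num > 1


conf = make_graph_io_blob_conf([16, 3, 224, 224])
check_consistent_blob(conf)  # passes because is_dynamic is False
print(conf, kernel_should_infer_shape(parallel_num=1))
```

Keeping the decision in a function of `parallel_num` means the blob configuration no longer has to carry state that can disagree with the graph mode, which is the motivation given for removing the flag later.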

@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 6, 2021 05:42
@oneflow-ci-bot oneflow-ci-bot self-requested a review August 6, 2021 08:25
github-actions bot (Contributor) commented Aug 6, 2021

CI failed, removing label automerge

@github-actions github-actions bot removed the automerge label Aug 6, 2021
@jackalcooper jackalcooper requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 6, 2021 10:52
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot August 6, 2021 14:06
github-actions bot (Contributor) commented Aug 6, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 140.2ms (= 7007.9ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 127.7ms (= 6387.3ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.10 (= 140.2ms / 127.7ms)

PyTorch resnet50 time: 83.8ms (= 4189.8ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 74.2ms (= 3708.3ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.13 (= 83.8ms / 74.2ms)

PyTorch resnet50 time: 57.3ms (= 2865.7ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.5ms (= 2426.4ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.18 (= 57.3ms / 48.5ms)

PyTorch resnet50 time: 48.1ms (= 2406.7ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 38.5ms (= 1923.3ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.25 (= 48.1ms / 38.5ms)

PyTorch resnet50 time: 45.9ms (= 2296.2ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 47.9ms (= 2395.0ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 0.96 (= 45.9ms / 47.9ms)
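
The per-iteration times and relative speeds in this report can be recomputed from the totals. The sketch below assumes the per-iteration times are rounded to one decimal place before the ratio is taken, as the printed formulas suggest; `relative_speed` and `RUNS` are hypothetical names, not part of the CI script.

```python
def relative_speed(pytorch_total_ms, oneflow_total_ms, iterations=50):
    """Return (PyTorch ms/iter, OneFlow ms/iter, PyTorch time / OneFlow time)."""
    pytorch_ms = round(pytorch_total_ms / iterations, 1)
    oneflow_ms = round(oneflow_total_ms / iterations, 1)
    # A ratio above 1 means OneFlow took less time per iteration.
    return pytorch_ms, oneflow_ms, round(pytorch_ms / oneflow_ms, 2)


# Batch size -> (PyTorch total ms, OneFlow total ms), copied from the report.
RUNS = {
    16: (7007.9, 6387.3),
    8: (4189.8, 3708.3),
    4: (2865.7, 2426.4),
    2: (2406.7, 1923.3),
    1: (2296.2, 2395.0),
}

for batch, (pytorch_total, oneflow_total) in RUNS.items():
    p, o, r = relative_speed(pytorch_total, oneflow_total)
    print(f"batch {batch}: PyTorch {p:.1f}ms, OneFlow {o:.1f}ms, relative speed {r:.2f}")
```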

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 6, 2021 19:00
@chengtbf chengtbf merged commit 0f75e01 into master Aug 7, 2021
@chengtbf chengtbf deleted the dev_cc_input_is_dynamic branch August 7, 2021 02:13