Decompose nd sbp boxing #5800

Merged
oneflow-ci-bot merged 133 commits into master from decompose_nd_sbp_boxing
Sep 2, 2021

Contributor

@lixinqi lixinqi commented Aug 9, 2021

Implements nd sbp boxing for symmetric placements.
Rough implementation approach:

  1. Decompose the nd sbp boxing into n 1d sbp boxings.
  2. Execute these n 1d sbp boxings in sequence.
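
The two steps above can be illustrated with a minimal Python sketch. It is not the code from this PR: the function name `decompose_nd_sbp_boxing`, the string labels such as `"S(0)"` and `"B"`, and the dimension-by-dimension ordering are all assumptions made for illustration. It also assumes the source and destination share the same placement hierarchy (the symmetric case) and ignores subtleties such as several dimensions splitting the same tensor axis.

```python
from typing import List, Tuple

# Hypothetical representation: one sbp label per hierarchy dimension,
# e.g. ("S(0)", "B") for a 2d sbp.
NdSbp = Tuple[str, ...]
Step = Tuple[int, NdSbp, NdSbp]


def decompose_nd_sbp_boxing(src: NdSbp, dst: NdSbp) -> List[Step]:
    """Split an nd sbp transition into steps that each change one dimension.

    Each returned step is (dim, nd_sbp_before, nd_sbp_after), where the two
    nd sbps differ only at index `dim`, so the step is a 1d sbp boxing.
    """
    if len(src) != len(dst):
        # Symmetric placement is assumed, so both sides have the same rank.
        raise ValueError("src and dst must have the same number of dimensions")

    steps: List[Step] = []
    current = list(src)
    for dim, target in enumerate(dst):
        if current[dim] == target:
            continue  # this dimension already matches; no boxing needed
        before = tuple(current)
        current[dim] = target
        steps.append((dim, before, tuple(current)))
    return steps


def describe(steps: List[Step]) -> List[str]:
    """Render the steps in execution order as human-readable strings."""
    return [f"dim {dim}: {before} -> {after}" for dim, before, after in steps]
```

For example, `decompose_nd_sbp_boxing(("S(0)", "B"), ("B", "S(1)"))` yields two steps: the first changes dimension 0 from `S(0)` to `B`, and the second changes dimension 1 from `B` to `S(1)`. Executing them in that order corresponds to step 2 of the description.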

@lixinqi
Contributor Author

lixinqi commented Aug 9, 2021

Depends on the branches broadcast_consistent_shape and decorator_4_disable_recursive_boxing_call.

@lixinqi lixinqi requested a review from guo-ran August 9, 2021 05:47
@github-actions
Contributor

github-actions bot commented Aug 9, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 139.5ms (= 6976.1ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.0ms (= 6300.1ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.11 (= 139.5ms / 126.0ms)

PyTorch resnet50 time: 84.6ms (= 4231.2ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 72.8ms (= 3641.8ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.16 (= 84.6ms / 72.8ms)

PyTorch resnet50 time: 57.6ms (= 2878.3ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 46.2ms (= 2311.9ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.24 (= 57.6ms / 46.2ms)

PyTorch resnet50 time: 47.8ms (= 2391.5ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 40.2ms (= 2007.9ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.19 (= 47.8ms / 40.2ms)

PyTorch resnet50 time: 41.8ms (= 2089.2ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 40.5ms (= 2023.7ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.03 (= 41.8ms / 40.5ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review August 9, 2021 07:39
@oneflow-ci-bot oneflow-ci-bot removed their request for review September 2, 2021 07:33
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 2, 2021 08:34
@liufengwei0103 liufengwei0103 removed the request for review from oneflow-ci-bot September 2, 2021 08:46
@github-actions
Contributor

github-actions bot commented Sep 2, 2021

CI failed, removing label automerge

@github-actions github-actions bot removed the automerge label Sep 2, 2021
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 2, 2021 11:17
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 2, 2021 12:35
@oneflow-ci-bot oneflow-ci-bot self-requested a review September 2, 2021 13:49
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 2, 2021 14:56
@github-actions
Contributor

github-actions bot commented Sep 2, 2021

Speed stats:
GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 125.9ms (= 6294.3ms / 50, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 136.5ms (= 6826.4ms / 50, input_shape=[16, 3, 224, 224])
Relative speed: 1.08 (= 136.5ms / 125.9ms)

OneFlow resnet50 time: 72.8ms (= 3638.0ms / 50, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.1ms (= 4105.2ms / 50, input_shape=[8, 3, 224, 224])
Relative speed: 1.13 (= 82.1ms / 72.8ms)

OneFlow resnet50 time: 47.4ms (= 2371.3ms / 50, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.9ms (= 3043.3ms / 50, input_shape=[4, 3, 224, 224])
Relative speed: 1.28 (= 60.9ms / 47.4ms)

OneFlow resnet50 time: 39.1ms (= 1955.4ms / 50, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 48.1ms (= 2405.2ms / 50, input_shape=[2, 3, 224, 224])
Relative speed: 1.23 (= 48.1ms / 39.1ms)

OneFlow resnet50 time: 43.5ms (= 2174.2ms / 50, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 43.5ms (= 2176.6ms / 50, input_shape=[1, 3, 224, 224])
Relative speed: 1.00 (= 43.5ms / 43.5ms)

OneFlow resnet50 time: 141.5ms (= 7075.0ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.0ms (= 7949.6ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
Relative speed: 1.12 (= 159.0ms / 141.5ms)

OneFlow resnet50 time: 91.8ms (= 4589.0ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.6ms (= 5229.6ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
Relative speed: 1.14 (= 104.6ms / 91.8ms)

OneFlow resnet50 time: 69.5ms (= 3473.3ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.2ms (= 3760.5ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
Relative speed: 1.08 (= 75.2ms / 69.5ms)

OneFlow resnet50 time: 67.1ms (= 3355.9ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 65.0ms (= 3250.9ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
Relative speed: 0.97 (= 65.0ms / 67.1ms)

OneFlow resnet50 time: 59.2ms (= 2959.3ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 63.4ms (= 3167.9ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
Relative speed: 1.07 (= 63.4ms / 59.2ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review September 2, 2021 16:37
@oneflow-ci-bot oneflow-ci-bot merged commit 9c464a3 into master Sep 2, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the decompose_nd_sbp_boxing branch September 2, 2021 16:37

8 participants