Fix masked select broadcast bug #7984

Merged
merged 37 commits into master from fix_masked_select_broadcast_bug
Apr 13, 2022

@BBuf BBuf commented Apr 8, 2022

Fixes an issue where masked_select does not support certain broadcast cases in eager mode.

close #7983

@@ -42,9 +42,6 @@ def masked_select_op(input, mask):
tensor([0.3139, 0.3898], dtype=oneflow.float32)
"""

assert len(input.shape) == len(
Review comment (Contributor Author):

This check is incorrect. In the example provided below, the mask has shape (3, 3), yet this case can still be broadcast. The broadcast check logic already exists in the mul op, so this check is both wrong and unnecessary.
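To illustrate why a rank-equality check rejects valid inputs, here is a minimal pure-Python sketch of NumPy-style broadcasting rules. It assumes the removed assertion compared the number of dimensions of `input` and `mask` (the diff above is truncated, so that is a guess). The helper `broadcast_shape` and the input shape `(2, 3, 3)` are hypothetical and not taken from OneFlow or this PR; only the mask shape `(3, 3)` comes from the comment.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Hypothetical helper for illustration; this is not OneFlow's implementation.
    """
    result = []
    # Align the shapes from the trailing dimension; a missing leading
    # dimension is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
        # A size-1 dimension stretches to match the other side.
        result.append(b if a == 1 else a)
    return tuple(reversed(result))


input_shape = (2, 3, 3)  # assumed input shape, not from the PR
mask_shape = (3, 3)      # mask shape mentioned in the review comment

# A rank-equality check would reject this pair...
same_rank = len(input_shape) == len(mask_shape)

# ...even though the two shapes broadcast without error.
out_shape = broadcast_shape(input_shape, mask_shape)

print(same_rank, out_shape)
```

Under these assumptions the ranks differ while the shapes remain broadcast-compatible, which is the situation the comment describes. Genuinely incompatible pairs such as `(2, 3)` and `(3, 3)` still raise, so leaving validation to the broadcasting step loses no safety.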

@github-actions

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/7984/

@github-actions

Speed stats:
GPU Name: GeForce GTX 1080 

✔️ OneFlow resnet50 time: 128.8ms (= 12882.9ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 141.2ms (= 14120.7ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.10 (= 141.2ms / 128.8ms)

OneFlow resnet50 time: 79.4ms (= 7942.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.3ms (= 8327.5ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.05 (= 83.3ms / 79.4ms)

OneFlow resnet50 time: 52.7ms (= 10537.4ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 62.0ms (= 12392.0ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.18 (= 62.0ms / 52.7ms)

OneFlow resnet50 time: 42.7ms (= 8546.7ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.7ms (= 9142.3ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.07 (= 45.7ms / 42.7ms)

OneFlow resnet50 time: 35.4ms (= 7071.9ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 39.6ms (= 7915.8ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.12 (= 39.6ms / 35.4ms)

OneFlow swin dataloader time: 0.270s (= 54.095s / 200, num_workers=1)
PyTorch swin dataloader time: 0.249s (= 49.755s / 200, num_workers=1)
✔️ Relative speed: 0.920 (= 0.249s / 0.270s)

OneFlow swin dataloader time: 0.067s (= 13.476s / 200, num_workers=4)
PyTorch swin dataloader time: 0.067s (= 13.489s / 200, num_workers=4)
✔️ Relative speed: 1.001 (= 0.067s / 0.067s)

OneFlow swin dataloader time: 0.037s (= 7.411s / 200, num_workers=8)
PyTorch swin dataloader time: 0.036s (= 7.281s / 200, num_workers=8)
✔️ Relative speed: 0.983 (= 0.036s / 0.037s)

✔️ OneFlow resnet50 time: 136.0ms (= 13597.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.7ms (= 15974.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 159.7ms / 136.0ms)

OneFlow resnet50 time: 87.2ms (= 8717.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 98.9ms (= 9890.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 98.9ms / 87.2ms)

OneFlow resnet50 time: 60.4ms (= 12082.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.3ms (= 15062.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.25 (= 75.3ms / 60.4ms)

OneFlow resnet50 time: 53.1ms (= 10617.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.2ms (= 13438.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 67.2ms / 53.1ms)

OneFlow resnet50 time: 49.9ms (= 9981.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.2ms (= 13641.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 68.2ms / 49.9ms)

@BBuf BBuf merged commit 9e11124 into master Apr 13, 2022
@BBuf BBuf deleted the fix_masked_select_broadcast_bug branch April 13, 2022 06:58
Successfully merging this pull request may close these issues.

masked_select code is not aligned