Rewrite activation function #5465

Merged
merged 110 commits into from
Jul 29, 2021


@MARD1NO MARD1NO commented Jul 12, 2021

Adds the silu (swish), mish, softsign, and selu activation functions.

Further aligns with the torch implementation, and removes the torch.mish and tensor.mish methods.
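For readers unfamiliar with the four functions, here is a minimal scalar sketch of their usual textbook definitions in plain Python. It is not the OneFlow kernel code from this PR; the helper names and the SELU constants (the commonly published values) are assumptions made for illustration.

```python
import math

# SELU constants as commonly published; assumed here, not read from this PR.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def sigmoid(x: float) -> float:
    """Logistic function, branching on sign so exp() never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softplus(x: float) -> float:
    """log(1 + exp(x)), rearranged so exp() only sees non-positive arguments."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def silu(x: float) -> float:
    """SiLU / swish: x * sigmoid(x)."""
    return x * sigmoid(x)


def mish(x: float) -> float:
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def softsign(x: float) -> float:
    """Softsign: x / (1 + |x|)."""
    return x / (1.0 + abs(x))


def selu(x: float) -> float:
    """SELU: scale * x for x > 0, else scale * alpha * (exp(x) - 1)."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * math.expm1(x))


if __name__ == "__main__":
    for v in (-2.0, 0.0, 1.0):
        print(v, silu(v), mish(v), softsign(v), selu(v))
```

A scalar reference like this is mainly useful as an oracle when comparing a fused kernel's output against the composed formula.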

doctest

image

Brute-force unit tests

selu

image

silu

image

mish profile

New implementation (in static graph mode):

I0713 13:36:23.541103 3522957 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: Mish_1 elapsed(ms): 0.171648 memory_size(Byte): 50331648 bandwidth(GB/s): 273.088

Original implementation, composed from existing ops:

x * flow.math.tanh(
    flow.math.softplus(x, "softplus"), name="tanh"
)


I0713 13:40:26.638630 3533810 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: softplus elapsed(ms): 0.179008 memory_size(Byte): 50331648 bandwidth(GB/s): 261.86
I0713 13:40:26.638695 3533810 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: tanh elapsed(ms): 0.137824 memory_size(Byte): 50331648 bandwidth(GB/s): 340.108
I0713 13:40:26.638793 3533810 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: ElementWiseMul_1 elapsed(ms): 0.19056 memory_size(Byte): 75497472 bandwidth(GB/s): 368.978

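By comparison, the new kernel is about 2.95x faster: the three original kernels take 0.507392 ms in total, versus 0.171648 ms for the fused Mish kernel.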
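The speedup figure can be reproduced from the profiler lines above. The sketch below parses the relevant fields with a regular expression and compares the fused kernel with the sum of the three composed kernels. The log lines are copied from this comment with the timestamp prefix trimmed; the `parse` helper and variable names are hypothetical.

```python
import re

# Profiler output from this comment, trimmed to the fields used here.
FUSED = "op_name: Mish_1 elapsed(ms): 0.171648 memory_size(Byte): 50331648 bandwidth(GB/s): 273.088"
COMPOSED = [
    "op_name: softplus elapsed(ms): 0.179008 memory_size(Byte): 50331648 bandwidth(GB/s): 261.86",
    "op_name: tanh elapsed(ms): 0.137824 memory_size(Byte): 50331648 bandwidth(GB/s): 340.108",
    "op_name: ElementWiseMul_1 elapsed(ms): 0.19056 memory_size(Byte): 75497472 bandwidth(GB/s): 368.978",
]

PATTERN = re.compile(
    r"op_name: (\S+) elapsed\(ms\): ([\d.]+) memory_size\(Byte\): (\d+)"
)


def parse(line: str) -> tuple[str, float, int]:
    """Return (op_name, elapsed_ms, memory_bytes) for one profiler line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a profiler line: {line!r}")
    return match.group(1), float(match.group(2)), int(match.group(3))


_, fused_ms, fused_bytes = parse(FUSED)
composed = [parse(line) for line in COMPOSED]
composed_ms = sum(ms for _, ms, _ in composed)
composed_bytes = sum(nbytes for _, _, nbytes in composed)

speedup = composed_ms / fused_ms              # time ratio, composed vs. fused
traffic_ratio = composed_bytes / fused_bytes  # memory traffic ratio

if __name__ == "__main__":
    print(f"composed: {composed_ms:.6f} ms, fused: {fused_ms:.6f} ms")
    print(f"speedup: {speedup:.3f}x, memory traffic ratio: {traffic_ratio:.2f}x")
```

The time ratio comes out just under 2.96, in line with the figure quoted above. The composed version also moves 3.5x as many bytes, which is the likely reason a single fused elementwise kernel wins on a bandwidth-bound operation.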

API documentation

image
image
image
image

silu profile

New implementation (in static graph mode):

I0713 13:53:56.168175 3581556 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: Silu_1 elapsed(ms): 0.161728 memory_size(Byte): 50331648 bandwidth(GB/s): 289.839

Original implementation, composed from existing ops:

    return x * flow.math.sigmoid(x)



I0713 14:01:01.258198 3602584 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: Sigmoid_1 elapsed(ms): 0.44592 memory_size(Byte): 50331648 bandwidth(GB/s): 105.12
I0713 14:01:01.258385 3602584 kernel.cpp:94] PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: ElementWiseMul_2 elapsed(ms): 0.19056 memory_size(Byte): 75497472 bandwidth(GB/s): 368.978
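No speedup figure is given for silu, but it can be derived from the lines above in the same way. The sketch below also checks how the reported bandwidth relates to the other two fields: the numbers match bytes divided by 2^30 divided by elapsed seconds, so the "GB/s" column appears to be GiB/s. That reading is an inference from the numbers, not something the profiler output states, and the function name is hypothetical.

```python
GIB = 2 ** 30  # assumption: the profiler's "GB" is 2**30 bytes

# (elapsed_ms, memory_bytes, reported_bandwidth) copied from the log lines above.
KERNELS = {
    "Silu_1": (0.161728, 50331648, 289.839),
    "Sigmoid_1": (0.44592, 50331648, 105.12),
    "ElementWiseMul_2": (0.19056, 75497472, 368.978),
}


def bandwidth_gib_per_s(memory_bytes: int, elapsed_ms: float) -> float:
    """Bytes moved per second, expressed in GiB/s."""
    return memory_bytes / GIB / (elapsed_ms / 1000.0)


fused_ms = KERNELS["Silu_1"][0]
composed_ms = KERNELS["Sigmoid_1"][0] + KERNELS["ElementWiseMul_2"][0]
silu_speedup = composed_ms / fused_ms

if __name__ == "__main__":
    for name, (ms, nbytes, reported) in KERNELS.items():
        computed = bandwidth_gib_per_s(nbytes, ms)
        print(f"{name}: computed {computed:.3f}, reported {reported}")
    print(f"silu speedup: {silu_speedup:.2f}x")
```

Under these numbers the fused Silu kernel is a little under 4x faster than sigmoid followed by an elementwise multiply, mostly because the standalone sigmoid kernel reaches far lower bandwidth than the other kernels.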

autotest

image

@MARD1NO MARD1NO requested review from BBuf and simonJJJ July 13, 2021 08:08
@@ -0,0 +1,54 @@
/*
Copyright 2020 The OneFlow Authors. All rights reserved.
Contributor

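Please put all the activation-function backward passes in a single file, for example Activation.cpp; otherwise there are simply too many files. The backward code in this PR is also nearly identical across functions, so consider refactoring it: write an Activation base class and, while you are at it, rebuild all the existing activation functions on top of that class. This is not urgent and can be recorded as a TODO.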

Contributor Author

OK, I'll do this in the next PR.

@MARD1NO MARD1NO requested a review from BBuf July 14, 2021 03:10
@github-actions
Contributor

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 03:02
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 29, 2021 04:56
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 29, 2021 06:07
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 29, 2021 07:14
@github-actions
Contributor

CI failed, removing label automerge

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 09:17
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot July 29, 2021 10:12
@github-actions
Contributor

Speed stats:
GPU Name: GeForce GTX 1080 

PyTorch resnet50 time: 142.3ms (= 7116.0ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 126.5ms (= 6326.2ms / 50, input_shape=[16, 3, 224, 224], backward is enabled)
Relative speed: 1.12 (= 142.3ms / 126.5ms)

PyTorch resnet50 time: 84.7ms (= 4235.2ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 73.7ms (= 3687.4ms / 50, input_shape=[8, 3, 224, 224], backward is enabled)
Relative speed: 1.15 (= 84.7ms / 73.7ms)

PyTorch resnet50 time: 57.9ms (= 2894.4ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 49.0ms (= 2448.3ms / 50, input_shape=[4, 3, 224, 224], backward is enabled)
Relative speed: 1.18 (= 57.9ms / 49.0ms)

PyTorch resnet50 time: 49.2ms (= 2458.9ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 48.5ms (= 2423.2ms / 50, input_shape=[2, 3, 224, 224], backward is enabled)
Relative speed: 1.01 (= 49.2ms / 48.5ms)

PyTorch resnet50 time: 42.9ms (= 2145.3ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
OneFlow resnet50 time: 41.9ms (= 2097.2ms / 50, input_shape=[1, 3, 224, 224], backward is enabled)
Relative speed: 1.02 (= 42.9ms / 41.9ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review July 29, 2021 12:21
@oneflow-ci-bot oneflow-ci-bot merged commit 32ba800 into master Jul 29, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the rewrite_activation_function branch July 29, 2021 12:23
6 participants