Optimize ComputeEntropyGpu with CUB #3930

Merged 30 commits on Dec 17, 2020

WangTuoxyty commented Nov 27, 2020

No description provided.


WangTuoxyty commented Nov 30, 2020

### Performance test

crossEntropyGpu kernel timing test:

| Data type | Before optimization [us] | After optimization [us] | Speedup |
| --------- | ------------------------ | ----------------------- | ------- |
| float | 86.3328 | 3.8944 | 22.17x |
| double | 92.2208 | 4.8736 | 18.92x |
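
For reference, the speedup column is the before-optimization time divided by the after-optimization time. A minimal sketch to recompute it from the timings in the table; the names `timings_us` and `speedup` are illustrative and do not come from the PR:

```python
# Kernel timings in microseconds, copied from the table: (before, after).
timings_us = {
    "float": (86.3328, 3.8944),
    "double": (92.2208, 4.8736),
}


def speedup(before_us, after_us):
    """Return how many times faster the optimized kernel ran."""
    return before_us / after_us


# Recompute the speedup for every data type in the table.
speedups = {
    dtype: speedup(before, after)
    for dtype, (before, after) in timings_us.items()
}

for dtype, factor in speedups.items():
    print(f"{dtype}: {factor:.2f}x")
# prints:
# float: 22.17x
# double: 18.92x
```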

liujuncheng merged commit 23c2924 into master Dec 17, 2020
liujuncheng deleted the wangtuo/computeEntropyGpu branch December 17, 2020 13:11
liujuncheng pushed a commit that referenced this pull request Jun 3, 2021
* optimize rmsprop optimizer.

* lars optimizer

* bug fix

* reimplement the function of rmsprop.

* delete folder CMakeFiles.

* rewrite computeCrossEntropyGpu with cub.

* fix the precision problem of crossEntropy fp16.

* fix the unit test script.

* adjust format of code

* fix bug according to review advices.

* adjust format of code

* adjust the code of unit test script.

* fix the format of unit test script.

* adjust the format for CI test.

Co-authored-by: guo ran <360112263@qq.com>
Former-commit-id: 23c2924