@panyx0718 panyx0718 commented Mar 25, 2018

layer_norm forward and backward are sped up by roughly 3x to 4x overall.
Transformer step time on a single device drops from 0.157 to 0.125.

The pre-commit hook also automatically reformatted some code.
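
For readers unfamiliar with the operator being optimized, here is a minimal CPU reference sketch of a layer normalization forward pass. It is not the code from this PR: the function name `LayerNormForwardRef`, the row-major `left` x `right` layout, and the per-column `scale` and `bias` vectors are assumptions made for illustration. It shows that each row needs its own mean and variance, which is why fast row-wise reductions matter for this op.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (not Paddle's kernel).
// x is treated as a row-major matrix with `left` rows and `right` columns;
// every row is normalized independently, then scaled and shifted per column.
std::vector<float> LayerNormForwardRef(const std::vector<float>& x,
                                       const std::vector<float>& scale,
                                       const std::vector<float>& bias,
                                       std::size_t left, std::size_t right,
                                       float epsilon) {
  assert(right > 0 && x.size() == left * right);
  assert(scale.size() == right && bias.size() == right);

  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < left; ++i) {
    const float* row = x.data() + i * right;

    // Row-wise mean.
    float mean = 0.0f;
    for (std::size_t j = 0; j < right; ++j) mean += row[j];
    mean /= static_cast<float>(right);

    // Row-wise (biased) variance.
    float var = 0.0f;
    for (std::size_t j = 0; j < right; ++j) {
      const float d = row[j] - mean;
      var += d * d;
    }
    var /= static_cast<float>(right);

    // Normalize, then apply the per-column scale and bias.
    const float inv_std = 1.0f / std::sqrt(var + epsilon);
    for (std::size_t j = 0; j < right; ++j) {
      y[i * right + j] = (row[j] - mean) * inv_std * scale[j] + bias[j];
    }
  }
  return y;
}
```

The backward pass needs further reductions over the same matrix, so the cost of the reduction primitives shows up in both directions.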

@panyx0718 panyx0718 requested a review from chengduoZH March 25, 2018 10:28
@chengduoZH chengduoZH left a comment

That is great!!
The functions in Eigen are too general-purpose and are very slow in many places.


#ifdef PADDLE_WITH_CUDA
template <typename T>
class RowwiseMean2D<platform::CUDADeviceContext, T> {

I think it might be better to write this function in math_function.

template <typename T>
class ColwiseSum2D<platform::CUDADeviceContext, T> {
 public:
  ColwiseSum2D(int left, int right, const platform::DeviceContext& dev_ctx)

ColwiseSum is also used in lstm_op, gru_op, sequence_expand_op, and lstmp_op, so the performance of those ops might be improved too.
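
To make the two reductions under discussion concrete, here is a minimal CPU reference sketch of what a row-wise mean and a column-wise sum compute. It is not the CUDA code under review: the names `RowwiseMeanRef` and `ColwiseSumRef` are hypothetical, and reading `left` as the row count and `right` as the column count of a row-major matrix is an assumption based on the constructor signature above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU references; x is assumed to be a row-major matrix
// with `left` rows and `right` columns.

// Returns one value per row: the mean of that row's `right` elements.
std::vector<float> RowwiseMeanRef(const std::vector<float>& x,
                                  std::size_t left, std::size_t right) {
  assert(right > 0 && x.size() == left * right);
  std::vector<float> out(left, 0.0f);
  for (std::size_t i = 0; i < left; ++i) {
    float sum = 0.0f;
    for (std::size_t j = 0; j < right; ++j) sum += x[i * right + j];
    out[i] = sum / static_cast<float>(right);
  }
  return out;
}

// Returns one value per column: the sum of that column's `left` elements.
std::vector<float> ColwiseSumRef(const std::vector<float>& x,
                                 std::size_t left, std::size_t right) {
  assert(x.size() == left * right);
  std::vector<float> out(right, 0.0f);
  for (std::size_t i = 0; i < left; ++i) {
    for (std::size_t j = 0; j < right; ++j) out[j] += x[i * right + j];
  }
  return out;
}
```

Both reductions can also be written as a matrix-vector product with a constant vector, which is one common way to hand them to a BLAS routine on the GPU instead of a general-purpose reduction.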

@chengduoZH chengduoZH left a comment

This PR can be merged first; the comments can be addressed in the next PR.

@panyx0718 panyx0718 merged commit 3941c2d into PaddlePaddle:develop Mar 26, 2018
blacksheep-Aristotle pushed a commit to blacksheep-Aristotle/Paddle that referenced this pull request Nov 22, 2024
* Fix exitcode bug

* Fix `track_case_status` func match bug

* Fix return code

* Fix print_info func with exit -6

* set output format of fail tests
modify verification check failed