Initial implementation of multiplication operator for MKLDNN #9949
Conversation
Force-pushed: 97843f1 → f24fec2 → b768319 → 6e7b883
luotao1 left a comment:
The implementation of mul_mkldnn_op is similar to fc_mkldnn_op, and we can set bias_attr=None in fc_mkldnn_op to obtain mul_mkldnn_op. So why does mul_mkldnn_op need to be implemented again?
| "bias_attr": bias_attr | ||
| "x_num_col_dims": num_flatten_dims, | ||
| "y_num_col_dims": 1, | ||
| "use_mkldnn": use_mkldnn |
Why use mul_mkldnn_op to replace fc_mkldnn_op here? To my mind, the performance of fc_mkldnn_op is better than mul_mkldnn_op + elementwise_add_op in fluid.
Yes, I must agree with you, but the MKLDNN fully connected layer has a small problem with matrix inputs: when the layer receives more than one input, the MKLDNN operator does not work correctly. We therefore thought it a good idea to remove this pitfall and withdraw the fully connected operator. Moreover, this simple solution gives us the opportunity to run the new neural network models.
Do you mean that you will not use fc_mkldnn_op in fluid? Maybe you can add a comment in nn.py.
With a heavy heart, I must say yes: this operator is removed, but only for a while. I expect that in the future we will come back to the fully connected layer for MKLDNN.
    auto desc = tf_desc<Type>(args...);
    return tf_pd<Type>(desc, e, p);
    }
Are lines 36-61 only used in mul_mkldnn_op? If these lines are not used in other ops, they should be moved inside mul_mkldnn_op.
At this moment, yes, but this code may later be used by all the MKLDNN operators.
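To illustrate the point, here is a hedged sketch (MKLDNN v0.x API; the helper name MkldnnFwdPrimitiveDesc and its exact signature are assumptions, not the PR's verbatim code) of how an alias-based helper like the one above instantiates for any primitive that exposes nested ::desc and ::primitive_desc types, which is what would let other MKLDNN operators reuse it:

    #include <mkldnn.hpp>

    template <class Type>
    using tf_desc = typename Type::desc;

    template <class Type>
    using tf_pd = typename Type::primitive_desc;

    // Build a forward primitive_desc in one call. Backward variants would
    // additionally pass the forward primitive_desc as the hint `p`, as in
    // the tf_pd<Type>(desc, e, p) call shown above.
    template <class Type, typename Engine, typename... Args>
    tf_pd<Type> MkldnnFwdPrimitiveDesc(const Engine& e, Args&&... args) {
      auto desc = tf_desc<Type>(args...);
      return tf_pd<Type>(desc, e);
    }

    // For example (src_md, weights_md, dst_md are assumed memory descriptors):
    //   auto pd = MkldnnFwdPrimitiveDesc<mkldnn::inner_product_forward>(
    //       engine, mkldnn::prop_kind::forward, src_md, weights_md, dst_md);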
@tensor-tang Do you have any other comments about this PR?
    template <class Type>
    using tf_desc = typename Type::desc;

    template <class Type>
    using tf_pd = typename Type::primitive_desc;
A small question: why is it named tf_pd here?
tf = type_function.
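Put differently, the aliases just give short names to each primitive's nested types. A small, hedged check (MKLDNN v0.x API, assuming the tf_pd alias defined above is in scope) of what the alias expands to for a concrete primitive:

    #include <type_traits>
    #include <mkldnn.hpp>

    // tf_pd<T> is nothing more than T::primitive_desc under a shorter name.
    static_assert(
        std::is_same<tf_pd<mkldnn::inner_product_forward>,
                     mkldnn::inner_product_forward::primitive_desc>::value,
        "tf_pd expands to the nested primitive_desc type");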
luotao1 left a comment:
LGTM! Thanks very much!