Optimize conv performance #3477

Merged
merged 2 commits into from
Aug 13, 2020

Conversation

liujuncheng
Collaborator

Optimize Conv FP16 performance

  • Make pseudo_half the default instead of true_half
  • Decide whether to enable CUDNN_TENSOR_OP_MATH from the dtype of the input and output, rather than from compute_type
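
To illustrate the first point, here is a minimal sketch of how a helper such as GetConvDescDataType (its name appears in the diff, but its body is not shown in this conversation) might pick the compute type for the convolution descriptor. The enum and the function name ConvComputeType are hypothetical stand-ins, not OneFlow's or cuDNN's actual definitions.

```cpp
// Hypothetical stand-in for a framework data-type enum (assumption).
enum class DataType { kFloat, kFloat16 };

// Assumed behavior: with pseudo_half, half tensors are convolved with a
// float compute type; with true_half, the compute type stays half.
// Non-half tensors keep their own type either way.
DataType ConvComputeType(DataType tensor_type, bool enable_pseudo_half) {
  if (tensor_type == DataType::kFloat16 && enable_pseudo_half) {
    return DataType::kFloat;
  }
  return tensor_type;
}
```

Under this assumption, making pseudo_half the default means an FP16 model gets a float compute type unless true_half is requested explicitly.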

Tested ResNet50 on a 2080 Ti, single GPU with batch_size 64. Throughput is as follows:

Configuration   Before   After
FP32+NCHW       284      280
FP16+NCHW       443      505
FP16+NHWC       336      615
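
For reference, the relative change implied by these figures can be worked out with a small helper. This is only arithmetic on the numbers reported above; the names PercentChange and PrintThroughputReport are made up for the example.

```cpp
#include <cstdio>

// Relative throughput change in percent; positive means faster afterwards.
constexpr double PercentChange(double before, double after) {
  return (after - before) / before * 100.0;
}

// Prints one line per configuration using the figures from the table.
void PrintThroughputReport() {
  struct Row { const char* name; double before; double after; };
  const Row rows[] = {
      {"FP32+NCHW", 284.0, 280.0},
      {"FP16+NCHW", 443.0, 505.0},
      {"FP16+NHWC", 336.0, 615.0},
  };
  for (const Row& r : rows) {
    std::printf("%s: %+.1f%%\n", r.name, PercentChange(r.before, r.after));
  }
}
```

This works out to roughly -1.4% for FP32+NCHW, +14.0% for FP16+NCHW, and +83.0% for FP16+NHWC.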

@leaves-zwx
Contributor

Does the improvement here come from the conv2d computation switching from true_half to pseudo_half?

CudnnConvDesc(const DataType& data_type, const ShapeView& in_blob_shape,
const user_op::UserOpConfWrapper& conv_conf);
CudnnConvDesc(const DataType compute_type, const DataType data_type,
const ShapeView& in_blob_shape, const PbMessage& conv_conf);
Contributor

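This constructor, which takes a const PbMessage& conv_conf parameter, is no longer used anywhere now that the old conv op code has been deleted. Can it be removed?
It could also be removed in a separate PR.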

@@ -200,7 +201,7 @@ CudnnConvArgs::CudnnConvArgs(const PbMessage& conv_conf, DataType x_data_type,
       : xdesc(x_data_type, x_shape, data_format),
         ydesc(y_data_type, y_shape, data_format),
         wdesc(w_data_type, w_shape, data_format),
-        cdesc(GetConvDescDataType(x_data_type, enable_pseudo_half), x_shape, conv_conf),
+        cdesc(GetConvDescDataType(x_data_type, enable_pseudo_half), x_data_type, x_shape, conv_conf),
Contributor

As with CudnnConvDesc, the CudnnConvArgs constructor here that takes const PbMessage& conv_conf looks like it can be removed as well.

Collaborator Author

> As with CudnnConvDesc, the CudnnConvArgs constructor here that takes const PbMessage& conv_conf looks like it can be removed as well.

This PR is a fix, so let's not mix it with refactoring.

Contributor

OK. Once this PR is merged, I'll open a new PR to remove the obsolete code.

@liujuncheng
Collaborator Author

> Does the improvement here come from the conv2d computation switching from true_half to pseudo_half?

It comes from both of the points listed.

@leaves-zwx
Contributor

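> > Does the improvement here come from the conv2d computation switching from true_half to pseudo_half?
>
> It comes from both of the points listed.

I think I understand now: the previous usage was wrong. The cuDNN conv math type should be set according to the data_type of the input and output, not according to the data_type set inside the cudnnConvolutionDescriptor_t structure.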

  if (GetCudnnDataType(data_type) == CUDNN_DATA_HALF) {
    OF_CUDNN_CHECK(cudnnSetConvolutionMathType(val_, CUDNN_TENSOR_OP_MATH));
  }

So in both the fp16 + nchw and fp16 + nhwc scenarios, is pseudo_half faster than true_half? And in that case, what would the use case for true_half be?
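
The difference between the two rules described above can be sketched as a pair of predicates. This is a simplified model with hypothetical types and names, not the actual OneFlow code. It only shows which configurations would request CUDNN_TENSOR_OP_MATH under each rule and makes no claim about the resulting speed.

```cpp
// Hypothetical stand-ins (assumptions), independent of cuDNN headers.
enum class DataType { kFloat, kFloat16 };

struct ConvSetup {
  DataType tensor_type;   // dtype of the input and output tensors
  DataType compute_type;  // dtype stored in the convolution descriptor
};

// Previous rule as described in this PR: keyed on the compute type.
bool TensorOpByComputeType(const ConvSetup& s) {
  return s.compute_type == DataType::kFloat16;
}

// Rule after this PR: keyed on the input and output tensor dtype.
bool TensorOpByTensorType(const ConvSetup& s) {
  return s.tensor_type == DataType::kFloat16;
}
```

In this model, a pseudo_half setup (half tensors, float compute type) requests tensor op math only under the second rule, while a true_half setup requests it under both and an FP32 setup under neither.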

@jackalcooper jackalcooper added this to the 0.1.9 milestone Aug 13, 2020
@liujuncheng liujuncheng merged commit 5fc44b2 into master Aug 13, 2020
@liujuncheng liujuncheng deleted the dev_optimize_half_conv branch August 13, 2020 10:13
4 participants