DNN: Further optimization of Conv2D #22401

Merged (1 commit) on Aug 26, 2022

@zihaomu zihaomu commented Aug 19, 2022

Optimization points in this PR:

  1. Bring in and adapt the latest Ficus OpConv.fx code.
  2. Fuse Conv+Add+Activation. (Currently only for Conv2D. Once FastConv supports Conv1D and Conv3D, the fusion will be extended to both.)
  3. The FastConv branch still reuses weightsMat; fastWeights is removed.
  4. Optimize Winograd_F63: pack tile 12 and adjust the data pipeline. (About 1 ms speedup on the M1 chip; test model: ResNet50.)
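To illustrate what fusing Conv+Add+Activation (point 2) buys, here is a minimal pure-Python sketch. It is not the OpenCV implementation: the function names, the single-channel layout, the stride-1 "valid" convolution, and the choice of ReLU as the activation are all assumptions made for illustration. The unfused version makes three passes and materializes two intermediate tensors. The fused version applies the residual add and the activation while the accumulator is still in hand, so each output element is written once.

```python
def conv2d(x, w):
    """Stride-1, no-padding 2D cross-correlation of a single channel."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * w[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def unfused_conv_add_relu(x, w, residual):
    """Three separate passes: Conv, then Add, then ReLU."""
    y = conv2d(x, w)                                   # intermediate tensor 1
    y = [[v + r for v, r in zip(yrow, rrow)]           # intermediate tensor 2
         for yrow, rrow in zip(y, residual)]
    return [[max(v, 0.0) for v in row] for row in y]   # final output


def fused_conv_add_relu(x, w, residual):
    """One pass: add the residual and apply ReLU before storing each output."""
    kh, kw = len(w), len(w[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    out = []
    for i in range(oh):
        row = []
        for j in range(ow):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    acc += x[i + a][j + b] * w[a][b]
            acc += residual[i][j]        # fused Add
            row.append(max(acc, 0.0))    # fused activation (ReLU assumed)
        out.append(row)
    return out


# Hypothetical example data, chosen so the arithmetic is exact.
x = [[1, 2, 3, 0],
     [0, 1, 2, 3],
     [3, 0, 1, 2],
     [2, 3, 0, 1]]
w = [[1, 0, -1],
     [1, 0, -1],
     [1, 0, -1]]
residual = [[3, 1],
            [-1, 4]]

assert fused_conv_add_relu(x, w, residual) == unfused_conv_add_relu(x, w, residual)
print(fused_conv_add_relu(x, w, residual))
```

Both versions compute the same values. The benefit of fusion in a real kernel is fewer passes over memory, which this sketch shows structurally but does not measure.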

TODO List:

  • Find out why MobileNetv2 is slow. It is caused by the difference between OpenMP and GCD.

Performance Test on ARM (Apple M1 chip, 4 threads)

| Model Name | Without Patch | With Patch |
| --- | --- | --- |
| ResNet50 | 26.8 ms | 24.6 ms |
| MobileNetv2 | 5.43 ms | GCD: 5.5 ms, OpenMP: 4.7 ms |

Performance Test on ARM (Raspberry Pi 4, A72, 4 threads)

| Model Name | Without Patch | With Patch | NCNN's Benchmark |
| --- | --- | --- | --- |
| ResNet50 | 440.90 ms | 400.6 ms (10% faster) | 330 ms |
| MobileNetv2 | 51.64 ms | 51.2 ms | 71 ms |
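For readers comparing the numbers, the short sketch below derives speedup figures from the timings reported in the two tables. The helper names are hypothetical, and only timings quoted in this PR are used. It shows one plausible reading of the "10% faster" annotation: a throughput ratio of about 1.10x, which corresponds to roughly 9% less wall-clock time.

```python
def speedup(before_ms, after_ms):
    """Throughput ratio: how many times faster the patched build runs."""
    return before_ms / after_ms


def time_reduction_pct(before_ms, after_ms):
    """Percentage of wall-clock time saved by the patch."""
    return 100.0 * (before_ms - after_ms) / before_ms


# (without patch, with patch) in milliseconds, copied from the tables above.
results = {
    "ResNet50, Apple M1": (26.8, 24.6),
    "ResNet50, Raspberry Pi 4": (440.90, 400.6),
    "MobileNetv2, Raspberry Pi 4": (51.64, 51.2),
}

for name, (before, after) in results.items():
    print(f"{name}: {speedup(before, after):.3f}x, "
          f"{time_reduction_pct(before, after):.1f}% less time")
```

The MobileNetv2 row for the M1 is left out because the patched result depends on the threading backend (GCD or OpenMP).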

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@zihaomu zihaomu requested a review from vpisarev August 19, 2022 03:54
@zihaomu zihaomu changed the title DNN: Further optimization of Conv2D, fused Conv_Add_Activation. DNN: Further optimization of Conv2D Aug 19, 2022
@zihaomu zihaomu marked this pull request as ready for review August 23, 2022 11:09
@vpisarev vpisarev merged commit bb64db9 into opencv:4.x Aug 26, 2022
@asenyaev (Contributor) commented

It seems that after merging this PR, a build with coverage started to fail (log).

@zihaomu (Member, Author) commented Aug 29, 2022

> It seems that after merging this PR, a build with coverage started to fail (log).

Thanks, @asenyaev. I will submit a PR to fix it.

@asenyaev (Contributor) commented

@zihaomu, thank you!

@zihaomu zihaomu mentioned this pull request Aug 29, 2022
@alalek alalek mentioned this pull request Jan 8, 2023
@asmorkalov asmorkalov added this to the 4.7.0 milestone Jan 23, 2023
a-sajjad72 pushed a commit to a-sajjad72/opencv that referenced this pull request Mar 30, 2023