DNN: Try to be compatible with win32 #22454

zihaomu · 2022-08-31T06:10:01Z

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

force_builders=Linux32,Win32

zihaomu · 2022-08-31T06:15:58Z

Hi @alalek, can you test whether this patch fixes the issue on Win32?

alalek · 2022-08-31T07:44:34Z

modules/dnn/src/layers/fast_convolution/fast_convolution.hpp

@@ -21,7 +21,7 @@ enum { FAST_VEC_NLANES=4 };
 #define CONV_MR 4
 #define CONV_NR 24

-#ifdef CV_AVX2
+#if CV_AVX2
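
For readers following the one-line change above: `#ifdef` only asks whether a macro is defined, while `#if` tests its value. The sketch below assumes the feature macro is always defined and expands to 0 when the feature is unavailable (which is how I understand the OpenCV CPU feature macros to behave). `DEMO_AVX2` and the helper names are hypothetical stand-ins, not OpenCV code.

```cpp
#include <cassert>

// Hypothetical stand-in for a feature macro that is always defined
// and has the value 0 when the feature is not available.
#define DEMO_AVX2 0

// Old check: taken whenever the macro is defined, whatever its value.
#ifdef DEMO_AVX2
constexpr bool kTakenByIfdef = true;
#else
constexpr bool kTakenByIfdef = false;
#endif

// New check: taken only when the macro expands to a nonzero value.
#if DEMO_AVX2
constexpr bool kTakenByIf = true;
#else
constexpr bool kTakenByIf = false;
#endif

// Small accessors so the two results can be inspected at run time.
inline bool takenByIfdef() { return kTakenByIfdef; }
inline bool takenByIf() { return kTakenByIf; }

// With DEMO_AVX2 defined as 0, only the #ifdef form selects the AVX2 path.
inline void checkDemo()
{
    assert(takenByIfdef());
    assert(!takenByIf());
}
```

Under that assumption, the `#ifdef` form would select the AVX2-only code path even on targets built without AVX2, which would explain a failure on 32-bit builds.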


ISA-specific checks should generally be avoided (the API is called "universal intrinsics", after all); they should be used for fine-tuning only.
CV_SIMD_WIDTH should be used instead to detect SIMD128 and other widths.

Thanks for the reminder. Which macro should we use to distinguish the function calls? How about #if __AVX2__?

According to the // SIMD 128 comment in the else branch, you don't want to handle only AVX2 here (the code is a priori broken for other SIMD256 ISAs).
I believe you want to check for SIMD256 / 128.
In that case, you should do it through a CV_SIMD_WIDTH check.
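
A minimal sketch of a width-based check, assuming the SIMD register width is exposed as a byte count (16 for SIMD128, 32 for SIMD256). `DEMO_SIMD_WIDTH` and `DEMO_VEC_NLANES` are hypothetical stand-ins for `CV_SIMD_WIDTH` and `FAST_VEC_NLANES` so that the snippet compiles without OpenCV; it is not the code from this PR.

```cpp
#include <cstddef>

// Hypothetical stand-in for CV_SIMD_WIDTH: register width in bytes.
// Defaults to 16 (SIMD128) unless the build overrides it.
#ifndef DEMO_SIMD_WIDTH
#define DEMO_SIMD_WIDTH 16
#endif

// Pick the number of float lanes from the width rather than from an
// ISA-specific macro, so any SIMD256 ISA takes the 8-lane path.
#if DEMO_SIMD_WIDTH >= 32
enum { DEMO_VEC_NLANES = 8 };
#else
enum { DEMO_VEC_NLANES = 4 };
#endif

// The chosen lane count must fit into one register.
static_assert(DEMO_VEC_NLANES * sizeof(float) <= DEMO_SIMD_WIDTH,
              "lane count exceeds the SIMD register width");

// Accessor for inspecting the selected lane count at run time.
inline int demoLanes() { return DEMO_VEC_NLANES; }
```

With the default width of 16 bytes, `demoLanes()` returns 4; building with `-DDEMO_SIMD_WIDTH=32` selects 8 lanes.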

I get your point now. At the moment we have an AVX2 branch, a NEON branch, and a universal intrinsics (SIMD 128) branch.
If we simply set FAST_VEC_NLANES to 8 when SIMD256 is available, we also need to add a universal intrinsics (SIMD 256) implementation to prevent code errors.

I will reconsider how to better support more platforms.

Hi @alalek, the current AVX implementation is compatible with AVX and AVX2. How can the implementation of a function (like convBlock_AVX2) exist in two namespaces (opt_AVX and opt_AVX2) at the same time?

Update: The current workaround is that AVX2 is handled in the AVX2 branch, and AVX falls to the SIMD256 branch.

Thank you for the update!
I would propose keeping the original one-line "quick fix" to unlock the win32/linux32 builds and then moving the refactoring into a separate PR (as there are still open questions).

How can the implementation of a function (like convBlock_AVX2) exist in two namespaces (opt_AVX and opt_AVX2) at the same time?

This should be implemented with "runtime dispatching" through .simd.hpp files (ultimately, we should not have SIMD code in .cpp files). See also this wiki page: https://github.com/opencv/opencv/wiki/CPU-optimizations-build-options
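
To illustrate the general idea behind the answer above, here is a hand-written sketch of how one function can exist in two namespaces and be selected at run time. It does not use OpenCV's dispatch macros or build machinery; the `demo` namespace, `sumBlock`, `selectSumBlock`, and the `hasAVX2` flag are all hypothetical. In a real setup each copy would be compiled from the same source with different compiler flags, and the flag would come from a CPU feature query.

```cpp
#include <cassert>

namespace demo {

namespace opt_AVX {
// Copy that would be compiled with AVX flags in a real build.
inline float sumBlock(const float* data, int n)
{
    float s = 0.f;
    for (int i = 0; i < n; i++) s += data[i];
    return s;
}
} // namespace opt_AVX

namespace opt_AVX2 {
// Same source, but this copy would be compiled with AVX2 flags.
inline float sumBlock(const float* data, int n)
{
    float s = 0.f;
    for (int i = 0; i < n; i++) s += data[i];
    return s;
}
} // namespace opt_AVX2

using SumBlockFn = float (*)(const float*, int);

// Runtime dispatch: choose the implementation once, based on a
// (hypothetical) CPU capability flag supplied by the caller.
inline SumBlockFn selectSumBlock(bool hasAVX2)
{
    return hasAVX2 ? &opt_AVX2::sumBlock : &opt_AVX::sumBlock;
}

} // namespace demo
```

A caller would obtain the pointer once, for example `auto fn = demo::selectSumBlock(cpuHasAVX2);`, and then invoke `fn(data, n)` without any ISA-specific preprocessor checks at the call site.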

alalek

Thank you 👍

zihaomu requested a review from alalek August 31, 2022 06:10

zihaomu added bug category: dnn labels Aug 31, 2022

alalek reviewed Aug 31, 2022

zihaomu force-pushed the bug_fix_22450 branch 7 times, most recently from 703c147 to 0138615 Compare September 2, 2022 01:57

fix bug 22450

b69b1ea

zihaomu force-pushed the bug_fix_22450 branch from 0138615 to b69b1ea Compare September 2, 2022 08:30

alalek approved these changes Sep 2, 2022

opencv-pushbot merged commit 4159842 into opencv:4.x Sep 2, 2022

alalek mentioned this pull request Jan 8, 2023

(5.x) Merge 4.x #23113

Merged
