Added Halide backend support for deep learning layers #1186
Conversation
Force-pushed from 71cd4b3 to 3ee3f41
modules/dnn/perf/perf_halide_net.cpp (outdated)
{ \
    net.forward(net.getLayerId(outputLayer)); \
} \
SANITY_CHECK_NOTHING();
Please use a "static void" function instead. Multiline macros are very hard to debug.
Thanks! Replaced it with a static function.
modules/dnn/CMakeLists.txt (outdated)
set(DNN_LIBRARIES
    ${DNN_LIBRARIES}
    ${HALIDE_LIBRARIES}
)
list(APPEND DNN_INCLUDE_DIRS ${HALIDE_INCLUDE_DIRS})
Eliminated the CMakeLists changes entirely.
modules/dnn/perf/perf_halide_net.cpp (outdated)
inline std::string getOpenCVExtraDir()
{
    return "/home/dkurtaev/opencv_extra/testdata/";
}
#1162 + findDataFile() function
Modified modules/dnn/perf/perf_main.cpp for it.
Measured on Intel® Core™ i7-6700K CPU @ 4.00GHz x 8.

<html>
<table>
Fixed, thanks!
using namespace cv;
using namespace dnn;

void loadNet(const std::string& weights, const std::string& proto,
Please use "static" or an anonymous namespace for internal functions. BTW, where is the #include directive in this file?
It's working somehow. Should I add it to prevent failures in other build configurations?
Force-pushed from 226db43 to e9b2d2f
Force-pushed from 1070e98 to 17e9f46
*/
enum Backend
{
    DEFAULT,
I would rename them to DNN_BACKEND_DEFAULT, DNN_BACKEND_HALIDE, to make it more self-explanatory
*/
enum HalideTarget
{
    HALIDE_CPU
Again, let's use DNN_TARGET_CPU, DNN_TARGET_GPU, etc. I would not use the word "Halide" here, because other backends, like OpenVX or even the default one, could potentially be customized to use the CPU, GPU, or another device to run the networks.
 * it helps prevent some memory management issues (if something goes
 * wrong, Halide tests will fail).
 */
virtual std::vector<void*> initHalide(const std::vector<void*> &inputs);
Instead of the very low-level void* I would use some abstract class HalideFunc (which can encapsulate the void*) with a smart pointer to it. The overhead is comparatively low, but it would be much safer from a memory-management point of view. That is, what about changing std::vector<void*> to std::vector<Ptr<HalideFunc>>?
Introduced BackendNode and BackendWrapper, whose derivatives store an explicit Halide::Func and Halide::Buffer respectively. Moreover, this removes the Halide::Func members that layers held before.
Force-pushed from 2c28d87 to aedce13
Force-pushed from fc18830 to 2e94270
ok, let's merge it in while it's fresh! 👍
This PR relies on opencv/opencv#8794. That one needs review/merge too.
Implemented some layers using the Halide language backend:
CPU only, float32 computations. No ahead-of-time compilation.
Added a tutorial with an installation guide and a sample, tests for some layers and networks, and performance tests.
Generalized the manual scheduling of 5 networks into layer-wise automatic scheduling.
Measured on Intel® Core™ i7-6700K CPU @ 4.00GHz x 8.
For testing, the GoogLeNet network was used. L1 diff is 1.26789e-09. Runtime is 33 ms using Halide (automatic scheduling) against 34.89 ms using MKL (x1.05).
TODO list:
- ... x variable to factor 4 (from 8 or 16). (All right in case of no scheduling.)
- supportBackend and layer configurations (axis, etc.)