
SIMD support in Menoh #85

Open
rajh619 opened this issue Sep 3, 2018 · 8 comments

rajh619 commented Sep 3, 2018

Hi,
I was trying the Menoh VGG16 example.
Does Menoh utilize SIMD instruction sets (such as SSE4 or AVX2) to speed up inference?
If not, is there an option to enable SIMD for the CPU in Menoh?

Thanks

@okdshin
Copy link
Contributor

okdshin commented Sep 3, 2018

Hi. Thank you for trying it out.
Menoh uses MKL-DNN, and MKL-DNN supports SIMD optimization.
It is enabled automatically; you don't need to set any option to enable SIMD. (You can verify which instruction set is picked, as sketched below.)
Could you tell us about your use case? Perhaps we can help you.
Does it run slower than you expected?
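
One way to check is MKL-DNN's verbose mode. A minimal sketch, assuming a POSIX environment (the exact log format varies between MKL-DNN versions):

```cpp
#include <cstdlib> // setenv (POSIX; use _putenv_s on Windows)

int main() {
    // Must be set before the first MKL-DNN primitive executes.
    setenv("MKLDNN_VERBOSE", "1", /*overwrite=*/1);

    // ... build and run the Menoh model as usual ...

    // Each executed primitive then logs a line along the lines of:
    //   mkldnn_verbose,exec,convolution,jit:avx2,forward_inference,...
    // where "jit:avx2" (or jit:avx512_common, jit:sse42, ...) names the
    // instruction set MKL-DNN selected at runtime.
    return 0;
}
```

Setting `MKLDNN_VERBOSE=1` in the shell before launching the example works just as well.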

rajh619 commented Sep 4, 2018

Hi,
I was comparing the performance with the TVM compiler.
I tested a ResNet ONNX model in both Menoh and TVM.
I observed that the execution time with NNVM (CPU, AVX2) is lower than with Menoh (I don't know which instruction set it uses on the CPU).
I have a few questions:

  1. What is the execution model of Menoh? I can find only the API documentation. Is there any design/functional document to understand how Menoh works?

  2. Are there any options available for fine-tuning/optimizing model execution? (For example, TVM has many options to optimize graphs, choose the CPU or CPU-AVX2 instruction set for execution, etc.) Is there anything of that kind for Menoh?

Thanks.

okdshin commented Sep 4, 2018

Thank you for your important questions. I'll answer them.

(1) Currently Menoh is an experimental framework and its implementation is incomplete. There is a design document for Menoh, but sorry, it is not included in this repository's documentation yet.
So let me explain the design briefly here.

Menoh can be split into three parts: a graph manipulation part, a construction part, and an execution part.

In the graph manipulation part, users can modify graphs loaded from ONNX (or, perhaps later, other formats): deleting nodes, adding nodes and parameters, merging different models, or more aggressive operations.

In the construction part, one or more backends (each with specialized mechanisms to execute certain operators faster; currently MKL-DNN is the only one) parse partial graphs and generate a procedure list. We are planning to make backends customizable in various ways, so an NNVM/TVM backend could fit in here.

In the execution part, users execute the procedures and take the outputs. The sketch below shows how the construction and execution parts look in the C++ API.
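
This loosely follows the VGG16 example in the README (the variable names are placeholders and the builder method names may differ slightly between Menoh versions):

```cpp
#include <menoh/menoh.hpp>

int main() {
    // The in-memory graph; the graph manipulation part operates on this.
    auto model_data = menoh::make_model_data_from_onnx("vgg16.onnx");

    // Declare the input/output variables we care about.
    menoh::variable_profile_table_builder vpt_builder;
    vpt_builder.add_input_profile("input0", menoh::dtype_t::float_,
                                  {1, 3, 224, 224}); // NCHW
    vpt_builder.add_output_name("softmax0");
    auto vpt = vpt_builder.build_variable_profile_table(model_data);

    // Construction part: the named backend ("mkldnn") parses the graph
    // and generates the procedure list.
    menoh::model_builder model_builder(vpt);
    auto model = model_builder.build_model(model_data, "mkldnn");

    // Execution part: run the procedures and read the output buffer.
    model.run();
    auto softmax0 = model.get_variable("softmax0");
    // softmax0.buffer_handle points at the result data.
}
```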

Also, we are seeking a better design for utilizing DNN models outside of laboratories.

(2) Currently Menoh has only very simple computation-graph optimizations (e.g. trimming useless nodes, roughly as in the sketch below). However, we are now planning to introduce graph-optimization methods into the graph manipulation part, in cooperation with the Chainer development team.
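
A minimal sketch of that trimming idea, with a hypothetical graph representation (not Menoh's actual data structures): keep only the nodes whose outputs are transitively required by the requested graph outputs.

```cpp
#include <set>
#include <string>
#include <vector>

struct node {
    std::vector<std::string> inputs;
    std::vector<std::string> outputs;
};

// graph is assumed to be in topological order; required initially holds
// the names of the outputs the user asked for.
std::vector<node> trim_useless_nodes(std::vector<node> const& graph,
                                     std::set<std::string> required) {
    std::vector<node> kept;
    // Walk backwards so consumers are seen before their producers.
    for (auto it = graph.rbegin(); it != graph.rend(); ++it) {
        bool needed = false;
        for (auto const& out : it->outputs) {
            if (required.count(out)) { needed = true; break; }
        }
        if (!needed) continue; // useless: nothing consumes its outputs
        kept.push_back(*it);
        required.insert(it->inputs.begin(), it->inputs.end());
    }
    return {kept.rbegin(), kept.rend()}; // restore topological order
}
```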

rajh619 commented Sep 6, 2018

Hi,

Thank you for your explanation.
I understand that it is in an experimental phase.

Based on your explanation, I can find similar graph optimization techniques in NNVM and accelerated execution in TVM (via multiple backends such as CPUs, OpenCL/CUDA GPUs, FPGAs, etc.).

I would like to know: how is Menoh different from NNVM/TVM?

Thanks

okdshin commented Sep 6, 2018

Thank you for another important question.
NNVM/TVM is a good framework, and its goals partially overlap with Menoh's. However, the two take different perspectives on DNN inference.

Let me explain the biggest difference between NNVM/TVM and Menoh.

NNVM/TVM compiles trained models into dynamic libraries, which applications then load and execute. Model construction and execution are split into different parts, and the execution part is a black box to users, so users cannot modify compiled models.

Menoh, on the other hand, does not compile; it interprets computation graphs and executes them directly. That is generally slower, but the design is simpler than NNVM/TVM's, and users can smoothly modify anything in code. When we need speed, we can also wrap NNVM/TVM inside Menoh and use it through the Menoh C API and the many language bindings.
In other words, Menoh values customizability and usability, not only speed.

rajh619 commented Sep 7, 2018

Thank you for your detailed information!

rajh619 commented Sep 19, 2018

Hi @okdshin,
I am trying to understand the motivation behind Menoh's development.
Why use Menoh if NNVM can do all the operations other than computation-graph customization? Could you give me an example use case where we modify/customize a pretrained ONNX model on the fly during execution?

Thanks

okdshin commented Sep 19, 2018

Hi @rajh619. Thanks for the question.
Honestly speaking, if a trained model is modified ahead of time, graph manipulation is not needed much. However, doing that is troublesome for users who merely want to use trained models distributed by other users.
In addition, I am now developing a model-construction fallback system: when the first backend (e.g. ARMNN) fails to interpret an operator, the second backend (e.g. Generic, a naive C++ implementation) tries to interpret it. As a result, a model that utilizes multiple backends is constructed (see the sketch below).
The current ARMNN backend does not have such a fallback system, but we plan to integrate one. The fallback system will enable Menoh to interpret operators that some backend does not support, as well as operators customized by users. That is not possible using ARMNN alone.
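
In pseudocode, the fallback during construction could look roughly like this (hypothetical types and names, not the actual Menoh internals):

```cpp
#include <stdexcept>
#include <vector>

struct node { /* one operator in the computation graph */ };
struct procedure { /* a callable produced by a backend */ };

struct backend {
    virtual ~backend() = default;
    // Throws if this backend cannot interpret the operator.
    virtual procedure compile(node const& n) = 0;
};

// Try each backend in priority order (e.g. ARMNN first, then the naive
// C++ "Generic" backend) and keep the first procedure that succeeds.
std::vector<procedure>
construct(std::vector<node> const& graph,
          std::vector<backend*> const& backends) {
    std::vector<procedure> procedures;
    for (auto const& n : graph) {
        bool built = false;
        for (auto* b : backends) {
            try {
                procedures.push_back(b->compile(n));
                built = true;
                break;
            } catch (std::exception const&) {
                // unsupported here; fall back to the next backend
            }
        }
        if (!built) throw std::runtime_error("no backend supports this operator");
    }
    return procedures;
}
```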
