Building MXNet with Intel MKL-DNN gives better training and inference performance on Intel Xeon CPUs. The performance improvement can be seen on this page. Below are build instructions for Linux, macOS, and Windows.
sudo apt-get update
sudo apt-get install -y build-essential git
sudo apt-get install -y libopenblas-dev liblapack-dev
sudo apt-get install -y libopencv-dev
sudo apt-get install -y graphviz
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet
make -j $(nproc) USE_OPENCV=1 USE_MKLDNN=1 USE_BLAS=mkl USE_INTEL_PATH=/opt/intel
If you don't have the full MKL library installed, you can use OpenBLAS instead by setting USE_BLAS=openblas.
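The choice between the two BLAS settings can be sketched as a small helper that assembles the make invocation. This is only an illustration: the function name make_command is hypothetical, and treating "a directory exists at /opt/intel" as "the full MKL library is installed" is an assumption you may need to adapt to your setup.

```python
import os


def make_command(mkl_root="/opt/intel", jobs=None):
    """Return the make invocation as an argument list.

    Assumption: an existing directory at mkl_root means the full MKL
    library is installed. This is a heuristic, not an official check.
    """
    jobs = jobs or os.cpu_count() or 1
    args = ["make", "-j", str(jobs), "USE_OPENCV=1", "USE_MKLDNN=1"]
    if os.path.isdir(mkl_root):
        # Full MKL found: use it as the BLAS library.
        args += ["USE_BLAS=mkl", "USE_INTEL_PATH=" + mkl_root]
    else:
        # Fall back to OpenBLAS, as described above.
        args.append("USE_BLAS=openblas")
    return args


print(" ".join(make_command()))
```

The printed line can be pasted into a shell from the incubator-mxnet directory, or the list can be passed to subprocess.run.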
On macOS, install the following dependencies required for MXNet:
- Homebrew
- gcc (clang in macOS does not support OpenMP)
- OpenCV (for computer vision operations)
# Paste this command in Mac terminal to install Homebrew
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
# install dependency
brew update
brew install pkg-config
brew install graphviz
brew tap homebrew/core
brew install opencv
brew tap homebrew/versions
brew install gcc49
brew link gcc49 #gcc-5 and gcc-7 also work
git clone --recursive https://github.com/apache/incubator-mxnet.git
cd incubator-mxnet
To enable OpenMP for better performance, modify the Makefile in the MXNet root directory so that the -fopenmp flag is also added to CFLAGS on Darwin (comment out the Darwin check as shown):
ifeq ($(USE_OPENMP), 1)
# ifneq ($(UNAME_S), Darwin)
CFLAGS += -fopenmp
# endif
endif
make -j $(sysctl -n hw.ncpu) CC=gcc-4.9 CXX=g++-4.9 USE_OPENCV=0 USE_OPENMP=1 USE_MKLDNN=1 USE_BLAS=apple USE_PROFILER=1
Note: OpenCV is temporarily disabled in this build (USE_OPENCV=0).
On Windows, we recommend building and installing MXNet yourself using Microsoft Visual Studio 2015. You can also try the latest Microsoft Visual Studio 2017, which is experimental.
Visual Studio 2015
To build and install MXNet yourself, you need the following dependencies. Install the required dependencies:
- If Microsoft Visual Studio 2015 is not already installed, download and install it. You can download and install the free community edition.
- Download and install CMake 3 if it is not already installed.
- Download and install OpenCV 3.
- Unzip the OpenCV package.
- Set the environment variable OpenCV_DIR to point to the OpenCV build directory (C:\opencv\build\x64\vc14, for example). Also add the OpenCV bin directory (C:\opencv\build\x64\vc14\bin, for example) to the PATH variable.
- If you have Intel Math Kernel Library (MKL) installed, set MKL_ROOT to point to the MKL directory that contains the include and lib directories. To use MKL BLAS, set -DUSE_BLAS=mkl when running cmake. Typically, you can find the directory in C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2018\windows\mkl.
- If you don't have the Intel Math Kernel Library (MKL) installed, download and install OpenBLAS. Note that you should also download mingw64.dll.zip along with OpenBLAS and add them to PATH.
- Set the environment variable OpenBLAS_HOME to point to the OpenBLAS directory that contains the include and lib directories. Typically, you can find the directory in C:\Program files (x86)\OpenBLAS\.
After you have installed all of the required dependencies, build the MXNet source code:
- Download the MXNet source code from GitHub. Don't forget to pull the submodules:
git clone --recursive https://github.com/apache/incubator-mxnet.git
- Copy the file 3rdparty/mkldnn/config_template.vcxproj to the incubator-mxnet root.
- Start a Visual Studio command prompt.
- Use CMake 3 to create a Visual Studio solution in ./build or some other directory. Make sure to specify the architecture in the CMake 3 command:
mkdir build
cd build
cmake -G "Visual Studio 14 Win64" .. -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release
- In Visual Studio, open the solution file, .sln, and compile it. This produces a library called libmxnet.dll in the ./build/Release/ or ./build/Debug folder. In addition, libmkldnn.dll will be in the ./build/3rdparty/mkldnn/src/Release/ folder.
- Make sure that all the dll files used above (such as libmkldnn.dll, libmklml.dll, libiomp5.dll, libopenblas.dll, etc.) are added to the system PATH. For convenience, you can put all of them in \windows\system32. Otherwise, you will encounter a Not Found Dependencies error when loading mxnet.
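Before importing mxnet, it can help to check which of these DLLs are actually reachable through PATH. The following is a minimal sketch, not part of MXNet: the helper name missing_dlls is hypothetical, and it only looks at file names in the PATH directories, which approximates but does not replicate the Windows DLL search order.

```python
import os


def missing_dlls(dll_names, search_dirs=None):
    """Return the DLL names that are not found in any search directory."""
    if search_dirs is None:
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    present = set()
    for directory in search_dirs:
        if not directory or not os.path.isdir(directory):
            continue
        try:
            entries = os.listdir(directory)
        except OSError:
            # Skip directories that cannot be read.
            continue
        # Windows file names are case-insensitive, so compare lower-cased.
        present.update(entry.lower() for entry in entries)
    return [name for name in dll_names if name.lower() not in present]


# DLL names taken from the list above.
required = ["libmkldnn.dll", "libmklml.dll", "libiomp5.dll", "libopenblas.dll"]
print("Missing from PATH:", missing_dlls(required))
```

An empty list means every named DLL was found in at least one PATH directory.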
Visual Studio 2017
To build and install MXNet yourself using Microsoft Visual Studio 2017, you need the following dependencies. Install the required dependencies:
- If Microsoft Visual Studio 2017 is not already installed, download and install it. You can download and install the free community edition.
- Download and install CMake 3 if it is not already installed.
- Download and install OpenCV.
- Unzip the OpenCV package.
- Set the environment variable OpenCV_DIR to point to the OpenCV build directory (e.g., OpenCV_DIR = C:\utils\opencv\build).
- If you don't have the Intel Math Kernel Library (MKL) installed, download and install OpenBLAS.
- Set the environment variable OpenBLAS_HOME to point to the OpenBLAS directory that contains the include and lib directories (e.g., OpenBLAS_HOME = C:\utils\OpenBLAS).
After you have installed all of the required dependencies, build the MXNet source code:
- Start cmd in Windows.
- Download the MXNet source code from GitHub by using the following commands:
cd C:\
git clone --recursive https://github.com/apache/incubator-mxnet.git
- Copy the file 3rdparty/mkldnn/config_template.vcxproj to the incubator-mxnet root.
- Follow this link to modify Individual components, check VC++ 2017 version 15.4 v14.11 toolset, and click Modify.
- Change the Visual Studio 2017 toolset version to v14.11 using the following command (by default, VS2017 is installed in the following path):
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat" -vcvars_ver=14.11
- Create a build dir using the following command and go to the directory, for example:
mkdir C:\incubator-mxnet\build
cd C:\incubator-mxnet\build
- Run CMake on the MXNet source code by using the following command:
cmake -G "Visual Studio 15 2017 Win64" .. -T host=x64 -DUSE_CUDA=0 -DUSE_CUDNN=0 -DUSE_NVRTC=0 -DUSE_OPENCV=1 -DUSE_OPENMP=1 -DUSE_PROFILER=1 -DUSE_BLAS=open -DUSE_LAPACK=1 -DUSE_DIST_KVSTORE=0 -DCUDA_ARCH_NAME=All -DUSE_MKLDNN=1 -DCMAKE_BUILD_TYPE=Release
- After CMake has completed successfully, compile the MXNet source code by using the following command:
msbuild mxnet.sln /p:Configuration=Release;Platform=x64 /maxcpucount
- Make sure that all the dll files used above (such as libmkldnn.dll, libmklml.dll, libiomp5.dll, libopenblas.dll, etc.) are added to the system PATH. For convenience, you can put all of them in \windows\system32. Otherwise, you will encounter a Not Found Dependencies error when loading mxnet.
cd python
sudo python setup.py install
python -c "import mxnet as mx;print((mx.nd.ones((2, 3))*2).asnumpy());"
Expected Output:
[[ 2. 2. 2.]
[ 2. 2. 2.]]
After MXNet is installed, you can verify whether the MKL-DNN backend works with a single convolution layer.
import mxnet as mx
import numpy as np

num_filter = 32
kernel = (3, 3)
pad = (1, 1)
shape = (32, 32, 256, 256)

# Build a symbolic graph with a single convolution (no bias term)
x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.Convolution(data=x, weight=w, num_filter=num_filter, kernel=kernel, no_bias=True, pad=pad)

# Bind on CPU; the weight shape is inferred from the input shape
exe = y.simple_bind(mx.cpu(), x=shape)

# Fill the input and the weight with random values
exe.arg_arrays[0][:] = np.random.normal(size=exe.arg_arrays[0].shape)
exe.arg_arrays[1][:] = np.random.normal(size=exe.arg_arrays[1].shape)

# Run the forward pass and copy the result to NumPy
exe.forward(is_train=False)
o = exe.outputs[0]
t = o.asnumpy()
You can enable the MKLDNN_VERBOSE flag by setting the environment variable:
export MKLDNN_VERBOSE=1
Running the code snippet above should then produce output similar to the following, which shows that the convolution and reorder primitives from MKL-DNN are called. The log also shows layout information and primitive execution performance.
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nchw out:f32_nChw16c,num:1,32x32x256x256,6.47681
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0429688
mkldnn_verbose,exec,convolution,jit:avx512_common,forward_inference,fsrc:nChw16c fwei:OIhw16i16o fbia:undef fdst:nChw16c,alg:convolution_direct,mb32_g1ic32oc32_ih256oh256kh3sh1dh0ph1_iw256ow256kw3sw1dw0pw1,9.98193
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0510254
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nChw16c out:f32_nchw,num:1,32x32x256x256,20.4819
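To see at a glance how much of the run goes to each primitive, the log can be summarized with a short script. This is a sketch under assumptions drawn only from the sample above: each line is comma-separated, the third field is the primitive name, and the final field is taken to be the execution time. The function name summarize_mkldnn_verbose is hypothetical.

```python
from collections import defaultdict


def summarize_mkldnn_verbose(log_text):
    """Sum the last field of each 'mkldnn_verbose,exec,...' line per primitive."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        fields = line.strip().split(",")
        # Layout assumed from the sample log: marker, phase, primitive, ..., time
        if len(fields) < 4 or fields[0] != "mkldnn_verbose" or fields[1] != "exec":
            continue
        try:
            elapsed = float(fields[-1])
        except ValueError:
            # Not a numeric final field: ignore the line.
            continue
        totals[fields[2]] += elapsed
    return dict(totals)


# First three lines of the sample log shown above.
sample = """
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_nchw out:f32_nChw16c,num:1,32x32x256x256,6.47681
mkldnn_verbose,exec,reorder,jit:uni,undef,in:f32_oihw out:f32_OIhw16i16o,num:1,32x32x3x3,0.0429688
mkldnn_verbose,exec,convolution,jit:avx512_common,forward_inference,fsrc:nChw16c fwei:OIhw16i16o fbia:undef fdst:nChw16c,alg:convolution_direct,mb32_g1ic32oc32_ih256oh256kh3sh1dh0ph1_iw256ow256kw3sw1dw0pw1,9.98193
"""

for primitive, total in sorted(summarize_mkldnn_verbose(sample).items()):
    print(primitive, round(total, 4))
```

In the sample, the layout reorders take a noticeable share of the time next to the convolution itself, which is the kind of observation this summary is meant to surface.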
To make it convenient for customers, Intel introduced a new license, the Intel® Simplified license, which allows redistribution of not only dynamic libraries but also headers, examples, and static libraries.
Installing the full MKL library enables MKL support for all operators under the linalg namespace.
- Download and install the latest full MKL version following the instructions on the Intel website.
- Run make -j $(nproc) USE_BLAS=mkl
- Navigate into the python directory
- Run sudo python setup.py install
After MXNet is installed, you can verify whether MKL BLAS works with a single dot layer.
import mxnet as mx
import numpy as np

shape_x = (1, 10, 8)
shape_w = (1, 12, 8)
x_npy = np.random.normal(0, 1, shape_x)
w_npy = np.random.normal(0, 1, shape_w)

x = mx.sym.Variable('x')
w = mx.sym.Variable('w')
y = mx.sym.batch_dot(x, w, transpose_b=True)
exe = y.simple_bind(mx.cpu(), x=x_npy.shape, w=w_npy.shape)

# Copy the random inputs into the executor; otherwise x_npy and w_npy are never used
exe.arg_dict['x'][:] = x_npy
exe.arg_dict['w'][:] = w_npy

exe.forward(is_train=False)
o = exe.outputs[0]
t = o.asnumpy()
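For reference, the same computation can be written in plain NumPy. This sketch relies on batch_dot with transpose_b=True multiplying each batch entry x[i] by the transpose of w[i]; it is an independent illustration of the expected shape and values, not MXNet code.

```python
import numpy as np

shape_x = (1, 10, 8)
shape_w = (1, 12, 8)
x_npy = np.random.normal(0, 1, shape_x)
w_npy = np.random.normal(0, 1, shape_w)

# For each batch i: x[i] (10x8) times w[i] transposed (8x12) gives a 10x12 matrix.
reference = np.matmul(x_npy, np.swapaxes(w_npy, 1, 2))
print(reference.shape)  # (1, 10, 12)
```

If the executor's x and w arrays were filled with the same x_npy and w_npy, the MXNet result t should agree with reference up to float32 rounding, which np.allclose(t, reference, atol=1e-4) can check.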
You can enable the MKL_VERBOSE flag by setting the environment variable:
export MKL_VERBOSE=1
Running the code snippet above should then produce output similar to the following, which shows that the SGEMM primitive from MKL is called. The log also shows layout information and primitive execution performance.
Numpy + Intel(R) MKL: THREADING LAYER: (null)
Numpy + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
Numpy + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2018.0 Update 1 Product build 20171007 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.40GHz lp64 intel_thread NMICDev:0
MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:40 WDiv:HOST:+0.000
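The call line in this log can be picked apart with a regular expression, for example to confirm that the SGEMM dimensions (12, 10, 8) match the shapes used in the snippet. This is a sketch based solely on the sample line above; the pattern and the function name parse_mkl_verbose_call are assumptions, and other MKL versions may format the line differently.

```python
import re

# Function name, parenthesized argument list, then elapsed time with a unit suffix.
CALL_PATTERN = re.compile(r"^MKL_VERBOSE (\w+)\(([^()]*)\) (\d+(?:\.\d+)?)([a-z]+)\b")


def parse_mkl_verbose_call(line):
    """Return the parsed call as a dict, or None if the line is not a call line."""
    match = CALL_PATTERN.match(line.strip())
    if match is None:
        return None
    name, args, elapsed, unit = match.groups()
    return {
        "function": name,
        "args": args.split(","),
        "time": float(elapsed),
        "unit": unit,
    }


sample_line = (
    "MKL_VERBOSE SGEMM(T,N,12,10,8,0x7f7f927b1378,0x1bc2140,8,0x1ba8040,8,"
    "0x7f7f927b1380,0x7f7f7400a280,12) 8.93ms CNR:OFF Dyn:1 FastMM:1 TID:0 "
    "NThr:40 WDiv:HOST:+0.000"
)
call = parse_mkl_verbose_call(sample_line)
print(call["function"], call["args"][2:5], call["time"], call["unit"])
```

The version banner line that also starts with MKL_VERBOSE does not match the pattern, so the function returns None for it.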
- For questions or support specific to MKL, visit the Intel MKL website.
- For questions or support specific to MKL-DNN, visit the Intel MKLDNN website.
- If you find bugs, please open an issue on GitHub for MXNet with MKL or MXNet with MKLDNN.