Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-679] Refactor handling BLAS libraries with cmake #11148

Closed
wants to merge 78 commits into from

Conversation

lebeg
Copy link
Contributor

@lebeg lebeg commented Jun 5, 2018

Description

This is fixing a number of issues with handling BLAS libraries with cmake:

[MXNET-115] USE_LAPACK is forced on all platforms
#8702 [DISCUSSION] Should we deprecate Makefile and only use CMake?
#8532 mxnet-mkl (v0.12.0) crash when using (conda-installed) numpy with MKL // (general MKL handling in comments)
#10175 MXNet MKLDNN build dependency/flow discussion
#9868 MKL and CMake
#9105 libmxnet.so load path error
#8729 Build amalgamation using a docker // (LAPACK with cross compilation in the comments)
#7852 Trouble installing MXNet on Raspberry Pi 3

Checklist

Essentials

  • The PR title start starts with [MXNET-679]
  • Changes are complete
  • Code is well-documented
  • To the my best knowledge, examples are not affected by this change

Changes

  • Moved all BLAS library handling to a separate module
  • LAPACK is handled properly and separately for different BLAS libs
  • Added LAPACK functionality when using Accelerate
  • Moved MKLML handling to a separate module
  • MKLML will be downloaded only if needed (previously always)
  • Enabled MKLML for mac
  • Added documentation for BLAS option in the cmake file
  • Made MKLML / MKLDNN separation that finally makes sense (see MKL and CMake #9868)
  • Defined and documented the recommended order for checking availability of BLAS libraries (from most recommended to usual)
  • Disabled MKL, MKLML, MKLDNN for not desktop platforms and cross compilation
  • Switched on LAPACK for arm and android builds
  • Standardised android arm builds
  • Upgraded OpenBLAS version for cross-compilation to v0.3.2
  • Used standard random generators instead of rand_r for openmp threads used in rnn
  • Pinned dockcross images for android_armv7 android_arm64 to custom build and stable ones
  • Upgraded MKLML to v0.17-rc

Comments

  • USE_MKLML_MKL was renamed to USE_MKLML due to common sense

@lebeg lebeg requested a review from szha as a code owner June 5, 2018 09:50
endif()
endif()

return()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a default case?

@marcoabreu
Copy link
Contributor

How is this related to #11094?

@lebeg
Copy link
Contributor Author

lebeg commented Jun 7, 2018

#11094 does not fix the problem that there is no liblapack library for OpenBLAS as well as for no library except Atlas. The logic to determine whether lapack functionality is available is broken in disregards of the USE_LAPACK flag being used. This PR is an attempt to fix and improve the situation and automatically detect lapack for each blas library used in the right way.

@marcoabreu
Copy link
Contributor

Okay, please resolve conflicts so I can continue to review

@lebeg lebeg changed the title [WIP] [MXNET-115] USE_LAPACK is forced on all platforms [WIP] Refactored handling BLAS libraries with cmake Jul 16, 2018
@lebeg lebeg changed the title [WIP] Refactored handling BLAS libraries with cmake [MXNET-679] Refactor handling BLAS libraries with cmake Jul 16, 2018
@aaronmarkham
Copy link
Contributor

Does this perhaps fix #11769?

@lebeg
Copy link
Contributor Author

lebeg commented Jul 16, 2018

Yes, I think this should help.

Copy link
Contributor

@aaronmarkham aaronmarkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool stuff. Seems like this will be a much smoother experience. I had some minor nits on capitalization and consistency in the libraries' names... and some questions/suggestions.

Will we be moving / copying some of info in the comments of the CMakeLists.txt file to web pages in/around the install pages?

CMakeLists.txt Outdated
# * MKL (MKL, MKLML, MKLDNN)
# * Apple Accelerate
#
# The default order of choice for the libraries if found follows the path from probably the most
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably? 😆

CMakeLists.txt Outdated
# 1. MKLDNN (requires and checks for MKL or MKLML)
# 2. MKL
# 3. MKLML (downloaded)
# 4. Accelerate (on mac)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

triggered by apple correct? Maybe worth putting in each one's param value, so when reading this you're associating the correct values...

CMakeLists.txt Outdated
# (recommended) to less performant backends. The order is as follows:
#
# 1. MKLDNN (requires and checks for MKL or MKLML)
# 2. MKL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(must register, download, and install separately)
link to it?

CMakeLists.txt Outdated
# Apple's mathematical framework, probably the best choice on a Mac if MKL/MKLML/MKLDNN
# are not available.
# https://developer.apple.com/documentation/accelerate
mxnet_option(USE_ACCELERATE_IF_AVAILABLE "Use Apple Accelerate framework if found, \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it now, after digging in, but on first glance, I would confuse this with a generic acceleration option, and say, "YEAH! Turn that one on for sure!", and not realize it is for Apple. What if you called it USE_APPLE_ACCELERATE_IF_AVAILABLE?

if(NOT MKL_FOUND)
function(switch_lapack enable)
if(${enable})
message(STATUS "Enabling lapack functionality")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK

find_library(Atlas_LAPACK_LIBRARY NAMES alapack_r alapack lapack_atlas PATHS ${Atlas_LIB_SEARCH_PATHS})
set(LOOKED_FOR ${LOOKED_FOR} Atlas_CLAPACK_INCLUDE_DIR Atlas_LAPACK_LIBRARY)
endif(Atlas_NEED_LAPACK)
message(STATUS "Looking for lapack support...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK

Atlas_CLAPACK_INCLUDE_DIR
Atlas_LAPACK_LIBRARY)

message(STATUS "Lapack found")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK

message(STATUS "Lapack found")
else()
set(Atlas_LAPACK_FOUND False)
message(WARNING "Lapack with Atlas could not be found, lapack functionality will not be available")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK x2

)

if(OpenBLAS_NEED_LAPACK)
message(STATUS "Looking for lapack support...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK

message(STATUS "Lapack found")
else()
set(OpenBLAS_LAPACK_FOUND False)
message(WARNING "OpenBlas has not been compiled with Lapack support, lapack functionality will not be available")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LAPACK x2

@lebeg
Copy link
Contributor Author

lebeg commented Jul 16, 2018

Thanks! Yes, probably building from source section would be the right place, what do you think?

Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lebeg thanks for the overhaul on cmake. Please don't carry out #8702 just yet as the deprecation may break builds.

@lebeg
Copy link
Contributor Author

lebeg commented Jul 17, 2018

@szha Yes, of course. Actually the cake scripts are far from being ready for the switch. This change is just a step for resolving #8702

@lupesko
Copy link
Contributor

lupesko commented Aug 6, 2018

@lebeg - are you planning to fix the issues that fail the tests?

@lebeg
Copy link
Contributor Author

lebeg commented Aug 8, 2018

@lupesko sure, just came back from vacation, working on the fixes

Copy link
Contributor

@larroy larroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR! Are the changes to random in cpp code intended?


# workaround to store CMAKE_CROSSCOMPILING because is getting reset by the project command
if(CMAKE_CROSSCOMPILING)
set(__CMAKE_CROSSCOMPILING ${CMAKE_CROSSCOMPILING})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! And good comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks)

message(STATUS "CMAKE_SYSTEM_NAME ${CMAKE_SYSTEM_NAME}")
# Choose BLAS (Basic Linear Algebra Subprograms) computation libraries

# MXNet supports multiple mathematical backends for computations on the CPU:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome documentation!

CMakeLists.txt Outdated
if(MSVC)
set(SYSTEM_ARCHITECTURE x86_64)
else()
execute_process(COMMAND uname -m COMMAND tr -d '\n' OUTPUT_VARIABLE SYSTEM_ARCHITECTURE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we are cross compiling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That supposed to be the host architecture. It's used to not try to download MKLML in case we are on an arm platform.

Copy link
Contributor

@larroy larroy Aug 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we cross compile for ARM in the host...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For cross compilation we use the CMAKE_CROSSCOMPILING flag to prevent a download

@lebeg
Copy link
Contributor Author

lebeg commented Aug 9, 2018

Yes, the <random> code was necessary due to android armv7 clang build not having the rand_r function at all. Actually using rand_r in the way it was used is dangerous in a multithreaded environment. This change fixes this.

@lebeg
Copy link
Contributor Author

lebeg commented Aug 9, 2018

@larroy @szha @aaronmarkham @marcoabreu @anirudh2290 @piiswrong @lupesko @nswamy could you have another look? Thanks!

endif()

if(NOT MKL_FOUND)
message(WARNING "MKLDNN requires either MKL or MKLML, MKLDNN will not be available")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding was that MKLDNN could be installed stand-alone, without any other MKL flavours.

Copy link
Contributor Author

@lebeg lebeg Aug 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had actually the opposite opinion, but after taking a closer look I've realised that this is true. I will make the needed changes.

@lebeg
Copy link
Contributor Author

lebeg commented Nov 22, 2018

This PR needs to be separated and, as I have stated in on the @dev list, refactored to used as much built-in cmake functionality as possible.

@lebeg lebeg closed this Nov 22, 2018
azai91 pushed a commit to azai91/incubator-mxnet that referenced this pull request Dec 1, 2018
* update mkml

* refine DownloadMKLML.cmake

* merge DownloadMKLML.cmake from apache#11148

* fix mkldnn release version

* fix windows compilation
@stu1130
Copy link
Contributor

stu1130 commented Mar 15, 2019

@lebeg what would the plan for next step on this PR?

@lebeg
Copy link
Contributor Author

lebeg commented Mar 18, 2019

@stu1130 I'm out of capacity for it right now, I would be glad if anyone has interest in taking it over. I would be happy to make reviews and provide support as much as I can.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-testing PR is reviewed and waiting CI build and test
Projects
None yet
Development

Successfully merging this pull request may close these issues.