Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-372] Add build flag for USE_F16C in CMake and clarify flag in make #10760

Merged
merged 19 commits into from
May 2, 2018

Conversation

rahul003
Copy link
Member

@rahul003 rahul003 commented May 1, 2018

Description

  • Added build flag for USE_F16C in CMake because the CMake for mshadow is not included in certain cmake code paths
  • Added the same flag in config.mk

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@rahul003 rahul003 requested a review from szha as a code owner May 1, 2018 01:07
CMakeLists.txt Outdated
@@ -102,6 +103,10 @@ else(MSVC)
else()
set(SUPPORT_MSSE2 FALSE)
endif()
# If USE_F16C set to ON, mshadow checks support
if(NOT USE_F16C)
set(SUPPORT_MF16C FALSE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please print a message if this is getting disabled

@@ -132,10 +132,19 @@ endif
ARCH := $(shell uname -a)
ifneq (,$(filter $(ARCH), armv6l armv7l powerpc64le ppc64le aarch64))
USE_SSE=0
USE_F16C=0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please print a message if this is getting disabled

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not the behavior of any make flag though. Cmake does do that. Also for those devices there, F16C instruction set is absent. So it's not really a choice

Copy link
Contributor

@marcoabreu marcoabreu May 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is basically the first case, right? But I guess you're right, so it's fine that way.

CMakeLists.txt Outdated
@@ -102,6 +104,29 @@ else(MSVC)
else()
set(SUPPORT_MSSE2 FALSE)
endif()
# For cross complication, turn off flag if target device does not support it
if(USE_F16C AND NOT MSVC)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is already guaranteed that MSVC is not present at this point

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah corrected it thanks

@apache apache deleted a comment from rahul003 May 1, 2018
@@ -86,6 +87,8 @@ if(MSVC)
add_definitions(-DNNVM_EXPORTS)
add_definitions(-DDMLC_STRICT_CXX11)
add_definitions(-DNOMINMAX)
set(SUPPORT_F16C FALSE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question might be stupid, but why are we disabling support on Windows? If this is a processor instruction, why does it depend on the OS?

Copy link
Member Author

@rahul003 rahul003 May 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually the MSVC (visual studio compiler) that's not yet supported. That doesn't support the current usage of f16c functions. One of the problems being that it does not have any specific link flag for f16c and the header file used does not exist. Future work, I'll see whether there are analogous features. Updated the message.
To use the feature we need not just processor, but also compiler support.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaah I see, that makes sense. Thanks for elaborating

# For distributed training with fp16, this helps even if training on GPUs
# If left empty, checks CPU support and turns it on.
# For cross compilation, please check support for F16C on target device and turn off if necessary.
USE_F16C =
Copy link
Member

@anirudh2290 anirudh2290 May 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we adding similar logic like CMake for F16C support to Makefile ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is in mshadow's makefile. I didn't add it here because it is always included in all code paths.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should try to do most work (including whether hardware has FP16 support, and compiler has Fp16 support) in MXNet itself and pass the flag to Mshadow, so that we reduce the dependency on mshadow. This way MXNet users won't be impacted if something in MShadow breaks. WDYT ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is hard to pull off, considering it's a compile flag, and overcomplicates our build-process even further. As @hen pointed out on dev@, we should rather find a more permanent solution rather than trying to add even more hacks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rahul003 @marcoabreu I am okay with leaving this for now if its too much effort. Having said that, I think this stop-gap solution is better than what we have today if it means less pain for the user until we move to the more permanent solution. Considering that permanent solution (or atleast one of them) would be to move all the dmlc projects under apache (so that MXNet has better control on the repos), I don't see it happening anytime soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the logic of identifying the flag here in MXNet. MShadow's role now is to use that flag if defined and set the compiler flag. This means we have control over when to turn on or off the flag in MXNet

COMMAND grep F16C
OUTPUT_VARIABLE CPU_SUPPORT_F16C)
endif()
if(NOT CPU_SUPPORT_F16C)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an unitiliazed variable automatically assumed as FALSE?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never set SUPPORT_MF16C explicitely to FALSE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right

@anirudh2290 anirudh2290 merged commit 1822dac into apache:master May 2, 2018
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 7, 2018
…make (apache#10760)

* add comments to makefile config

* add f16c check in mxnet

* update cmake

* clarify

* small updates

* add message

* add message

* update msvc message

* update mshadow

* typo

* only print message for MSVC if USE_F16C

* improve build logic

* update mshadow

* remove def from amalgamation makefile

* trigger CI
@rahul003 rahul003 deleted the f16c-build branch May 8, 2018 21:15
rahul003 added a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…make (apache#10760)

* add comments to makefile config

* add f16c check in mxnet

* update cmake

* clarify

* small updates

* add message

* add message

* update msvc message

* update mshadow

* typo

* only print message for MSVC if USE_F16C

* improve build logic

* update mshadow

* remove def from amalgamation makefile

* trigger CI
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…make (apache#10760)

* add comments to makefile config

* add f16c check in mxnet

* update cmake

* clarify

* small updates

* add message

* add message

* update msvc message

* update mshadow

* typo

* only print message for MSVC if USE_F16C

* improve build logic

* update mshadow

* remove def from amalgamation makefile

* trigger CI
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants