Patches for Mac (including ARM NEON 128 bits) #425
Merged
valassi merged 31 commits into madgraph5:master on Apr 9, 2022
(cherry picked from commit a497e1a)
… of flags (eg on Darwin)
…sctl -n hw.logicalcpu`)
…revent errors from missing readelf)
…library, this removes the need to use rpaths!
…ac after using absolute LIBDIR pathas madgraph5#375
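One of the commits above mentions `hw.logicalcpu` in the context of Darwin. As a minimal sketch of what a portable CPU-count lookup can look like (the function name `ncpus` and the fallback order are assumptions for illustration, not the code in this patch):

```shell
# Hypothetical helper (not taken from the patch): print the number of logical CPUs.
ncpus() {
  case "$(uname -s)" in
    Darwin)
      # macOS has neither nproc nor /proc/cpuinfo; query the kernel instead
      sysctl -n hw.logicalcpu
      ;;
    *)
      # Linux: nproc comes with GNU coreutils; fall back to getconf if it is absent
      nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
      ;;
  esac
}

echo "logical CPUs: $(ncpus)"
```

The result could then be used, for example, as `make -j"$(ncpus)"`.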
This was referenced Apr 8, 2022
This is still WIP. Amongst the things that are missing:
…issues: in particular fcheck.exe crashes
…eck.exe, not to fcheck.exe!
…r message from fcheck!
…sing MacOS SIP (SIP would drop DYLD_LIBRARY_PATH inside the script if this is set outside the script). Example: ./tput/teeThroughputX.sh -ggtt -dlp $(cd $(dirname $(gfortran --print-file-name libgfortran.dylib)); pwd). This now does proceed one step further: fcheck.exe does succeed!
…issue (python is not found, should use python3). This is: ./tput/teeThroughputX.sh -ggtt -dlp $(cd $(dirname $(gfortran --print-file-name libgfortran.dylib)); pwd)
… almost all ok. Still to do: improve handling of missing avx2, 512y, 512z builds
…o not check /proc/cpuinfo on Mac)
…ths (now rpath on linux and full paths on mac)
…2x/double and ~4x/float speedup from NEON SIMD
…t manual), regenerate ggtt auto
(NB: the summaryTable.sh script was executed on Linux: it fails on Mac?...)
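The commits above describe passing the library directory to the script through a `-dlp` option, because SIP drops `DYLD_LIBRARY_PATH` when it is set outside the script. A minimal sketch of that idea, assuming a hypothetical function `parse_dlp` (the actual option handling in `teeThroughputX.sh` may well differ):

```shell
# Hypothetical sketch (not the actual teeThroughputX.sh code):
# accept "-dlp <dir>" and export DYLD_LIBRARY_PATH from inside the script,
# so that the value is set after SIP has already sanitised the environment.
parse_dlp() {
  dlp=""
  while [ $# -gt 0 ]; do
    case "$1" in
      -dlp)
        if [ $# -lt 2 ]; then
          echo "ERROR: -dlp requires a directory argument" >&2
          return 1
        fi
        dlp="$2"
        shift 2
        ;;
      *)
        # Any other option (e.g. -ggtt) is ignored in this sketch
        shift
        ;;
    esac
  done
  if [ -n "$dlp" ]; then
    # Prepend the directory, keeping any value that is already set
    export DYLD_LIBRARY_PATH="$dlp${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
  fi
}
```

Inside a script this would be called as `parse_dlp "$@"` before launching any executable that needs `libgfortran.dylib`.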
This is now complete (no longer WIP). Amongst the things that were added:
This is a summary table from the tests of all five processes, in double and float precision, on Mac M1 ARMv8 with and without NEON SIMD. One nicely sees speedup factors close to 2x for double and 4x for float.
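The 2x and 4x factors match the number of values that fit in one 128-bit NEON register. As a back-of-the-envelope check (the helper `simd_lanes` is purely illustrative and is not part of the repository):

```shell
# Illustrative arithmetic only: lanes per SIMD register =
# register width in bits / scalar width in bits.
simd_lanes() {
  echo $(( $1 / $2 ))
}

echo "NEON lanes for double (64-bit): $(simd_lanes 128 64)"
echo "NEON lanes for float  (32-bit): $(simd_lanes 128 32)"
```

This gives 2 lanes for double and 4 lanes for float, which is the upper bound on the speedup from vectorisation alone.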
All checks have passed. I am self-merging. @roiser, I will need to merge this into your alphas patch, I hope this does not cause issues. (I actually started from there: I saw that the alphas patch includes the Mac NEON patch, so I wanted to check it and merge it standalone first.)
valassi added a commit to valassi/madgraph4gpu that referenced this pull request on Apr 21, 2022
… which I will now merge Revert "enable ARM NEON (128 bit) vector registers via compiler defined macros" This reverts commit a497e1a.
This is a WIP comprehensive patch for Mac-specific issues (#375).
It also includes, modifies and supersedes Stefan's PR #421. For more detailed comments on why this was necessary, see #221.