enable ARM NEON (128 bit) vector registers via compiler defined macros #421

Closed
roiser wants to merge 1 commit into madgraph5:master from roiser:arm_neon

Conversation

@roiser
Member

roiser commented Apr 4, 2022

enable ARM NEON

@valassi
Member

valassi commented Apr 8, 2022

Hi @roiser I added several comments about this in #221.

Bottom line:

  • it currently offers no way to disable SIMD (i.e. the 'none' option)
  • it addresses the same issues as 128-bit VSX on Power9, but in a different way
  • for these reasons I would modify this to use the same approach as on Power9 (i.e. add -D__SSE4_2__ if you want 128-bit NEON vectors, and add nothing if you want to keep no SIMD)
  • later on this should probably be changed to use a define like -Dmg_prefer_vector_width128 instead of -D__SSE4_2__, and use that on all of Intel, Power9 and ARM
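As a rough illustration of the opt-in approach described in the bullets above, here is a minimal C++ sketch. It is not the code from this PR or from #425. It assumes the build passes `-D__SSE4_2__` explicitly to request 128-bit vectors and passes nothing to get the scalar 'none' build. The names `kVectorBits`, `kDoublesPerVector`, `vec_t` and `sumAll` are hypothetical, and the vector type relies on the GCC/Clang `vector_size` extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: 128-bit SIMD is an explicit opt-in via -D__SSE4_2__ on the
// command line (the Power9-style approach), not something inferred from a
// macro that the compiler always predefines on the target.
#if defined(__SSE4_2__)
constexpr std::size_t kVectorBits = 128;               // 128-bit registers
typedef double vec_t __attribute__((vector_size(16))); // GCC/Clang extension
#else
constexpr std::size_t kVectorBits = 0;                 // 'none': no SIMD
typedef double vec_t;                                  // plain scalar
#endif

// Doubles handled per step: 2 with 128-bit vectors, 1 in the scalar build.
constexpr std::size_t kDoublesPerVector =
    kVectorBits == 0 ? 1 : kVectorBits / (8 * sizeof(double));
static_assert(sizeof(vec_t) == kDoublesPerVector * sizeof(double),
              "vec_t must hold exactly kDoublesPerVector doubles");

// Sum n doubles, kDoublesPerVector at a time (n must be a multiple of it).
inline double sumAll(const double* data, std::size_t n)
{
  assert(n % kDoublesPerVector == 0);
  vec_t acc = {}; // zero in every lane (or a scalar zero)
  for (std::size_t i = 0; i < n; i += kDoublesPerVector)
  {
    vec_t v;
    std::memcpy(&v, data + i, sizeof(v)); // unaligned-safe load
    acc += v;                             // lane-wise add, or scalar add
  }
  double lanes[kDoublesPerVector];
  std::memcpy(lanes, &acc, sizeof(acc));
  double total = 0;
  for (std::size_t k = 0; k < kDoublesPerVector; ++k) total += lanes[k];
  return total;
}
```

Because the width is chosen by a macro the build adds itself, the same source gives a scalar build when the flag is omitted. A check keyed on a macro the compiler predefines on the target (for instance `__ARM_NEON` under ACLE; which macro this PR tested is not shown here) would leave no such off switch.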

@valassi
Member

valassi commented Apr 8, 2022

Hi @roiser I am closing this one and replacing it with PR #425. Please have a look (it is still WIP though). Thanks!

valassi closed this Apr 8, 2022
@valassi
Member

valassi commented Apr 9, 2022

I have now merged #425 which includes and replaces this #421.

@roiser
Member Author

roiser commented Apr 11, 2022

  • later on this should probably be changed to use a define like -Dmg_prefer_vector_width128 instead of -D__SSE4_2__, and use that on all of Intel, Power9 and ARM

On this point, yes, I think we should move away from Intel-specific names and use generic ones. All very fine with me, thanks.

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 21, 2022
This includes the Mac-specific PR madgraph5#421 (including a patch for NEON)
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
This includes the Mac-specific PR madgraph5/madgraph4gpu#421 (including a patch for NEON)
2 participants