Add option to enable "sve" optimization level on armv9 #830

Open
wants to merge 4 commits into
base: branch-25.06

@LizYou LizYou commented Apr 21, 2025

No description provided.

@LizYou LizYou requested a review from a team as a code owner April 21, 2025 07:02

copy-pr-bot bot commented Apr 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Lizhen You <lyou@nvidia.com>

# Check for SVE support
message("Checking for SVE support")
SET(CMAKE_REQUIRED_FLAGS "-march=armv9-a+sve")
Contributor

@bdice bdice Apr 22, 2025

Please use lowercase for the CMake function names, like set(...), if(...), else(), mark_as_advanced(...).
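For illustration, a lowercase version of the SVE probe could look roughly like the sketch below. It is not the code from this PR: `CXX_HAS_SVE` and `CXX_SVE_FLAGS` are hypothetical names, the probe source is an assumption, and a compile-only check is used here where the actual module may run the test program instead.

```cmake
# Sketch only: lowercase rewrite of the SVE probe under review.
# CXX_HAS_SVE and CXX_SVE_FLAGS are hypothetical names, not taken from the PR.
include(CheckCXXSourceCompiles)
include(CMakePushCheckState)

message(STATUS "Checking for SVE support")

# Save the CMAKE_REQUIRED_* variables so the probe flag does not leak into later checks.
cmake_push_check_state(RESET)
set(CMAKE_REQUIRED_FLAGS "-march=armv9-a+sve")

# Compile-and-link probe: succeeds only if <arm_sve.h> and an SVE intrinsic are usable.
check_cxx_source_compiles("
  #include <arm_sve.h>
  int main() { return svcntw() > 0 ? 0 : 1; }
" CXX_HAS_SVE)

cmake_pop_check_state()

if(CXX_HAS_SVE)
  set(CXX_SVE_FLAGS "-march=armv9-a+sve" CACHE STRING "C++ SVE flags")
  mark_as_advanced(CXX_SVE_FLAGS)
else()
  message(STATUS "SVE not usable with this compiler; skipping SVE flags")
endif()
```

A compile check only confirms that the toolchain accepts the flag. Swapping in `check_cxx_source_runs` would also require the build host CPU to execute SVE instructions, which matters when cross-compiling.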

Author

Got it. Will update

Author

I just found that FindAVX.cmake uses uppercase as well. Would it be reasonable to keep uppercase in FindSVE.cmake for code consistency?

In FindAVX.cmake:

MACRO(CHECK_SSE lang type flags)
  SET(__FLAG_I 1)
  SET(CMAKE_REQUIRED_FLAGS_SAVE ${CMAKE_REQUIRED_FLAGS})
  FOREACH(__FLAG ${flags})
    IF(NOT ${lang}_${type}_FOUND)
      SET(CMAKE_REQUIRED_FLAGS ${__FLAG})
      CHECK_CXX_SOURCE_RUNS("${${type}_CODE}" ${lang}_HAS_${type}_${__FLAG_I})
      IF(${lang}_HAS_${type}_${__FLAG_I})
        SET(${lang}_${type}_FOUND TRUE CACHE BOOL "${lang} ${type} support")
        SET(${lang}_${type}_FLAGS "${__FLAG}" CACHE STRING "${lang} ${type} flags")
      ENDIF()
      MATH(EXPR __FLAG_I "${__FLAG_I}+1")
    ENDIF()
  ENDFOREACH()

Contributor

@bdice bdice Apr 23, 2025

Let's prefer "good CMake style" over consistency here, and go lower-case. Maybe as a follow-up we can fix FindAVX.cmake.
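As a rough idea of what that follow-up could look like, here is the quoted excerpt with lowercase command names. CMake command names are case-insensitive, so behavior should be unchanged. The underscores in the variable names, the closing lines that restore `CMAKE_REQUIRED_FLAGS` and end the macro, and the requirement that the caller defines `${type}_CODE` are assumptions, since the quoted excerpt stops at the end of the loop.

```cmake
include(CheckCXXSourceRuns)

# Sketch only: lowercase form of the CHECK_SSE excerpt quoted earlier in this thread.
# Assumption: the caller defines ${type}_CODE (for example AVX2_CODE) beforehand.
macro(check_sse lang type flags)
  set(__FLAG_I 1)
  set(CMAKE_REQUIRED_FLAGS_SAVE ${CMAKE_REQUIRED_FLAGS})
  foreach(__FLAG ${flags})
    # Stop probing once one candidate flag has been accepted.
    if(NOT ${lang}_${type}_FOUND)
      set(CMAKE_REQUIRED_FLAGS ${__FLAG})
      check_cxx_source_runs("${${type}_CODE}" ${lang}_HAS_${type}_${__FLAG_I})
      if(${lang}_HAS_${type}_${__FLAG_I})
        set(${lang}_${type}_FOUND TRUE CACHE BOOL "${lang} ${type} support")
        set(${lang}_${type}_FLAGS "${__FLAG}" CACHE STRING "${lang} ${type} flags")
      endif()
      math(EXPR __FLAG_I "${__FLAG_I}+1")
    endif()
  endforeach()
  # Assumed cleanup, not part of the quoted excerpt: restore the flags saved on entry.
  set(CMAKE_REQUIRED_FLAGS ${CMAKE_REQUIRED_FLAGS_SAVE})
endmacro()
```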

Author

Updated. Please double-check.

Author

Does the latest code look good to you? The "sve" optimization flag significantly improves performance on AArch64 machines, by around 17% in my run. I hope we can merge this flag soon.

Member

@LizYou please see my comment below about the style checker (git pre-commit hooks). This PR cannot be merged while the style checker is not passing.

@cjnolet cjnolet added the "improvement" (Improves an existing functionality) and "non-breaking" (Introduces a non-breaking change) labels Apr 22, 2025
Member

cjnolet commented Apr 22, 2025

/ok to test 0d939b1

Member

cjnolet commented Apr 23, 2025

@LizYou it looks like we have some style checker errors. We use Git pre-commit hooks to automatically format code changes so they pass the style checker upon commit. Please see the following section of our contributor guide for information on how to configure pre-commit hooks.

Member

cjnolet commented Apr 24, 2025

/ok to test f4d80ce

Labels: CMake, cpp, improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change)
Projects
Status: In Progress

3 participants