Make EESSI-extend support accelerator installations #27
base: main
Strict installation path checking is enforced by EESSI for EESSI and site
installations involving accelerators. In these cases, if you wish to create an
accelerator installation you must set the environment variable
EESSI_ACCELERATOR_INSTALL (and load/reload this module).
This new environment variable affects the build scripts: it must be set whenever we expect to do an accelerator installation.
@@ -29,8 +29,8 @@ fi
if [ ! -z ${EESSI_SOFTWARE_SUBDIR_OVERRIDE} ]; then
    INPUT="export EESSI_SOFTWARE_SUBDIR_OVERRIDE=${EESSI_SOFTWARE_SUBDIR_OVERRIDE}; ${INPUT}"
fi
if [ ! -z ${EESSI_ACCELERATOR_TARGET} ]; then
    INPUT="export EESSI_ACCELERATOR_TARGET=${EESSI_ACCELERATOR_TARGET}; ${INPUT}"
if [ ! -z ${EESSI_ACCELERATOR_TARGET_OVERRIDE} ]; then
@trz42 This is why I was asking where these environment variables get set: this should be using the override mechanism.
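For reference, a minimal sketch of the forwarding pattern from this hunk, switched to the override variable. The helper name `forward_var` and the placeholder command stored in `INPUT` are assumptions for illustration and are not part of the build scripts; the sketch also assumes the forwarded values contain no single quotes.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the build scripts: prepend an export
# statement to INPUT when the given variable has a non-empty value.
INPUT="./run_build.sh"   # placeholder command, assumed for illustration

forward_var() {
    name="$1"
    value="$2"
    if [ -n "$value" ]; then
        # Single quotes keep the value intact when INPUT is re-parsed later;
        # this assumes the value itself contains no single quote.
        INPUT="export ${name}='${value}'; ${INPUT}"
    fi
}

# Example values (made up): only the accelerator override is set.
EESSI_SOFTWARE_SUBDIR_OVERRIDE=""
EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/nvidia/cc80"

forward_var EESSI_SOFTWARE_SUBDIR_OVERRIDE "$EESSI_SOFTWARE_SUBDIR_OVERRIDE"
forward_var EESSI_ACCELERATOR_TARGET_OVERRIDE "$EESSI_ACCELERATOR_TARGET_OVERRIDE"
echo "$INPUT"
```

Quoting the test operand and the exported value avoids the word-splitting pitfalls of the unquoted `[ ! -z ${VAR} ]` form.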
module load EESSI-extend/${{matrix.EESSI_VERSION}}-easybuild
check_env_var "EASYBUILD_INSTALLPATH" "$EESSI_SOFTWARE_PATH" # installation path should be the same unless we ask for an explicit GPU installation
check_env_var "EASYBUILD_CUDA_COMPUTE_CAPABILITIES" "$STORED_CUDA_CC"
export EESSI_ACCELERATOR_INSTALL=1
It is this variable (EESSI_ACCELERATOR_INSTALL) and EESSI_ACCELERATOR_TARGET_OVERRIDE that need to be set by the bot in order to configure EESSI-extend correctly for a GPU installation.
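To make the test step above easier to follow, here is a stand-in for `check_env_var`. The two-argument call shape (name, expected value) matches the workflow lines in this hunk, but the body below is an assumption, not the workflow's actual helper, and the exported values are made up.

```shell
#!/bin/sh
# Stand-in for the workflow's check_env_var helper; the body is an assumption.
check_env_var() {
    var_name="$1"
    expected="$2"
    # printenv only sees exported variables, which is what a loaded module sets.
    actual="$(printenv "$var_name")"
    if [ "$actual" != "$expected" ]; then
        echo "Error: ${var_name} is '${actual}', expected '${expected}'" >&2
        return 1
    fi
    echo "${var_name} is as expected: '${actual}'"
}

# Simulated state the bot would need to provide for a GPU installation
# (example values only).
export EESSI_ACCELERATOR_INSTALL=1
export EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/nvidia/cc80"

check_env_var "EESSI_ACCELERATOR_INSTALL" "1"
check_env_var "EESSI_ACCELERATOR_TARGET_OVERRIDE" "accel/nvidia/cc80"
```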
if (eessi_accelerator_target ~= nil) then
    cuda_compute_capability = string.match(eessi_accelerator_target, "^nvidia/cc([0-9][0-9])$")
    if (cuda_compute_capability ~= nil) then
        easybuild_installpath = pathJoin(easybuild_installpath, 'accel', eessi_accelerator_target)
This was actually wrong: archdetect returns paths like accel/nvidia/cc80 (see https://github.com/EESSI/software-layer-scripts/blob/main/tests/archdetect/nvidia-smi/1xa100.output), but we were consistent in our error (the build script incorrectly set EESSI_ACCELERATOR_TARGET rather than EESSI_ACCELERATOR_TARGET_OVERRIDE, which would have affected the behaviour of archdetect).
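The mismatch is easy to see in a small sketch. The helpers below are hypothetical (their names and the base path are made up). They accept both the accel/nvidia/cc80 form returned by archdetect and the bare nvidia/cc80 form, and produce an install path with exactly one accel component. Rendering the two captured digits as a dotted version is also an assumption about how the compute capability is ultimately passed on.

```shell
#!/bin/sh
# Hypothetical helpers; names and the dotted-version convention are assumptions.

# Strip an optional leading "accel/" so both spellings become "nvidia/cc80".
normalise_accel_target() {
    echo "${1#accel/}"
}

# Print "8.0" for "nvidia/cc80"; fail for anything that is not nvidia/ccNN.
cuda_cc_from_target() {
    target="$(normalise_accel_target "$1")"
    case "$target" in
        nvidia/cc[0-9][0-9]) ;;
        *) return 1 ;;
    esac
    digits="${target#nvidia/cc}"
    # First digit is the major version, second digit the minor version.
    echo "${digits%?}.${digits#?}"
}

# Join the base install path with exactly one "accel" component.
accel_install_path() {
    echo "$1/accel/$(normalise_accel_target "$2")"
}

cuda_cc_from_target "accel/nvidia/cc80"
accel_install_path "/opt/example/software" "accel/nvidia/cc80"
```

Without the normalisation step, joining 'accel' with a target that already starts with accel/ would yield a doubled accel/accel/ component.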
export EESSI_ACCELERATOR_TARGET=$(cfg_get_value "architecture" "accelerator")
echo "bot/build.sh: EESSI_ACCELERATOR_TARGET='${EESSI_ACCELERATOR_TARGET}'"
export EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/$(cfg_get_value architecture accelerator)"
echo "bot/build.sh: EESSI_ACCELERATOR_TARGET_OVERRIDE='${EESSI_ACCELERATOR_TARGET_OVERRIDE}'"
if [[ -n "$EESSI_ACCELERATOR_TARGET_OVERRIDE" && -z "$EESSI_ACCELERATOR_TARGET" ]]; then
    fatal_error "EESSI module should've set EESSI_ACCELERATOR_TARGET when EESSI_ACCELERATOR_TARGET_OVERRIDE exported." >&2
elif [[ -n "$EESSI_ACCELERATOR_TARGET_OVERRIDE" ]]; then
    export EESSI_ACCELERATOR_INSTALL=1
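The check in this hunk can be exercised in isolation with a sketch like the one below. The function name `check_accel_env` is hypothetical, and it returns a non-zero status instead of calling `fatal_error`, which is assumed here to abort the script.

```shell
#!/bin/sh
# Hypothetical wrapper around the consistency check; it reports failure via
# its return status rather than aborting, so it can be tested on its own.
check_accel_env() {
    if [ -n "${EESSI_ACCELERATOR_TARGET_OVERRIDE:-}" ] && [ -z "${EESSI_ACCELERATOR_TARGET:-}" ]; then
        # Override requested, but the EESSI module did not pick it up.
        echo "EESSI_ACCELERATOR_TARGET is unset although EESSI_ACCELERATOR_TARGET_OVERRIDE is set" >&2
        return 1
    elif [ -n "${EESSI_ACCELERATOR_TARGET_OVERRIDE:-}" ]; then
        # Consistent state: opt in to the accelerator installation path.
        export EESSI_ACCELERATOR_INSTALL=1
    fi
    return 0
}

# Example state (made-up values): override requested and honoured.
EESSI_ACCELERATOR_TARGET_OVERRIDE="accel/nvidia/cc80"
EESSI_ACCELERATOR_TARGET="accel/nvidia/cc80"
check_accel_env && echo "EESSI_ACCELERATOR_INSTALL='${EESSI_ACCELERATOR_INSTALL:-}'"
```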
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
…scripts into eessi-extend-cuda
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
Co-authored-by: Bob Dröge <b.e.droge@rug.nl>
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software instance:eessi-bot-mc-aws arch:x86_64/amd/zen3
No description provided.