@msimberg

This is the test recipe I've been using to have a newer NCCL available for testing with icon, and since it required a bit of ugliness I thought I'd open this as a reference for when the spack 1.0 transition comes officially for c2sm. I'm not looking to have this merged.

In terms of packages, this contains:

  • nvhpc 25.7 (nvhpc 25.9 would contain CUDA 12.9 and 13.0; some testing would be required to make sure that 12.9 is used, or that 13.0 works)
  • gcc 13 (GHEX, dependency of icon4py, currently does not build with gcc 14)
  • nccl 2.28, aws-ofi-nccl 1.17, libfabric 2.3 (these are all required for decent nccl performance; libfabric 2.3 from upstream may also slightly improve performance for MPI or improve stability)

Ugliness:

  • I needed to add the concretizer:duplicates:strategy:full option to get the recipe to concretize (the default is minimal; see the PR adding support to stackinator for more details: eth-cscs/stackinator#269, "Concretizer duplicates strategy"); I hope this is a bug that might be resolved in newer spack versions, but it's not clear to me yet if that's the case
  • The above option means that a lot of packages concretize with nvhpc as the compiler; I had to force many packages to use gcc explicitly
  • cdo did not build and since I didn't need it I just disabled it for now
  • CC, CXX, and FC are set to nonexistent paths; for now, unset them and use MPICC et al.
  • There's no libnvToolsExt.so library in nvhpc 25.7 (build scripts set -lnvToolsExt currently); I think it's not needed but have not tested any profiling
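
The CC, CXX, and FC workaround above could be sketched roughly as follows. This is only an illustration, not something taken from the recipe: the helper name sanitize_compiler_vars is hypothetical, and the check (treating a variable as broken when `command -v` cannot resolve its value) is an assumption about what "nonexistent paths" means in practice.

```shell
# Hypothetical helper: unset CC, CXX and FC when their value does not
# resolve to an existing executable. Variables that are unset, or that
# point at something usable, are left alone.
sanitize_compiler_vars() {
    for _var in CC CXX FC; do
        # Read the value of the variable whose name is stored in _var.
        eval "_path=\${$_var:-}"
        if [ -n "$_path" ] && ! command -v "$_path" >/dev/null 2>&1; then
            echo "unsetting $_var (not found: $_path)" >&2
            unset "$_var"
        fi
    done
    unset _var _path
}

sanitize_compiler_vars
```

After this has run in the build shell, the build can be pointed at the MPI compiler wrappers (MPICC et al.) instead of relying on CC, CXX, and FC.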

The changes required to the custom packages in the repo directory were minimal. I used upstream spack-packages instead of the spack-c2sm package repo.

In case anyone wants to use the environment, I have a squashfs file at /capstor/store/cscs/cscs/csstaff/simbergm/icon-spack-1.0-test.squashfs.
