Collaborator

bedroge commented Nov 10, 2025

For PyTorch we need the following:

27 out of 131 required modules missing:

* parameterized/0.9.0-GCCcore-13.3.0 (parameterized-0.9.0-GCCcore-13.3.0.eb)
* optree/0.14.1-GCCcore-13.3.0 (optree-0.14.1-GCCcore-13.3.0.eb)
* tlparse/0.3.37-GCCcore-13.3.0 (tlparse-0.3.37-GCCcore-13.3.0.eb)
* lxml/5.3.0-GCCcore-13.3.0 (lxml-5.3.0-GCCcore-13.3.0.eb)
* unittest-xml-reporting/3.1.0-GCCcore-13.3.0 (unittest-xml-reporting-3.1.0-GCCcore-13.3.0.eb)
* pytest-rerunfailures/15.0-GCCcore-13.3.0 (pytest-rerunfailures-15.0-GCCcore-13.3.0.eb)
* pytest-shard/0.1.2-GCCcore-13.3.0 (pytest-shard-0.1.2-GCCcore-13.3.0.eb)
* pytest-subtests/0.13.1-GCCcore-13.3.0 (pytest-subtests-0.13.1-GCCcore-13.3.0.eb)
* pytest-flakefinder/1.1.0-GCCcore-13.3.0 (pytest-flakefinder-1.1.0-GCCcore-13.3.0.eb)
* libyaml/0.2.5-GCCcore-13.3.0 (libyaml-0.2.5-GCCcore-13.3.0.eb)
* METIS/5.1.0-GCCcore-13.3.0 (METIS-5.1.0-GCCcore-13.3.0.eb)
* PyYAML/6.0.2-GCCcore-13.3.0 (PyYAML-6.0.2-GCCcore-13.3.0.eb)
* CoinUtils/2.11.12-GCC-13.3.0 (CoinUtils-2.11.12-GCC-13.3.0.eb)
* expecttest/0.2.1-GCCcore-13.3.0 (expecttest-0.2.1-GCCcore-13.3.0.eb)
* Osi/0.108.11-GCC-13.3.0 (Osi-0.108.11-GCC-13.3.0.eb)
* networkx/3.4.2-gfbf-2024a (networkx-3.4.2-gfbf-2024a.eb)
* Z3/4.13.0-GCCcore-13.3.0 (Z3-4.13.0-GCCcore-13.3.0.eb)
* MPC/1.3.1-GCCcore-13.3.0 (MPC-1.3.1-GCCcore-13.3.0.eb)
* gmpy2/2.2.0-GCCcore-13.3.0 (gmpy2-2.2.0-GCCcore-13.3.0.eb)
* SCOTCH/7.0.6-gompi-2024a (SCOTCH-7.0.6-gompi-2024a.eb)
* sympy/1.13.3-gfbf-2024a (sympy-1.13.3-gfbf-2024a.eb)
* MUMPS/5.7.2-foss-2024a-metis (MUMPS-5.7.2-foss-2024a-metis.eb)
* Clp/1.17.10-foss-2024a (Clp-1.17.10-foss-2024a.eb)
* Cgl/0.60.8-foss-2024a (Cgl-0.60.8-foss-2024a.eb)
* Cbc/2.10.12-foss-2024a (Cbc-2.10.12-foss-2024a.eb)
* PuLP/2.8.0-foss-2024a (PuLP-2.8.0-foss-2024a.eb)
* PyTorch/2.6.0-foss-2024a (PyTorch-2.6.0-foss-2024a.eb)

As I'm expecting issues with PyTorch itself, let's build its missing dependencies first 😆
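
To build the dependencies before PyTorch itself, the listing above can be split into the dependency easyconfigs and the final target. The following is a minimal sketch, assuming the listing keeps the `* module (easyconfig.eb)` format shown above; the helper names `parse_missing` and `dependencies_only` are hypothetical, and the sample contains only four of the 27 entries.

```python
import re

# Four sample lines copied from the listing above (the full list has 27 entries).
MISSING_OUTPUT = """\
* parameterized/0.9.0-GCCcore-13.3.0 (parameterized-0.9.0-GCCcore-13.3.0.eb)
* Cbc/2.10.12-foss-2024a (Cbc-2.10.12-foss-2024a.eb)
* PuLP/2.8.0-foss-2024a (PuLP-2.8.0-foss-2024a.eb)
* PyTorch/2.6.0-foss-2024a (PyTorch-2.6.0-foss-2024a.eb)
"""

# One entry per line: "* <module> (<easyconfig>.eb)"
LINE_RE = re.compile(r"^\* (?P<module>\S+) \((?P<easyconfig>\S+\.eb)\)$")


def parse_missing(text):
    """Return (module, easyconfig) pairs from a 'modules missing' listing."""
    pairs = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            pairs.append((match["module"], match["easyconfig"]))
    return pairs


def dependencies_only(pairs, final_name="PyTorch"):
    """Drop the final target so that only its dependencies remain."""
    return [pair for pair in pairs if pair[0].split("/")[0] != final_name]


deps = dependencies_only(parse_missing(MISSING_OUTPUT))
for module, easyconfig in deps:
    print(easyconfig)
```

The printed easyconfig names could then be added to the build list for a first pass, with the PyTorch easyconfig added in a later one.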

bedroge added the 2025.06-software.eessi.io label Nov 10, 2025

Collaborator Author

bedroge commented Nov 10, 2025

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace
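
The two commands above follow a `key:value` layout (`repo`, `instance`, `for`). As an illustration only, a line of that shape can be split into its parts as sketched below; `parse_bot_command` is a hypothetical helper and is not taken from the bot's own code.

```python
def parse_bot_command(line):
    """Split a 'bot: build key:value ...' line into a dict of its arguments."""
    prefix = "bot: build"
    if not line.startswith(prefix):
        raise ValueError(f"not a build command: {line!r}")
    args = {}
    for token in line[len(prefix):].split():
        # Split on the first ':' only, so values may contain ':' or '='.
        key, _, value = token.partition(":")
        args[key] = value
    return args


cmd = parse_bot_command(
    "bot: build repo:eessi.io-2025.06-software "
    "instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace"
)
print(cmd["instance"], cmd["for"])
```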

eessi-bot-aws bot commented Nov 10, 2025

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2025.11/pr_1294/103214

| date | job status | comment |
| --- | --- | --- |
| Nov 10 08:49:57 UTC 2025 | submitted | job id 103214 awaits release by job manager |
| Nov 10 08:50:21 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Nov 10 08:56:23 UTC 2025 | running | job 103214 is running |
| Nov 10 10:23:08 UTC 2025 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-103214.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17627700520.tar.gz
size: 76 MiB (80661682 bytes)
entries: 6073
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
CoinUtils/2.11.12-GCC-13.3.0.lua
expecttest/0.2.1-GCCcore-13.3.0.lua
gmpy2/2.2.0-GCCcore-13.3.0.lua
libyaml/0.2.5-GCCcore-13.3.0.lua
lxml/5.3.0-GCCcore-13.3.0.lua
METIS/5.1.0-GCCcore-13.3.0.lua
MPC/1.3.1-GCCcore-13.3.0.lua
MUMPS/5.7.2-foss-2024a-metis.lua
networkx/3.4.2-gfbf-2024a.lua
optree/0.14.1-GCCcore-13.3.0.lua
Osi/0.108.11-GCC-13.3.0.lua
parameterized/0.9.0-GCCcore-13.3.0.lua
pytest-flakefinder/1.1.0-GCCcore-13.3.0.lua
pytest-rerunfailures/15.0-GCCcore-13.3.0.lua
pytest-shard/0.1.2-GCCcore-13.3.0.lua
pytest-subtests/0.13.1-GCCcore-13.3.0.lua
PyYAML/6.0.2-GCCcore-13.3.0.lua
SCOTCH/7.0.6-gompi-2024a.lua
sympy/1.13.3-gfbf-2024a.lua
tlparse/0.3.37-GCCcore-13.3.0.lua
unittest-xml-reporting/3.1.0-GCCcore-13.3.0.lua
Z3/4.13.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/x86_64/amd/zen2/software
CoinUtils/2.11.12-GCC-13.3.0
expecttest/0.2.1-GCCcore-13.3.0
gmpy2/2.2.0-GCCcore-13.3.0
libyaml/0.2.5-GCCcore-13.3.0
lxml/5.3.0-GCCcore-13.3.0
METIS/5.1.0-GCCcore-13.3.0
MPC/1.3.1-GCCcore-13.3.0
MUMPS/5.7.2-foss-2024a-metis
networkx/3.4.2-gfbf-2024a
optree/0.14.1-GCCcore-13.3.0
Osi/0.108.11-GCC-13.3.0
parameterized/0.9.0-GCCcore-13.3.0
pytest-flakefinder/1.1.0-GCCcore-13.3.0
pytest-rerunfailures/15.0-GCCcore-13.3.0
pytest-shard/0.1.2-GCCcore-13.3.0
pytest-subtests/0.13.1-GCCcore-13.3.0
PyYAML/6.0.2-GCCcore-13.3.0
SCOTCH/7.0.6-gompi-2024a
sympy/1.13.3-gfbf-2024a
tlparse/0.3.37-GCCcore-13.3.0
unittest-xml-reporting/3.1.0-GCCcore-13.3.0
Z3/4.13.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
CoinUtils/2.11.12-GCC-13.3.0/20251110_100116UTC
expecttest/0.2.1-GCCcore-13.3.0/20251110_090628UTC
gmpy2/2.2.0-GCCcore-13.3.0/20251110_091944UTC
libyaml/0.2.5-GCCcore-13.3.0/20251110_090536UTC
lxml/5.3.0-GCCcore-13.3.0/20251110_090315UTC
METIS/5.1.0-GCCcore-13.3.0/20251110_095936UTC
MPC/1.3.1-GCCcore-13.3.0/20251110_091913UTC
MUMPS/5.7.2-foss-2024a-metis/20251110_101543UTC
networkx/3.4.2-gfbf-2024a/20251110_090657UTC
optree/0.14.1-GCCcore-13.3.0/20251110_085824UTC
Osi/0.108.11-GCC-13.3.0/20251110_101705UTC
parameterized/0.9.0-GCCcore-13.3.0/20251110_085736UTC
pytest-flakefinder/1.1.0-GCCcore-13.3.0/20251110_090501UTC
pytest-rerunfailures/15.0-GCCcore-13.3.0/20251110_090356UTC
pytest-shard/0.1.2-GCCcore-13.3.0/20251110_090417UTC
pytest-subtests/0.13.1-GCCcore-13.3.0/20251110_090439UTC
PyYAML/6.0.2-GCCcore-13.3.0/20251110_090610UTC
SCOTCH/7.0.6-gompi-2024a/20251110_100620UTC
sympy/1.13.3-gfbf-2024a/20251110_095830UTC
tlparse/0.3.37-GCCcore-13.3.0/20251110_090010UTC
unittest-xml-reporting/3.1.0-GCCcore-13.3.0/20251110_090331UTC
Z3/4.13.0-GCCcore-13.3.0/20251110_091512UTC
other under 2025.06/software/linux/x86_64/amd/zen2
no other files in tarball
Nov 10 10:23:08 UTC 2025 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 4/4 test case(s) from 4 check(s) (4 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-103214.out
❌ found message matching ERROR:
❌ found message matching [\s*FAILED\s*].*Ran .* test case
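
The check marks in this report correspond to patterns that either must or must not appear in the job output. The sketch below shows how such checks could be expressed; it is a hypothetical re-implementation and not the bot's actual code. The ReFrame pattern is printed above without backslashes in front of the square brackets; the sketch assumes the brackets are meant literally and escapes them, since unescaped they would form a character class.

```python
import re

# (pattern, should_match) pairs mirroring the report above.
# Hypothetical table; the bot's real implementation may differ.
CHECKS = [
    (r"FATAL:", False),
    (r"ERROR:", False),
    (r"FAILED:", False),
    (r"required modules missing:", False),
    (r"No missing installations", True),
    (r"\.tar\.gz created!", True),
]

# Assumed intent of the ReFrame summary check: literal "[ FAILED ]".
REFRAME_FAILED_RE = re.compile(r"\[\s*FAILED\s*\].*Ran .* test case")


def run_checks(log_text):
    """Map each pattern to True if the log satisfies that check."""
    results = {}
    for pattern, should_match in CHECKS:
        found = re.search(pattern, log_text) is not None
        results[pattern] = found == should_match
    return results


def reframe_failed(log_text):
    """Return True if the log contains a failing ReFrame summary line."""
    return REFRAME_FAILED_RE.search(log_text) is not None


sample_log = (
    "ERROR: build failed\n"
    "27 out of 131 required modules missing:\n"
    "eessi-example.tar.gz created!\n"
    "[  FAILED  ] Ran 4/4 test case(s) from 4 check(s) (4 failure(s), 0 skipped, 0 aborted)\n"
)
results = run_checks(sample_log)
print(results, reframe_failed(sample_log))
```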

eessi-bot-jsc bot commented Nov 10, 2025

New job on instance eessi-bot-jsc for repository eessi.io-2025.06-software
Building on: nvidia-grace
Building for: aarch64/nvidia/grace
Job dir: /p/project1/ceasybuilders/eessibot/jobs/2025.11/pr_1294/14199038

| date | job status | comment |
| --- | --- | --- |
| Nov 10 08:49:58 UTC 2025 | submitted | job id 14199038 awaits release by job manager |
| Nov 10 08:50:49 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Nov 10 08:51:53 UTC 2025 | running | job 14199038 is running |
| Nov 10 09:57:55 UTC 2025 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-14199038.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2025.06-software-linux-aarch64-nvidia-grace-17627683030.tar.gz
size: 74 MiB (78054644 bytes)
entries: 6073
modules under 2025.06/software/linux/aarch64/nvidia/grace/modules/all
CoinUtils/2.11.12-GCC-13.3.0.lua
expecttest/0.2.1-GCCcore-13.3.0.lua
gmpy2/2.2.0-GCCcore-13.3.0.lua
libyaml/0.2.5-GCCcore-13.3.0.lua
lxml/5.3.0-GCCcore-13.3.0.lua
METIS/5.1.0-GCCcore-13.3.0.lua
MPC/1.3.1-GCCcore-13.3.0.lua
MUMPS/5.7.2-foss-2024a-metis.lua
networkx/3.4.2-gfbf-2024a.lua
optree/0.14.1-GCCcore-13.3.0.lua
Osi/0.108.11-GCC-13.3.0.lua
parameterized/0.9.0-GCCcore-13.3.0.lua
pytest-flakefinder/1.1.0-GCCcore-13.3.0.lua
pytest-rerunfailures/15.0-GCCcore-13.3.0.lua
pytest-shard/0.1.2-GCCcore-13.3.0.lua
pytest-subtests/0.13.1-GCCcore-13.3.0.lua
PyYAML/6.0.2-GCCcore-13.3.0.lua
SCOTCH/7.0.6-gompi-2024a.lua
sympy/1.13.3-gfbf-2024a.lua
tlparse/0.3.37-GCCcore-13.3.0.lua
unittest-xml-reporting/3.1.0-GCCcore-13.3.0.lua
Z3/4.13.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/aarch64/nvidia/grace/software
CoinUtils/2.11.12-GCC-13.3.0
expecttest/0.2.1-GCCcore-13.3.0
gmpy2/2.2.0-GCCcore-13.3.0
libyaml/0.2.5-GCCcore-13.3.0
lxml/5.3.0-GCCcore-13.3.0
METIS/5.1.0-GCCcore-13.3.0
MPC/1.3.1-GCCcore-13.3.0
MUMPS/5.7.2-foss-2024a-metis
networkx/3.4.2-gfbf-2024a
optree/0.14.1-GCCcore-13.3.0
Osi/0.108.11-GCC-13.3.0
parameterized/0.9.0-GCCcore-13.3.0
pytest-flakefinder/1.1.0-GCCcore-13.3.0
pytest-rerunfailures/15.0-GCCcore-13.3.0
pytest-shard/0.1.2-GCCcore-13.3.0
pytest-subtests/0.13.1-GCCcore-13.3.0
PyYAML/6.0.2-GCCcore-13.3.0
SCOTCH/7.0.6-gompi-2024a
sympy/1.13.3-gfbf-2024a
tlparse/0.3.37-GCCcore-13.3.0
unittest-xml-reporting/3.1.0-GCCcore-13.3.0
Z3/4.13.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/aarch64/nvidia/grace/reprod
CoinUtils/2.11.12-GCC-13.3.0/20251110_094110UTC
expecttest/0.2.1-GCCcore-13.3.0/20251110_090914UTC
gmpy2/2.2.0-GCCcore-13.3.0/20251110_091412UTC
libyaml/0.2.5-GCCcore-13.3.0/20251110_090814UTC
lxml/5.3.0-GCCcore-13.3.0/20251110_090649UTC
METIS/5.1.0-GCCcore-13.3.0/20251110_094007UTC
MPC/1.3.1-GCCcore-13.3.0/20251110_091357UTC
MUMPS/5.7.2-foss-2024a-metis/20251110_094807UTC
networkx/3.4.2-gfbf-2024a/20251110_090956UTC
optree/0.14.1-GCCcore-13.3.0/20251110_085542UTC
Osi/0.108.11-GCC-13.3.0/20251110_094837UTC
parameterized/0.9.0-GCCcore-13.3.0/20251110_085325UTC
pytest-flakefinder/1.1.0-GCCcore-13.3.0/20251110_090754UTC
pytest-rerunfailures/15.0-GCCcore-13.3.0/20251110_090731UTC
pytest-shard/0.1.2-GCCcore-13.3.0/20251110_090738UTC
pytest-subtests/0.13.1-GCCcore-13.3.0/20251110_090746UTC
PyYAML/6.0.2-GCCcore-13.3.0/20251110_090900UTC
SCOTCH/7.0.6-gompi-2024a/20251110_094309UTC
sympy/1.13.3-gfbf-2024a/20251110_093948UTC
tlparse/0.3.37-GCCcore-13.3.0/20251110_090413UTC
unittest-xml-reporting/3.1.0-GCCcore-13.3.0/20251110_090655UTC
Z3/4.13.0-GCCcore-13.3.0/20251110_091234UTC
other under 2025.06/software/linux/aarch64/nvidia/grace
no other files in tarball
Nov 10 09:57:55 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 2.52 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 6.2 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 0.25 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:aarch64_nvidia_grace+default
P: bandwidth: 23607.22 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-14199038.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

Collaborator Author

bedroge commented Nov 10, 2025

Error when building Clp:

ClpDualRowDantzig.cpp:6:10: fatal error: CoinPragma.hpp: No such file or directory

Member

ocaisa commented Nov 10, 2025

> Error when building Clp:
>
> ClpDualRowDantzig.cpp:6:10: fatal error: CoinPragma.hpp: No such file or directory

I suspect we are missing including the header directory at https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/c/Clp/Clp-1.17.9-foss-2023b.eb#L46

It pops up now because it no longer appears in CPATH (due to our new EB settings)

Collaborator Author

bedroge commented Nov 10, 2025

> > Error when building Clp:
> >
> > ClpDualRowDantzig.cpp:6:10: fatal error: CoinPragma.hpp: No such file or directory
>
> I suspect we are missing including the header directory at https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/c/Clp/Clp-1.17.9-foss-2023b.eb#L46
>
> It pops up now because it no longer appears in CPATH (due to our new EB settings)

I was checking it in an EB shell in the container, and it did add /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/CoinUtils/2.11.12-GCC-13.3.0/include to C_INCLUDE_PATH and CPLUS_INCLUDE_PATH. However, that directory has a subdir coin that contains the header files, and Clp does #include "CoinPragma.hpp". So I think we may have to add the subdir instead (or patch Clp)?

edit: hmm, actually, that should already be done by:
https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/c/CoinUtils/CoinUtils-2.11.12-GCC-13.3.0.eb#L32

edit2: the module that was built for CoinUtils does look fine:

prepend_path("CPLUS_INCLUDE_PATH", pathJoin(root, "include"))
prepend_path("CPLUS_INCLUDE_PATH", pathJoin(root, "include", "coin"))

But for some reason CPLUS_INCLUDE_PATH during the build only has the first path.
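
To illustrate why the `include/coin` entry matters for `#include "CoinPragma.hpp"`, here is a simplified simulation of a header lookup over a colon-separated search path. It is only a sketch: a real compiler also searches the including file's directory and its built-in directories, and `find_header` is a hypothetical helper. The directory layout is recreated in a temporary directory rather than read from the actual installation.

```python
import os
import tempfile


def find_header(name, search_path):
    """Return the first directory in search_path that contains name, else None."""
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


with tempfile.TemporaryDirectory() as root:
    # Mimic the CoinUtils layout described above: headers live in include/coin.
    include = os.path.join(root, "include")
    coin = os.path.join(include, "coin")
    os.makedirs(coin)
    open(os.path.join(coin, "CoinPragma.hpp"), "w").close()

    # Search path with only the first entry, as observed during the build.
    missing = find_header("CoinPragma.hpp", include)

    # Search path with both entries, as set by the CoinUtils module.
    found = find_header("CoinPragma.hpp", os.pathsep.join([coin, include]))
    found_is_coin = found == coin

print(missing, found_is_coin)
```

With only `include` on the path, the lookup comes back empty, which is consistent with the `No such file or directory` error from the Clp build.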

The last entry in the log about CPLUS_INCLUDE_PATH does still show the correct value (this is right before the configure step):

== 2025-11-10 13:33:50,527 environment.py:95 INFO Environment variable CPLUS_INCLUDE_PATH set to /cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/Osi/0.108.11-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/CoinUtils/2.11.12-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/MUMPS/5.7.2-foss-2024a-metis/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/METIS/5.1.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/pkgconf/2.2.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FFTW.MPI/3.3.10-gompi-2024a/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FlexiBLAS/3.4.4-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FlexiBLAS/3.4.4-GCC-13.3.0/include/flexiblas (previous value: '/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/Osi/0.108.11-GCC-13.3.0/include/coin:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/Osi/0.108.11-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/CoinUtils/2.11.12-GCC-13.3.0/include/coin:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/CoinUtils/2.11.12-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/MUMPS/5.7.2-foss-2024a-metis/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/SCOTCH/7.0.6-gompi-2024a/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/METIS/5.1.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/pkgconf/2.2.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libiconv/1.17-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libtool/2.4.7-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FFTW.MPI/3.3.10-gompi-2024a/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FFTW/3.3.10-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FlexiBLAS/3.4.4-GCC-13.3.0/include/flexiblas:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/FlexiBLAS/3.4.4-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/OpenBLAS/0.3.27-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/OpenMPI/5.0.3-GCC-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/UCC/1.3.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/PRRTE/3.0.5-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/PMIx/5.0.2-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libfabric/1.21.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/UCX/1.16.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libevent/2.1.12-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/OpenSSL/3/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/hwloc/2.10.0-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libpciaccess/0.18.1-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/libxml2/2.12.7-GCCcore-13.3.0/include:/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software/numactl/2.0.18-GCCcore-13.3.0/include')
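
Since the two values in that log line are hard to compare by eye, a small helper can list the entries that are present in the previous value but absent from the new one. This is a minimal sketch; `dropped_entries` is a hypothetical helper, and the sample strings contain only the first four entries of the previous value and the first two of the new value from the log line above.

```python
def split_path(value):
    """Split a colon-separated search path into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def dropped_entries(previous, current):
    """Return entries of previous (in order) that no longer appear in current."""
    kept = set(split_path(current))
    return [entry for entry in split_path(previous) if entry not in kept]


ROOT = "/cvmfs/software.eessi.io/versions/2025.06/software/linux/x86_64/amd/zen2/software"

# Abbreviated sample: only the leading entries of each value from the log line.
previous = ":".join([
    f"{ROOT}/Osi/0.108.11-GCC-13.3.0/include/coin",
    f"{ROOT}/Osi/0.108.11-GCC-13.3.0/include",
    f"{ROOT}/CoinUtils/2.11.12-GCC-13.3.0/include/coin",
    f"{ROOT}/CoinUtils/2.11.12-GCC-13.3.0/include",
])
current = ":".join([
    f"{ROOT}/Osi/0.108.11-GCC-13.3.0/include",
    f"{ROOT}/CoinUtils/2.11.12-GCC-13.3.0/include",
])

dropped = dropped_entries(previous, current)
for entry in dropped:
    print(entry)
```

For this abbreviated sample, the entries that disappear are the two `include/coin` subdirectories.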

Collaborator Author

bedroge commented Nov 13, 2025

The issue was reported and debugged a bit more in easybuilders/easybuild-framework#5039, and fixed by easybuilders/easybuild-framework#5041.

We now need to wait for EB 5.2.0, or backport the fix to the EB 5.1.2 easyconfig and do a rebuild of that version in EESSI.

Member

ocaisa commented Nov 13, 2025

I would be open to a rebuild similar to easybuilders/easybuild-easyconfigs#24376 (indeed they are quite related) as we don't know how long we will wait for a new release.

Collaborator Author

bedroge commented Nov 19, 2025

EB 5.1.2 has been rebuilt with patches that should solve the issue (see #1304), so let's try again.

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace

eessi-bot-jsc bot commented Nov 19, 2025

New job on instance eessi-bot-jsc for repository eessi.io-2025.06-software
Building on: nvidia-grace
Building for: aarch64/nvidia/grace
Job dir: /p/project1/ceasybuilders/eessibot/jobs/2025.11/pr_1294/14237101

| date | job status | comment |
| --- | --- | --- |
| Nov 19 19:17:17 UTC 2025 | submitted | job id 14237101 awaits release by job manager |
| Nov 19 19:18:02 UTC 2025 | released | job awaits launch by Slurm scheduler |
| Nov 19 19:19:05 UTC 2025 | running | job 14237101 is running |
| Nov 19 20:14:49 UTC 2025 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-14237101.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-aarch64-nvidia-grace-17635829680.tar.gz
size: 77 MiB (81249215 bytes)
entries: 6180
modules under 2025.06/software/linux/aarch64/nvidia/grace/modules/all
Clp/1.17.10-foss-2024a.lua
CoinUtils/2.11.12-GCC-13.3.0.lua
expecttest/0.2.1-GCCcore-13.3.0.lua
gmpy2/2.2.0-GCCcore-13.3.0.lua
libyaml/0.2.5-GCCcore-13.3.0.lua
lxml/5.3.0-GCCcore-13.3.0.lua
METIS/5.1.0-GCCcore-13.3.0.lua
MPC/1.3.1-GCCcore-13.3.0.lua
MUMPS/5.7.2-foss-2024a-metis.lua
networkx/3.4.2-gfbf-2024a.lua
optree/0.14.1-GCCcore-13.3.0.lua
Osi/0.108.11-GCC-13.3.0.lua
parameterized/0.9.0-GCCcore-13.3.0.lua
pytest-flakefinder/1.1.0-GCCcore-13.3.0.lua
pytest-rerunfailures/15.0-GCCcore-13.3.0.lua
pytest-shard/0.1.2-GCCcore-13.3.0.lua
pytest-subtests/0.13.1-GCCcore-13.3.0.lua
PyYAML/6.0.2-GCCcore-13.3.0.lua
SCOTCH/7.0.6-gompi-2024a.lua
sympy/1.13.3-gfbf-2024a.lua
tlparse/0.3.37-GCCcore-13.3.0.lua
unittest-xml-reporting/3.1.0-GCCcore-13.3.0.lua
Z3/4.13.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/aarch64/nvidia/grace/software
Clp/1.17.10-foss-2024a
CoinUtils/2.11.12-GCC-13.3.0
expecttest/0.2.1-GCCcore-13.3.0
gmpy2/2.2.0-GCCcore-13.3.0
libyaml/0.2.5-GCCcore-13.3.0
lxml/5.3.0-GCCcore-13.3.0
METIS/5.1.0-GCCcore-13.3.0
MPC/1.3.1-GCCcore-13.3.0
MUMPS/5.7.2-foss-2024a-metis
networkx/3.4.2-gfbf-2024a
optree/0.14.1-GCCcore-13.3.0
Osi/0.108.11-GCC-13.3.0
parameterized/0.9.0-GCCcore-13.3.0
pytest-flakefinder/1.1.0-GCCcore-13.3.0
pytest-rerunfailures/15.0-GCCcore-13.3.0
pytest-shard/0.1.2-GCCcore-13.3.0
pytest-subtests/0.13.1-GCCcore-13.3.0
PyYAML/6.0.2-GCCcore-13.3.0
SCOTCH/7.0.6-gompi-2024a
sympy/1.13.3-gfbf-2024a
tlparse/0.3.37-GCCcore-13.3.0
unittest-xml-reporting/3.1.0-GCCcore-13.3.0
Z3/4.13.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/aarch64/nvidia/grace/reprod
Clp/1.17.10-foss-2024a/20251119_200511UTC
CoinUtils/2.11.12-GCC-13.3.0/20251119_195656UTC
expecttest/0.2.1-GCCcore-13.3.0/20251119_193022UTC
gmpy2/2.2.0-GCCcore-13.3.0/20251119_193455UTC
libyaml/0.2.5-GCCcore-13.3.0/20251119_192948UTC
lxml/5.3.0-GCCcore-13.3.0/20251119_192836UTC
METIS/5.1.0-GCCcore-13.3.0/20251119_195616UTC
MPC/1.3.1-GCCcore-13.3.0/20251119_193440UTC
MUMPS/5.7.2-foss-2024a-metis/20251119_200339UTC
networkx/3.4.2-gfbf-2024a/20251119_193041UTC
optree/0.14.1-GCCcore-13.3.0/20251119_192338UTC
Osi/0.108.11-GCC-13.3.0/20251119_200409UTC
parameterized/0.9.0-GCCcore-13.3.0/20251119_192059UTC
pytest-flakefinder/1.1.0-GCCcore-13.3.0/20251119_192933UTC
pytest-rerunfailures/15.0-GCCcore-13.3.0/20251119_192911UTC
pytest-shard/0.1.2-GCCcore-13.3.0/20251119_192918UTC
pytest-subtests/0.13.1-GCCcore-13.3.0/20251119_192926UTC
PyYAML/6.0.2-GCCcore-13.3.0/20251119_193012UTC
SCOTCH/7.0.6-gompi-2024a/20251119_195841UTC
sympy/1.13.3-gfbf-2024a/20251119_195558UTC
tlparse/0.3.37-GCCcore-13.3.0/20251119_192602UTC
unittest-xml-reporting/3.1.0-GCCcore-13.3.0/20251119_192842UTC
Z3/4.13.0-GCCcore-13.3.0/20251119_193320UTC
other under 2025.06/software/linux/aarch64/nvidia/grace
no other files in tarball
Nov 19 20:14:49 UTC 2025 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 2.51 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 6.29 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:aarch64_nvidia_grace+default
P: latency: 0.25 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:aarch64_nvidia_grace+default
P: bandwidth: 23348.63 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-14237101.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
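
The ReFrame performance lines from the two aarch64/nvidia/grace runs (Nov 10 and Nov 19) can be compared with a short script. This is only a sketch: `parse_perf` and `relative_change` are hypothetical helpers, and the regular expression assumes the `P: <name>: <value> <unit>` layout seen in the summaries above.

```python
import re

# Matches lines such as "P: latency: 2.52 us (r:0, l:None, u:None)".
PERF_RE = re.compile(
    r"^P: (?P<name>\w+): (?P<value>\d+(?:\.\d+)?) (?P<unit>\S+)", re.MULTILINE
)


def parse_perf(text):
    """Return (name, value, unit) tuples for every performance line in text."""
    return [(m["name"], float(m["value"]), m["unit"]) for m in PERF_RE.finditer(text)]


def relative_change(old, new):
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


# Performance lines copied from the two ReFrame summaries in this thread.
RUN_NOV10 = """\
P: latency: 2.52 us (r:0, l:None, u:None)
P: latency: 6.2 us (r:0, l:None, u:None)
P: latency: 0.25 us (r:0, l:None, u:None)
P: bandwidth: 23607.22 MB/s (r:0, l:None, u:None)
"""
RUN_NOV19 = """\
P: latency: 2.51 us (r:0, l:None, u:None)
P: latency: 6.29 us (r:0, l:None, u:None)
P: latency: 0.25 us (r:0, l:None, u:None)
P: bandwidth: 23348.63 MB/s (r:0, l:None, u:None)
"""

for (name, old, unit), (_, new, _) in zip(parse_perf(RUN_NOV10), parse_perf(RUN_NOV19)):
    print(f"{name}: {old} -> {new} {unit} ({relative_change(old, new):+.1%})")
```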

Collaborator Author

bedroge commented Nov 19, 2025

Interesting, hitting a very similar issue to the one that @ocaisa had in #1299 (comment):

../../libtool: line 3232: cd: =/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib: No such file or directory
libtool: link: warning: cannot determine absolute directory name of `=/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib'
grep: =/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib/libuct.la: No such file or directory
/cvmfs/software.eessi.io/versions/2025.06/compat/linux/aarch64/bin/sed: can't read =/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib/libuct.la: No such file or directory
libtool: link: `=/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib/libuct.la' is not a valid libtool archive

Somehow that directory gets prefixed with a =.
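
To find every place where a path shows up with such a prefix, the build log can be scanned for a `=` that sits directly in front of an absolute path at the start of a token. The sketch below does that on two lines taken from the output above; `eq_prefixed_paths` is a hypothetical helper. As a side note, and as far as I know, linker search paths and libtool use a leading `=` to mark a sysroot-relative path, which may or may not be related to what happens here.

```python
import re

# '=' directly in front of an absolute path, at the start of a token:
# at line start, after whitespace or a quote character, or after a -L flag.
EQ_PATH_RE = re.compile(r"(?:^|(?<=[\s`'])|(?<=-L))=(/[^\s`':]+)", re.MULTILINE)


def eq_prefixed_paths(text):
    """Return the sorted, de-duplicated paths that appear with a leading '='."""
    return sorted(set(EQ_PATH_RE.findall(text)))


UCX_LIB = (
    "/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/"
    "nvidia/grace/software/UCX/1.16.0-GCCcore-13.3.0/lib"
)

# Two lines from the output above, plus part of the libtool command line,
# which must not match ('--tag=CXX' is not a path).
LOG = f"""\
../../libtool: line 3232: cd: ={UCX_LIB}: No such file or directory
grep: ={UCX_LIB}/libuct.la: No such file or directory
/bin/bash ../../libtool --tag=CXX --mode=link mpicxx -O2
"""

paths = eq_prefixed_paths(LOG)
for path in paths:
    print(path)
```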

This is the full command that seems to fail:

/bin/bash ../../libtool --tag=CXX --mode=link mpicxx  -O2 -ftree-vectorize -mcpu=native -fno-math-errno -fPIC   -DCGL_BUILD  -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/ScaLAPACK/2.2.0-gompi-2024a-fb/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/ScaLAPACK/2.2.0-gompi-2024a-fb/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FFTW.MPI/3.3.10-gompi-2024a/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FFTW.MPI/3.3.10-gompi-2024a/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FFTW/3.3.10-GCC-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FFTW/3.3.10-GCC-13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FlexiBLAS/3.4.4-GCC-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/FlexiBLAS/3.4.4-GCC-13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/OpenMPI/5.0.3-GCC-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/OpenMPI/5.0.3-GCC-13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/GCCcore/13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/GCCcore/13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/Clp/1.17.10-foss-2024a/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/Clp/1.17.10-foss-2024a/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/Osi/0.108.11-GCC-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/Osi/0.108.11-GCC-13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/CoinUtils/2.11.12-GCC-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/CoinUtils/2.11.12-GCC-13.3.0/lib -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/pkgconf/2.2.0-GCCcore-13.3.0/lib64 -L/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/pkgconf/2.2.0-GCCcore-13.3.0/lib -o libCgl.la -rpath /cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/nvidia/grace/software/Cgl/0.60.8-foss-2024a/lib -no-undefined -version-info 11:8:10 CglCutGenerator.lo CglMessage.lo CglStored.lo CglParam.lo CglTreeInfo.lo CglAllDifferent/libCglAllDifferent.la CglClique/libCglClique.la CglDuplicateRow/libCglDuplicateRow.la CglFlowCover/libCglFlowCover.la CglGMI/libCglGMI.la CglGomory/libCglGomory.la CglKnapsackCover/libCglKnapsackCover.la CglLandP/libCglLandP.la CglLiftAndProject/libCglLiftAndProject.la CglMixedIntegerRounding/libCglMixedIntegerRounding.la CglMixedIntegerRounding2/libCglMixedIntegerRounding2.la CglOddHole/libCglOddHole.la CglPreProcess/libCglPreProcess.la CglProbing/libCglProbing.la CglRedSplit/libCglRedSplit.la CglRedSplit2/libCglRedSplit2.la CglResidualCapacity/libCglResidualCapacity.la CglSimpleRounding/libCglSimpleRounding.la CglTwomir/libCglTwomir.la CglZeroHalf/libCglZeroHalf.la -lOsiClp -lOsi -lClpSolver -lClp -lesmumps -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lmpi_mpifh -lmetis -lscotch -lptscotch -lptscotcherr -lscotcherrexit -lscotcherr -lscalapack -lflexiblas -lgfortran -lCoinUtils -lOsiClp -lClpSolver -lClp -lOsiClp -lClpSolver -lClp -lOsi -lCoinUtils  -lm -lpthread

I don't immediately see any obvious errors in that command.

It's somewhat suspicious that it happened when linking UCX in both cases?
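
One thing that can be checked mechanically is whether the command itself mentions UCX or any `=`-prefixed argument. The command quoted above contains neither, so the UCX path presumably enters through something the link step pulls in indirectly (for example a dependency's `.la` file or the `mpicxx` wrapper); that last part is a guess, not something the log confirms. The sketch below tokenizes a shortened version of the command; `summarize_link_command` is a hypothetical helper.

```python
import shlex


def summarize_link_command(command):
    """Split a link command into library dirs, libraries and suspicious tokens."""
    tokens = shlex.split(command)
    return {
        "lib_dirs": [t[2:] for t in tokens if t.startswith("-L")],
        "libs": [t[2:] for t in tokens if t.startswith("-l")],
        "la_files": [t for t in tokens if t.endswith(".la")],
        # Tokens that carry a '='-prefixed path, either bare or behind -L.
        "eq_prefixed": [t for t in tokens if t.startswith(("=", "-L="))],
        "mentions_ucx": any("UCX" in t for t in tokens),
    }


PREFIX = (
    "/cvmfs/software.eessi.io/versions/2025.06/software/linux/aarch64/"
    "nvidia/grace/software"
)

# Shortened version of the failing command (most -L flags and inputs omitted).
COMMAND = (
    "/bin/bash ../../libtool --tag=CXX --mode=link mpicxx -O2 -fPIC -DCGL_BUILD "
    f"-L{PREFIX}/Clp/1.17.10-foss-2024a/lib "
    "-o libCgl.la CglCutGenerator.lo CglClique/libCglClique.la "
    "-lOsiClp -lOsi -lClpSolver -lClp -lCoinUtils -lm -lpthread"
)

summary = summarize_link_command(COMMAND)
print(summary["libs"])
print(summary["eq_prefixed"], summary["mentions_ucx"])
```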
