I'm creating a stable ginkgo-hpc package for Arch Linux and I'm getting some issues. Besides #1564, #1566 and #1143, there are some tests that fail with the following error:
281/285 Test #283: benchmark_multi_vector_distributed .......................***Failed 1.27 sec
TEST: '/usr/bin/mpiexec' '-n' '3' '/build/ginkgo-hpc/src/build/benchmark/blas/distributed/multi_vector_distributed' '-input' '[{"n": 100}]'
FAIL: stderr differs
---
+++
@@ -1,3 +1,6 @@
+[arch-nspawn-268570:99043] No HIP capabale device found. Disabling component.
+[arch-nspawn-268570:99045] No HIP capabale device found. Disabling component.
+[arch-nspawn-268570:99044] No HIP capabale device found. Disabling component.
This is Ginkgo 1.7.0 (master)
running with core module 1.7.0 (master)
Running on reference(0)
282/285 Test #284: benchmark_spmv_distributed ...............................***Failed 1.27 sec
TEST: '/usr/bin/mpiexec' '-n' '3' '/build/ginkgo-hpc/src/build/benchmark/spmv/distributed/spmv_distributed' '-input' '[{"size": 100, "stencil": "7pt", "comm_pattern": "stencil"}]'
FAIL: stderr differs
---
+++
@@ -1,3 +1,6 @@
+[arch-nspawn-268570:99066] No HIP capabale device found. Disabling component.
+[arch-nspawn-268570:99065] No HIP capabale device found. Disabling component.
+[arch-nspawn-268570:99064] No HIP capabale device found. Disabling component.
This is Ginkgo 1.7.0 (master)
running with core module 1.7.0 (master)
Running on reference(0)
283/285 Test #285: benchmark_solver_distributed .............................***Failed 1.21 sec
TEST: '/build/ginkgo-hpc/src/build/benchmark/solver/distributed/solver_distributed' '-input' '[{"size": 100, "stencil": "7pt", "comm_pattern": "stencil", "optimal": {"spmv": "csr-csr"}}]'
FAIL: stderr differs
---
+++
@@ -1,3 +1,4 @@
+[arch-nspawn-268570:99060] No HIP capabale device found. Disabling component.
This is Ginkgo 1.7.0 (master)
running with core module 1.7.0 (master)
Running on reference(0)
The build system has no GPU, but ROCm/HIP is installed for building the -hip variant of the package. These tests, however, are built with -DGINKGO_BUILD_HIP=OFF (I know it is pointless to run HIP tests without a GPU).
Arch Linux ships a ROCm-aware OpenMPI 5.0, which is responsible for printing the "No HIP capabale device found. Disabling component." message from each rank. Hence, if you compare the output of a serial test with that of a test run through mpirun, there will necessarily be a difference. The tests should be designed better: assuming that the MPI library itself does not print anything is rather naive.
In the short term, I would suggest disabling the corresponding tests using ctest -E benchmark_.*_distributed. Changing this behavior would require some refactoring of the benchmark code that we can't prioritize immediately. The benchmarks are not designed for easy testability; the tests were added after the fact to enable some refactoring, so they are mainly intended for us developers.
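For anyone who wants to see what a more tolerant comparison could look like, here is a minimal sketch. It is not taken from Ginkgo's test harness: the names `MPI_NOISE`, `strip_mpi_noise` and `stderr_matches` are hypothetical, and the pattern assumes the runtime message always has the `[host:pid] ...` shape shown in the log above (including the message's original spelling).

```python
import re

# Assumption: the MPI runtime prefixes its message with "[hostname:pid] ",
# exactly as in the failing test output quoted in this issue.
MPI_NOISE = re.compile(
    r"^\[[^\]\s]+:\d+\] No HIP capabale device found\. Disabling component\.$"
)


def strip_mpi_noise(stderr_text: str) -> str:
    """Drop lines emitted by the MPI runtime, keep everything else in order."""
    kept = [line for line in stderr_text.splitlines() if not MPI_NOISE.match(line)]
    return "\n".join(kept)


def stderr_matches(expected: str, actual: str) -> bool:
    """Compare two stderr captures after removing the known runtime noise."""
    return strip_mpi_noise(expected) == strip_mpi_noise(actual)


expected = (
    "This is Ginkgo 1.7.0 (master)\n"
    "    running with core module 1.7.0 (master)\n"
    "Running on reference(0)"
)
actual = (
    "[arch-nspawn-268570:99043] No HIP capabale device found. Disabling component.\n"
    + expected
)

print(stderr_matches(expected, actual))
```

Matching only the one known message keeps the filter narrow, so any other unexpected stderr output would still be reported as a difference.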