Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
4.0.5
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
spack install openmpi@4.0.5 +cuda +cxx +legacylaunchers +lustre fabrics=cma,knem,ucx schedulers=slurm
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status
N/A (installed with Spack, not from a git clone).
Please describe the system on which you are running
- Operating system/version: RHEL 7.6 ppc64le
- Computer hardware: IBM AC922 (like Summit/Sierra)
- Network type: EDR IB
Details of the problem
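IMB-RMA crashes with a segmentation fault at the start of the run, in the Truly_passive_put benchmark. The backtrace below shows the fault in ompi_osc_rdma_lock_atomic (mca_osc_rdma.so), called from MPI_Win_lock. A similar build on x86_64 runs correctly, as does Spectrum MPI 10.3 on this same system.
Is this configuration known to work on Summit?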
# Truly_passive_put
# The benchmark measures execution time of MPI_Put for 2 cases:
# 1) The target is waiting in MPI_Barrier call (t_pure value)
# 2) The target performs computation and then enters MPI_Barrier routine (t_ovrl value)
[gpu027:91080:0:91080] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x8)
==== backtrace (tid: 91080) ====
=================================
[gpu027:91080] *** Process received signal ***
[gpu027:91080] Signal: Segmentation fault (11)
[gpu027:91080] Signal code: (-6)
[gpu027:91080] Failing at address: 0x262292ca000163c8
[gpu027:91080] [ 0] [0x2000000504d8]
[gpu027:91080] [ 1] /users/***/spack/opt/spack/linux-rhel7-power9le/gcc-8.4.0/openmpi-4.0.5-6sqv24vyrwc5nerb7y5fslqnf5jrnjv6/lib/openmpi/mca_osc_rdma.so(ompi_osc_rdma_lock_atomic+0x94)[0x200014d19fb4]
[gpu027:91080] [ 2] /users/***/spack/opt/spack/linux-rhel7-power9le/gcc-8.4.0/openmpi-4.0.5-6sqv24vyrwc5nerb7y5fslqnf5jrnjv6/lib/libmpi.so.40(MPI_Win_lock+0x138)[0x2000001835a8]
[gpu027:91080] [ 3] IMB-RMA(IMB_rma_single_put+0x17c)[0x100d3d18]
[gpu027:91080] [ 4] IMB-RMA(_ZN11Bmark_descr21IMB_init_buffers_iterEP9comm_infoP13iter_scheduleP5BenchP5cmodeii+0xce0)[0x100a9ac8]
[gpu027:91080] [ 5] IMB-RMA(_ZN17OriginalBenchmarkI14BenchmarkSuiteIL17benchmark_suite_t3EEXadL_Z18IMB_rma_single_putEEE3runERK10scope_item+0x398)[0x100ab1a0]
[gpu027:91080] [ 6] IMB-RMA(main+0x19b0)[0x10060aa4]
[gpu027:91080] [ 7] /lib64/libc.so.6(+0x25200)[0x200000645200]
[gpu027:91080] [ 8] /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000006453f4]
[gpu027:91080] *** End of error message ***
--------------------------------------------------------------------------