Description
@PHHargrove (http://www.open-mpi.org/community/lists/devel/2016/05/18928.php and http://www.open-mpi.org/community/lists/devel/2016/05/18930.php) and @bosilca (http://www.open-mpi.org/community/lists/devel/2016/05/18929.php) have reported what appear to be new segfaults on master, possibly caused by the memory/patcher component.
Here's one stack trace posted by @PHHargrove on LITTLE-ENDIAN Power8:
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00003fff897adb38 in intercept_munmap (start=0x3fff89670000, length=65536)
at /home/phargrov/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64el-xlc/openmpi-gitclone/opal/mca/memory/patcher/memory_patcher_component.c:155
#2 0x00003fff8933bc80 in __GI__IO_setb () from /lib64/libc.so.6
#3 0x00003fff89339528 in __GI__IO_file_close_it () from /lib64/libc.so.6
#4 0x00003fff89327f74 in fclose@@GLIBC_2.17 () from /lib64/libc.so.6
#5 0x0000000010000f7c in do_test ()
at /home/phargrov/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64el-xlc/openmpi-gitclone/ompi/debuggers/dlopen_test.c:97
#6 0x00000000100010e0 in main (argc=1, argv=0x3fffff332888)
at /home/phargrov/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64el-xlc/openmpi-gitclone/ompi/debuggers/dlopen_test.c:135
"start" is valid:
(gdb) print *(char*)0x3fff89670000
$1 = 35 '#'
Frame 1:
155 opal_mem_hooks_release_hook (start, length, true);
Here's another:
Big-endian PPC64 with xlc V13.1 hits a nearly identical failure.
This time, however, gdb was able to resolve frame #0 to a PLT slot (instead of "??").
#0 0x00000fff8904ef88 in 00000010.plt_call.opal_mem_hooks_release_hook+0 ()
from /gpfs-biou/phh1/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64-xlc-13.1/INST/lib/libopen-pal.so.20
#1 0x00000fff8910b630 in intercept_munmap (start=0xfff88d20000, length=2097152)
at /gpfs-biou/phh1/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64-xlc-13.1/openmpi-gitclone/opal/mca/memory/patcher/memory_patcher_component.c:155
#2 0x000000800cc5ca80 in ._IO_setb () from /lib64/libc.so.6
#3 0x000000800cc5b16c in ._IO_file_close_it () from /lib64/libc.so.6
#4 0x000000800cc4a758 in .fclose () from /lib64/libc.so.6
#5 0x0000000010000f88 in do_test ()
at /gpfs-biou/phh1/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64-xlc-13.1/openmpi-gitclone/ompi/debuggers/dlopen_test.c:97
#6 0x00000000100010d8 in main (argc=1, argv=0xffff462f398)
at /gpfs-biou/phh1/OMPI/openmpi-v2.x-dev-1410-g81e0924-linux-ppc64-xlc-13.1/openmpi-gitclone/ompi/debuggers/dlopen_test.c:135
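For anyone trying to narrow this down: frames #2 through #5 of both traces show munmap being reached from fclose() via _IO_setb, so an fclose() on a buffered stream may be enough to reach intercept_munmap once the patcher hooks are installed. Below is a minimal sketch of that call path only. The helper name close_buffered_stream is hypothetical and is not taken from dlopen_test.c, and the snippet does not load Open MPI on its own, so it would only be expected to exercise the failing path in a process where the patcher component is active.

```c
#include <stdio.h>

/* Hypothetical helper (not from dlopen_test.c): open a buffered stream,
 * write to it so stdio allocates its buffer, then close it.
 * In the traces above, the fclose() step is what leads to munmap(). */
int close_buffered_stream(void)
{
    FILE *fp = tmpfile();          /* anonymous temporary file */
    if (fp == NULL) {
        return -1;
    }

    /* Any buffered I/O will do; this just forces buffer allocation. */
    if (fputs("#\n", fp) == EOF) {
        fclose(fp);
        return -1;
    }

    /* fclose() returns 0 on success, EOF on failure. */
    return fclose(fp);
}
```

close_buffered_stream() returns 0 when the stream closes cleanly. Whether glibc releases the stdio buffer with munmap depends on the libc build; it evidently does on the two systems above.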
@hjelmn Please investigate.