Closed
Description
@xinzhao3 @jladd-mlnx
Many IBM tests on v3.1.x and on master have been failing at runtime for a number of weeks with an abort (assertion failure) in the OSC UCX component.
I believe this should be easy to reproduce, though I'm not sure where the value passed as the 'flavor' argument is coming from.
I think we should either block v3.1.x or disable the UCX OSC component for v3.1.x until we figure this out, given how easy it is to hit this issue.
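To make the failure mode concrete, here is a minimal standalone sketch of the check that fires in the trace below. It is not the actual `osc_ucx_component.c` code: the names `sketch_flavor`, `flavor_is_supported`, `mem_map_sketch`, and `mem_map_checked` are hypothetical, the values 1 and 2 are copied from the assertion text, and the value 3 is an assumption standing in for whatever unexpected flavor reaches `mem_map`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the window flavors. The values 1 and 2 are
 * copied from the assertion text; FLAVOR_OTHER = 3 is an assumption used
 * only to model a flavor the component does not expect. */
enum sketch_flavor {
    FLAVOR_A = 1,
    FLAVOR_B = 2,
    FLAVOR_OTHER = 3
};

/* Returns 1 if the flavor would satisfy the assertion, 0 otherwise. */
int flavor_is_supported(int flavor)
{
    return flavor == FLAVOR_B || flavor == FLAVOR_A;
}

/* Mirrors the failing check: when NDEBUG is not defined, this aborts
 * for any flavor other than 1 or 2. */
void mem_map_sketch(int flavor)
{
    assert(flavor == 2 || flavor == 1);
    printf("mapping memory for flavor %d\n", flavor);
}

/* One possible guard: test the flavor first and report an error instead
 * of reaching the assertion. Returns 0 on success, -1 if unsupported. */
int mem_map_checked(int flavor)
{
    if (!flavor_is_supported(flavor)) {
        fprintf(stderr, "unsupported window flavor %d\n", flavor);
        return -1;
    }
    mem_map_sketch(flavor);
    return 0;
}
```

In this sketch, `mem_map_checked(FLAVOR_OTHER)` returns -1 rather than aborting, whereas calling `mem_map_sketch(FLAVOR_OTHER)` directly would trip the assertion in the same way the trace shows. Whether the real component should reject the window or support the extra flavor is a separate question.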
aint: osc_ucx_component.c:246: int mem_map(void **, size_t, ucp_mem_h *, ompi_osc_ucx_module_t *, int): Assertion `flavor == 2 || flavor == 1' failed.
[c656f6n05:122836] *** Process received signal ***
[c656f6n05:122836] Signal: Aborted (6)
[c656f6n05:122836] Signal code: (-6)
[c656f6n05:122836] [ 0] [0x3fff9fcd0478]
[c656f6n05:122836] [ 1] aint: osc_ucx_component.c:246: int mem_map(void **, size_t, ucp_mem_h *, ompi_osc_ucx_module_t *, int): Assertion `flavor == 2 || flavor == 1' failed.
[c656f6n05:122835] *** Process received signal ***
[c656f6n05:122835] Signal: Aborted (6)
[c656f6n05:122835] Signal code: (-6)
[c656f6n05:122835] [ 0] [0x3fffa46f0478]
[c656f6n05:122835] [ 1] /lib64/libc.so.6(abort+0x280)[0x3fff9f530d70]
[c656f6n05:122836] [ 2] /lib64/libc.so.6(abort+0x280)[0x3fffa3f50d70]
[c656f6n05:122835] [ 2] /lib64/libc.so.6(+0x348a4)[0x3fff9f5248a4]
[c656f6n05:122836] [ 3] /lib64/libc.so.6(+0x348a4)[0x3fffa3f448a4]
[c656f6n05:122835] [ 3] /lib64/libc.so.6(__assert_fail+0x64)[0x3fff9f524994]
[c656f6n05:122836] [ 4] /lib64/libc.so.6(__assert_fail+0x64)[0x3fffa3f44994