Closed
Description
I checked out the head of the v5.0.x branch with high expectations that it would work well on our NVIDIA + HPE SS11 (aka libfabric) system. Alas, if my application doesn't use CUDA but is linked against an ompi v5.0.x build that has all the recent accelerator/cuda changes in place and is configured for CUDA support, things don't work right:
Hello, world, I am 1 of 2, (Open MPI v5.0.0rc9, package: Open MPI hpp@ch-fe1 Distribution, ident: 5.0.0rc9, repo rev: v5.0.0rc9-287-g5d87f3e6, Unreleased developer copy, 141)
Hello, world, I am 0 of 2, (Open MPI v5.0.0rc9, package: Open MPI hpp@ch-fe1 Distribution, ident: 5.0.0rc9, repo rev: v5.0.0rc9-287-g5d87f3e6, Unreleased developer copy, 141)
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The call to cuEventDestory failed. This is a unrecoverable error and will
cause the program to abort.
cuEventDestory return value: 709
Check the cuda.h file for what the return value means.
It looks like holes may have been plugged for OB1 (if I set the PML to ob1, I don't see these messages), but apparently that is not the case when using other PMLs.
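
For anyone who wants to reproduce the OB1 comparison, this is roughly how the PML can be forced to ob1. It is a minimal sketch using the standard MCA parameter mechanism; `./hello_c` is a placeholder name for the hello-world test binary, not something taken from the output above.

```shell
# Select the ob1 PML for every Open MPI job launched from this shell.
export OMPI_MCA_pml=ob1

# Equivalent per-run form (./hello_c is a placeholder for the test binary):
#   mpirun --mca pml ob1 -np 2 ./hello_c
```

Unsetting `OMPI_MCA_pml` (or dropping the `--mca pml ob1` option) lets Open MPI pick the PML on its own again, which is the case where the messages above show up.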