Description
Problem
The text for MPI_Abort does not make clear whether a call to MPI_Abort is allowed to return. The text states that:
This routine makes a “best attempt” to abort all MPI processes in the group of comm.
Does that mean that even the calling process is not required to be aborted? Or that some processes not calling MPI_Abort may survive?
This function does not require that the invoking environment take any action with the error code. However, a Unix or POSIX environment should handle this as a return errorcode from the main program.
It is not clear that this text adds any information, because it never says what "the invoking environment" actually is.
Curiously, MPI_Abort has an integer return type, which suggests that it may return. But what is the application supposed to do then? And what may be returned from MPI_Abort?
As for why this is bad, consider the following example:
int foo(int x) {
    if (x >= 0) {
        return do_something_useful(x);
    }
    // error case, go up in flames
    // (MPI_Abort takes a communicator and an error code; 1 is an arbitrary choice)
    MPI_Abort(MPI_COMM_WORLD, 1);
    abort(); // make sure there will be flames
}
The abort() call is needed both because we don't know whether MPI_Abort might return and because the compiler will complain about a missing return from the function.
For comparison: the C function abort has return type void and is explicitly specified to never return. Thus, a typical stdlib.h has the following declaration for abort:
extern void abort (void) __attribute__ ((__noreturn__));
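Until the standard settles the question, one possible user-side workaround is to hide both calls behind a single wrapper that is itself marked as non-returning. The following is only a sketch under stated assumptions: `fake_mpi_abort` is a hypothetical stand-in for MPI_Abort (used so that the snippet compiles without an MPI installation), `die_after_abort` is a hypothetical helper name, and `do_something_useful` is given a placeholder body. It relies on the C11 `_Noreturn` function specifier.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for MPI_Abort(comm, errorcode). Like the real
 * function it returns int, so the compiler must assume it can return. */
static int fake_mpi_abort(int errorcode) {
    fprintf(stderr, "abort requested with error code %d\n", errorcode);
    return 0; /* models a "best attempt" that did not terminate the caller */
}

/* Hypothetical helper: try the MPI-level abort first, then fall back to
 * abort(). _Noreturn (C11) tells the compiler that control never comes back. */
static _Noreturn void die_after_abort(int errorcode) {
    (void)fake_mpi_abort(errorcode);
    abort(); /* reached only if the abort attempt above returned */
}

/* Placeholder body; the original example does not define this function. */
static int do_something_useful(int x) {
    return 2 * x;
}

int foo(int x) {
    if (x >= 0) {
        return do_something_useful(x);
    }
    /* Error case: no trailing return or abort() is needed here, because
     * the helper is declared as non-returning. */
    die_after_abort(1);
}
```

With this pattern the extra abort() call appears once in the helper instead of at every call site, but it is still needed as long as the standard leaves the behavior open.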
Proposal
Clearly state whether MPI_Abort may ever return or not.
Changes to the Text
My preference: add a sentence that states
A call to MPI_Abort does not return.
Alternatively, clearly state that MPI_Abort may return, in which cases it may do so, and what users are allowed to do afterwards (probably not call any further MPI functions).
Impact on Implementations
If MPI_Abort is never allowed to return, implementations may annotate the function signature with __attribute__ ((__noreturn__)) where the compiler supports it. Changing the return type would require changes in implementations.
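As an illustration of what such an annotation could look like, here is a minimal sketch. It is not taken from any MPI implementation: `SKETCH_NORETURN`, `sketch_comm`, and `sketch_abort` are hypothetical names standing in for a portability macro, MPI_Comm, and MPI_Abort, and the function body simply calls exit(). The int return type is kept, so existing callers would still compile.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical portability macro; not part of any real MPI header. */
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_NORETURN __attribute__((__noreturn__))
#else
#  define SKETCH_NORETURN /* attribute not supported: expands to nothing */
#endif

/* Hypothetical stand-in for MPI_Comm. */
typedef int sketch_comm;

/* Hypothetical stand-in for MPI_Abort: same shape of signature, plus the
 * non-returning annotation on the declaration. */
SKETCH_NORETURN int sketch_abort(sketch_comm comm, int errorcode);

int sketch_abort(sketch_comm comm, int errorcode) {
    (void)comm; /* unused in this sketch */
    fprintf(stderr, "sketch_abort: error code %d\n", errorcode);
    exit(errorcode); /* never returns, so no return statement is needed */
}

/* Placeholder body; the original example does not define this function. */
static int do_something_useful(int x) {
    return x + 1;
}

int foo(int x) {
    if (x >= 0) {
        return do_something_useful(x);
    }
    /* With the annotation, a supporting compiler knows control ends here
     * and should not warn about a missing return. */
    sketch_abort(0, 1);
}
```

On compilers without the attribute the macro expands to nothing, so the code still compiles but the missing-return diagnostic may come back.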
Impact on Users
Users no longer have to care about what happens if MPI_Abort returns. The additional abort() call in the example is no longer needed.
References and Pull Requests