Errata on MPI_INTERCOMM_CREATE - the tag argument lost its description #750

Open
@RolfRabenseifner

Description

Problem

From MPI-1.1 to MPI-2.2, the tag argument had an understandable description:

MPI_INTERCOMM_CREATE(local_comm, local_leader, peer_comm, remote_leader, tag,
newintercomm)
IN / local_comm / local intra-communicator (handle)
IN / local_leader / rank of local group leader in local_comm (integer)
IN / peer_comm / “peer” communicator; significant only at the local_leader (handle)
IN / remote_leader / rank of remote group leader in peer_comm; significant only at the local_leader (integer)
IN / tag / “safe” tag (integer)
OUT / newintercomm / new inter-communicator (handle)
...
This call creates an inter-communicator. It is collective over the union of the local and
remote groups. Processes should provide identical local_comm and local_leader arguments
within each group. Wildcards are not permitted for remote_leader, local_leader, and tag.

This call uses point-to-point communication with communicator
peer_comm, and with tag tag between the leaders. Thus, care must be taken that there be
no pending communication on peer_comm that could interfere with this communication.

Advice to users. We recommend using a dedicated peer communicator, such as a
duplicate of MPI_COMM_WORLD, to avoid trouble with peer communicators. (End
of advice to users.)

For MPI-3.0, the MPI Forum decided to remove the restriction:

Thus, care must be taken that there be no pending communication on peer_comm that could interfere with this communication.

The goal was that the tags used for point-to-point communication and those used for inter-communicator creation should no longer interfere with each other.

In MPI-3.0, this decision was implemented by removing the whole last paragraph together with the Advice to users. Therefore, from MPI-3.0 through MPI-4.1, the whole procedure description is only:

This call creates an inter-communicator. It is collective over the union of the local and
remote groups. Processes should provide identical local_comm and local_leader arguments
within each group. Wildcards are not permitted for remote_leader, local_leader, and tag.

This means that there is no longer any description of the tag:

  • Its meaning can no longer be derived from the text and matching rules for point-to-point communication.
  • There is no sentence stating that the tags must be identical at all processes, or at least at both local leaders.
  • There is no sentence that allows one to decide whether the provided ring example (MPI-4.0, 7.6.3 Inter-Communication Examples, Example 2 on pages 363-364) requires different tags or can be implemented with the same tag for all three inter-communicators. (I expect that the latter was allowed under the text of MPI-1.1 to MPI-2.2, because the matching rules would differentiate the internal messages by their different source/dest combinations.)
  • There is no sentence about what happens if multi-threaded processes concurrently create two or more inter-communicators.
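The expectation in the third bullet can be checked with a small toy model. The sketch below is not MPI code and does not describe any real implementation. It assumes, purely for illustration, that each inter-communicator creation makes the two leaders exchange one message in each direction on peer_comm, and that messages are told apart only by their (source, dest, tag) envelope. The names envelope_t, handshake_t, and handshakes_distinguishable are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a point-to-point envelope on peer_comm (not an MPI type). */
typedef struct { int source; int dest; int tag; } envelope_t;

/* One inter-communicator creation: the two leaders' ranks in peer_comm
 * and the tag passed to the call (hypothetical type). */
typedef struct { int leader_a; int leader_b; int tag; } handshake_t;

enum { MAX_HANDSHAKES = 16 };

static bool same_envelope(envelope_t x, envelope_t y)
{
    return x.source == y.source && x.dest == y.dest && x.tag == y.tag;
}

/* Returns true if no two leader messages share the same envelope.
 * Assumption: every handshake sends exactly one message in each
 * direction between its two leaders. */
bool handshakes_distinguishable(const handshake_t *h, size_t n)
{
    envelope_t env[2 * MAX_HANDSHAKES];
    assert(n <= MAX_HANDSHAKES);

    for (size_t i = 0; i < n; ++i) {
        env[2 * i]     = (envelope_t){ h[i].leader_a, h[i].leader_b, h[i].tag };
        env[2 * i + 1] = (envelope_t){ h[i].leader_b, h[i].leader_a, h[i].tag };
    }
    for (size_t i = 0; i < 2 * n; ++i)
        for (size_t j = i + 1; j < 2 * n; ++j)
            if (same_envelope(env[i], env[j]))
                return false;
    return true;
}
```

Under these assumptions, three creations between leader pairs (0,1), (1,2), and (0,2) that all use the same tag are still distinguishable, whereas two creations between the same pair of leaders with the same tag are not.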

In addition, I cannot understand why the tag description

IN / tag / “safe” tag (integer)

does not say

IN / tag / “safe” tag; significant only at the local_leader (integer)

The examples listed at hotexamples were not really useful for me:
https://cpp.hotexamples.com/examples/-/-/MPI_Intercomm_create/cpp-mpi_intercomm_create-function-examples.html

Proposal

A) Because the calls are collective over the union of the local and remote groups, the tags are not needed for matching between the processes within the local or remote group.
Therefore, the use of the tag can be restricted to the leaders of the groups:

IN / tag / “safe” tag; significant only at the local_leader (integer)

The rule can be:

The tags provided by the local and remote leaders must be identical. In the case of two concurrent invocations (e.g., on several threads) with the same peer_comm and the same leaders within the peer_comm, different tags must be used in the two invocations.
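The second sentence of this rule can be written down as a predicate. This is only a sketch of how I read the rule, not text from the standard or from PR 878. The struct invocation_t, the integer peer_comm_id standing in for a communicator handle, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one MPI_INTERCOMM_CREATE invocation
 * as seen from the leaders. */
typedef struct {
    int peer_comm_id;  /* stand-in for the peer_comm handle (assumption) */
    int leader_a;      /* rank of one leader in peer_comm */
    int leader_b;      /* rank of the other leader in peer_comm */
    int tag;           /* tag passed by the leaders */
} invocation_t;

/* The same two leaders, regardless of which side calls itself "local". */
static bool same_leader_pair(const invocation_t *x, const invocation_t *y)
{
    return (x->leader_a == y->leader_a && x->leader_b == y->leader_b) ||
           (x->leader_a == y->leader_b && x->leader_b == y->leader_a);
}

/* Returns true if two concurrent invocations break rule A: same
 * peer_comm, same leaders within it, and the same tag. */
bool concurrent_invocations_conflict(const invocation_t *x,
                                     const invocation_t *y)
{
    assert(x != NULL && y != NULL);
    return x->peer_comm_id == y->peer_comm_id &&
           same_leader_pair(x, y) &&
           x->tag == y->tag;
}
```

In this reading, two threads that create inter-communicators between the same leaders on the same peer_comm conflict only when they also pass the same tag. A different peer_comm or a different leader pair removes the conflict.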

B) I expect that A) was the original intention, because from MPI-1.1 to MPI-2.2 the tag was used only on the peer_comm. Now, however, its main use is to distinguish concurrent calls, and therefore it should be specified as in all the related APIs:
MPI_COMM_CREATE_GROUP, MPI_COMM_CREATE_FROM_GROUP, and MPI_INTERCOMM_CREATE_FROM_GROUPS

Therefore, I propose to only add:

All MPI processes of the union of the local and remote groups must provide an identical \mpiarg{tag} value;
it differentiates concurrent calls in a multithreaded environment.
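The difference between A) and B) is which processes' tag values matter. The following sketch contrasts the two readings on a list of per-process tag values. It is an illustration under my own assumptions, and participant_t and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One process in the union of the local and remote groups
 * (hypothetical model, not an MPI type). */
typedef struct {
    bool is_leader;  /* true for the local_leader of either group */
    int  tag;        /* tag value this process passes to the call */
} participant_t;

/* Reading A: the tag is significant only at the leaders, and the
 * leaders' tags must be identical. */
bool satisfies_reading_a(const participant_t *p, size_t n)
{
    bool have_leader = false;
    int leader_tag = 0;

    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (!p[i].is_leader)
            continue;           /* non-leader tags are ignored */
        if (!have_leader) {
            leader_tag = p[i].tag;
            have_leader = true;
        } else if (p[i].tag != leader_tag) {
            return false;
        }
    }
    return true;
}

/* Reading B: every process in the union must pass the same tag. */
bool satisfies_reading_b(const participant_t *p, size_t n)
{
    assert(p != NULL || n == 0);
    for (size_t i = 1; i < n; ++i)
        if (p[i].tag != p[0].tag)
            return false;
    return true;
}
```

Any set of calls that satisfies reading B also satisfies reading A, but not the other way around: a program in which only the leaders agree on the tag is valid under A and invalid under B.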

Changes to the Text

See pull request PR 878

Impact on Implementations

None, because this describes what was intended by the change in MPI-3.0.

Impact on Users

None.

References and Pull Requests

https://github.com/mpi-forum/mpi-standard/pull/878

Metadata

Labels

  • chap-contexts: Groups, Contexts, Communicators, Caching Chapter Committee
  • errata: Errata items for the previous MPI Standard
  • mpi-6: For inclusion in the MPI 5.1 or 6.0 standard
  • no-wg: Discussion doesn't have a current working group
