Patrick Flick notes:
For the v-collectives, I've been able to use the regular MPI_Alltoallw and avoid the neighborhood collectives. I wanted to share my approach and ask for your opinion. In my experience this works with current versions of MPICH and Open MPI.
As you mentioned in your paper, MPI_Alltoallw takes int displacements, and thus can't be used directly for data larger than INT_MAX. What works for me is to wrap each datatype (the one built with MPI_Type_contiguous) in an MPI_Type_create_struct with a single element, specifying the required offset as that element's displacement, which is an MPI_Aint. MPI_Alltoallw can then be called with a count of 1 and a displacement of 0 for every process.
This is a great idea. It should be implemented in BigMPI.