opal/datatype: fix opal_convertor_raw() #6295
correctly handle the case in which iovec is full and the last accessed element of the datatype is the beginning of a loop

Refs. open-mpi#6285

Thanks Axel Huebl for reporting this

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
@edgargabriel FYI, this fixes the hdf5 data corruption reported in #6285
@ggouaillardet thank you, I will test it later today. I can confirm, however, that it is in line with my analysis over the weekend: the values that ompio received from the convertor_raw function looked incorrect.
bot:lanl:retest
@ggouaillardet @edgargabriel thank you a lot for finding the root source of this and the PR!
@ax3l if I remember correctly, the convertor_raw function is also used in the osc/rdma component.
bot:ompi:retest
Looks fine to me. Let's go ahead and bring this into master. Please PR to the affected branches and tag @bosilca to review.
For the sake of completeness, the solution proposed here does not work in all cases. A better approach has been proposed in #6326.
@bosilca Great. Will test with that. Trying to get ARMCI fully tested with osc/rdma and this helped it pass.
The underlying issue can be replicated with a simple vector, as long as the iov_count is smaller than the count of the vector. Moreover, now that I understand the issue, I think we might have the same type of problem in pack/unpack, which is what #6172 is trying to address. I will try to take a look tomorrow.
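To make the scenario concrete, here is a minimal, self-contained sketch of describing a strided vector through an iovec array that is smaller than the vector's count. It is a toy model, not Open MPI code: `toy_vector_t`, `toy_position_t`, `toy_vector_raw()` and `toy_drain()` are hypothetical names invented for illustration and do not mirror the real `opal_convertor_raw()` internals. It only shows the invariant any fix has to preserve: when the iovec fills up mid-loop, the saved position must let the next call resume at exactly the next block, so that the chunks together cover the whole datatype once.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for a vector datatype: `count` blocks of
 * `blocklen` bytes, consecutive blocks `stride` bytes apart. */
typedef struct {
    size_t count;
    size_t blocklen;
    size_t stride;
} toy_vector_t;

/* Resume state kept between calls: index of the next block to emit. */
typedef struct {
    size_t next_block;
} toy_position_t;

/* Fill at most *iov_count entries of iov with the next blocks of the vector.
 * On return, *iov_count is the number of entries written and *length the
 * number of bytes they describe. Returns 1 once the whole vector has been
 * described, 0 if more calls are needed. */
static int toy_vector_raw(const toy_vector_t *vec, char *base,
                          toy_position_t *pos, struct iovec *iov,
                          size_t *iov_count, size_t *length)
{
    size_t used = 0, bytes = 0;

    while (used < *iov_count && pos->next_block < vec->count) {
        iov[used].iov_base = base + pos->next_block * vec->stride;
        iov[used].iov_len = vec->blocklen;
        bytes += vec->blocklen;
        used++;
        pos->next_block++; /* advance only after the entry is stored */
    }
    *iov_count = used;
    *length = bytes;
    return pos->next_block == vec->count;
}

/* Describe the whole vector using an iovec array of capacity `cap`
 * (1..16). Returns the total bytes described and stores the total
 * number of iovec entries produced in *entries. */
static size_t toy_drain(const toy_vector_t *vec, char *base, size_t cap,
                        size_t *entries)
{
    struct iovec iov[16];
    toy_position_t pos = {0};
    size_t total = 0;
    int done = 0;

    assert(cap > 0 && cap <= 16);
    *entries = 0;
    while (!done) {
        size_t n = cap, len = 0;
        done = toy_vector_raw(vec, base, &pos, iov, &n, &len);
        total += len;
        *entries += n;
    }
    return total;
}
```

With a vector of 5 blocks and a capacity of 2, `toy_drain()` needs three calls to `toy_vector_raw()`; the totals should match what a single call with a large enough iovec array would report. A regression test for the real convertor could follow the same pattern, comparing chunked output against a one-shot description.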
correctly handle the case in which iovec is full and the last accessed element of the datatype is the beginning of a loop

Refs. open-mpi#6285

Thanks Axel Huebl for reporting this

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

cherry-pick open-mpi#6295 into 3.1

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>