-
Notifications
You must be signed in to change notification settings - Fork 937
osc/rdma: fix data integrity issue in rdma get_accumulate #10634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
devreal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix 👍 One detail though
| /* result buffer is not necessarily contiguous. use the opal datatype engine to | ||
| * copy the data over in this case */ | ||
| struct iovec iov = {.iov_base = result, .iov_len = request->len}; | ||
| struct iovec iov = {.iov_base = result_start, .iov_len = request->len}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If result != NULL then result_start is NULL (per https://github.com/open-mpi/ompi/pull/10634/files#diff-e4bd30eae59fa8aabf9ee87dcb523aa60a292e7c09d1e84e014c1ae7e13e1e3cR281). result_start should always point to the start of the result buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review, updated the PR with the changes, there was also a memory leak here, https://github.com/open-mpi/ompi/pull/10634/files#diff-e4bd30eae59fa8aabf9ee87dcb523aa60a292e7c09d1e84e014c1ae7e13e1e3cR311 which is fixed.
|
When this PR is complete, will it need to be cherry-picked back to v4.1.x and v4.0.x, too? |
Use start of the result buffer to unpack the data. Signed-off-by: Aboorva Devarajan <abodevar@in.ibm.com>
874a8cb to
02d2f5c
Compare
Jeff, this test passes in |
|
@AboorvaDevarajan Can you please backport this fix to the release branches? |
Signed-off-by: Aboorva Devarajan abodevar@in.ibm.com