v4.1.x: libnbc: Fix int overflow when handling the count parameter #9622

Merged
2 commits merged into open-mpi:v4.1.x from v41-libnbc-fix-overflow
Nov 4, 2021

@jjhursey (Member) commented Nov 3, 2021

 * In a reduce_scatter operation, if the count array adds up to a
   value greater than `INT_MAX`, the count passed around overflows
   and becomes negative. This leads to an invalid buffer being passed
   around, often resulting in a segfault.
 * The fix is to preserve the true count as a `size_t` at all
   levels in the schedule (hence the change to the protocol
   structures).
   - Instead of changing the count parameter of `ompi_op_reduce`, we
     iterate over the buffer in chunks of at most `INT_MAX` elements,
     reducing each in turn.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 6b8e368)
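
For illustration only, here is a minimal sketch of the overflow described above. It is not the code from this PR: `total_count` and `fits_in_int` are hypothetical helper names, and the sketch assumes the per-rank counts are non-negative. It shows the total being accumulated in a `size_t` rather than an `int`, so that a sum above `INT_MAX` stays representable.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not from the PR): sum the per-rank counts of a
 * reduce_scatter-style call. Accumulating in size_t keeps the true total
 * even when it exceeds INT_MAX; accumulating in int would overflow. */
static size_t total_count(const int *recvcounts, int nranks)
{
    size_t total = 0;
    for (int r = 0; r < nranks; ++r) {
        assert(recvcounts[r] >= 0);        /* assumption: counts are non-negative */
        total += (size_t) recvcounts[r];
    }
    return total;
}

/* Hypothetical helper: returns 1 if the total can be passed as an int count. */
static int fits_in_int(size_t total)
{
    return total <= (size_t) INT_MAX;
}
```

With counts `{INT_MAX, 1}`, `total_count` yields `INT_MAX + 1` as a `size_t`, and `fits_in_int` reports that it cannot be handed to an `int`-count interface in a single call.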
 * If the `count` is greater than `INT_MAX`, then we call the
   operation in chunks that each fit into an `int`.
 * This moves the functionality out of libnbc and into the common
   reduction operation so that all collectives may pass counts
   larger than `INT_MAX` into the internal reduction operation.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 6075048)
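
The chunking idea can be sketched roughly as follows. This is a sketch under stated assumptions, not the Open MPI implementation: `reduce_fn_t`, `sum_int`, and `reduce_chunked` are hypothetical names, the kernel is a simple element-wise integer sum, and the chunk limit is a parameter (`max_chunk`) so that the loop can be exercised with small buffers; a real caller would presumably pass `INT_MAX`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a reduction kernel that only accepts an
 * int count, combining src into dst element by element. */
typedef void (*reduce_fn_t)(const void *src, void *dst, int count);

/* Example kernel (assumption): dst[i] += src[i] for int elements. */
static void sum_int(const void *src, void *dst, int count)
{
    const int *s = src;
    int *d = dst;
    for (int i = 0; i < count; ++i) {
        d[i] += s[i];
    }
}

/* Reduce `count` elements by calling `fn` on chunks of at most
 * `max_chunk` elements. Returns the number of kernel calls made. */
static size_t reduce_chunked(reduce_fn_t fn, const void *src, void *dst,
                             size_t count, size_t elem_size, int max_chunk)
{
    const char *s = src;
    char *d = dst;
    size_t done = 0, calls = 0;

    assert(max_chunk > 0);
    while (done < count) {
        size_t remaining = count - done;
        /* Each chunk is small enough to be passed as an int. */
        int chunk = remaining > (size_t) max_chunk ? max_chunk
                                                   : (int) remaining;
        fn(s + done * elem_size, d + done * elem_size, chunk);
        done += (size_t) chunk;
        ++calls;
    }
    return calls;
}
```

Calling `reduce_chunked(sum_int, src, dst, 10, sizeof(int), 4)` would invoke the kernel three times, on 4, 4, and 2 elements. Keeping the loop in one shared wrapper, rather than in each collective, mirrors the motivation given in the second commit message.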
@jsquyres jsquyres added this to the v4.1.2 milestone Nov 3, 2021
@jsquyres jsquyres changed the title libnbc: Fix int overflow when handling count parameters v4.1.x: libnbc: Fix int overflow when handling count parameters Nov 3, 2021
@bosilca bosilca changed the title v4.1.x: libnbc: Fix int overflow when handling count parameters v4.1.x: libnbc: Fix int overflow when handling the count parameter Nov 3, 2021
@bwbarrett bwbarrett merged commit dc0f4ad into open-mpi:v4.1.x Nov 4, 2021
@jjhursey jjhursey deleted the v41-libnbc-fix-overflow branch November 4, 2021 14:44