-
Notifications
You must be signed in to change notification settings - Fork 903
OSC/UCX: preserve the accumulate ordering for overlapping buffers during acc-lock less epocs and setting a proper wpool context mutex type #11178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work @MamziB! Just a few questions/hints in the comments.
acc-lock less epocs Signed-off-by: Mamzi Bayatpour <mbayatpour@nvidia.com> Co-authored-by: Tomislav Janjusic <tomislavj@nvidia.com>
5c3059c
to
e3c3391
Compare
@devreal Thanks for the comments. Please take a look at the updated commit. |
that this object has been already getting constructed using opal_recursive_mutex_t inside opal_common_ucx_wpctx_create, but inside opal_common_ucx_ctx_t, it was missing the proper type. Signed-off-by: Mamzi Bayatpour <mbayatpour@nvidia.com> Co-authored-by: Tomislav Janjusic <tomislavj@nvidia.com>
|
||
if (!op_added) { | ||
/* no more space so flush */ | ||
ret = opal_common_ucx_ctx_flush(module->ctx, OPAL_COMMON_UCX_SCOPE_WORKER, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to hold the ctx mutex during the flush?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the ctx mutex is recursive, so it should be ok. it does not hang.
Thanks, it good now. Two more thoughts though:
Those are optimizations though and we could do them as a follow-up. I'll approve what we have now 👍 |
@devreal I'll open an issue to track the suggestions as a follow up - thanks for the comments. |
@MamziB v5.0.x backport please, when you can. |
@devreal Thanks for the suggestions. My comments are as follows:
|
OSC/UCX: preserve the accumulate ordering for overlapping buffers during acc-lock less epocs and setting a proper wpool context mutex type
Signed-off-by: Mamzi Bayatpour mbayatpour@nvidia.com
Co-authored-by: Tomislav Janjusic tomislavj@nvidia.com