btl/smcuda: Add atomic_wmb() before sm_fifo_write #12338

Merged
merged 1 commit into from
Feb 15, 2024
2 changes: 2 additions & 0 deletions opal/mca/btl/smcuda/btl_smcuda_fifo.h
@@ -85,6 +85,8 @@ static void add_pending(struct mca_btl_base_endpoint_t *ep, void *data, bool res
 #define MCA_BTL_SMCUDA_FIFO_WRITE(endpoint_peer, my_smp_rank, peer_smp_rank, hdr, resend, \
                                   retry_pending_sends, rc) \
     do { \
+        /* memory barrier: ensure writes to the hdr have completed */ \
+        opal_atomic_wmb(); \
Comment on lines +88 to +89
Contributor

Is there a difference between placing the barrier before the write versus after it?

Typically I see the barrier after the actual write.

Contributor Author

Yes, I debated this myself as well.

MCA_BTL_SMCUDA_FIFO_WRITE is called from several places. Each of these would need an update to include a write barrier, and any new call would need to include that as well. For this reason I felt it best to embed the barrier into the fifo_write as a "make sure hdr is committed" sort of step rather than relying on all functions which fill the header.
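
The trade-off described above can be sketched in isolation. This is a minimal, hypothetical illustration and not the Open MPI code: `toy_hdr_t`, `toy_fifo_t`, `TOY_FIFO_WRITE`, and the two `toy_send_*` callers are invented names, and C11's `atomic_thread_fence(memory_order_release)` is assumed here as a stand-in for `opal_atomic_wmb()`. It shows how a barrier placed inside the write macro covers every call site, so no caller that fills a header has to issue its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical header type; stands in for the real fragment header. */
typedef struct {
    int tag;
    size_t len;
} toy_hdr_t;

/* Hypothetical single-slot "FIFO": a pointer the peer would poll. */
typedef struct {
    toy_hdr_t *_Atomic slot;
} toy_fifo_t;

/* The barrier lives inside the macro, so callers that fill a header
 * cannot forget it. The release fence is an assumed stand-in for
 * opal_atomic_wmb(). */
#define TOY_FIFO_WRITE(fifo, hdr)                                    \
    do {                                                             \
        /* order the writes to *hdr before the publishing store */   \
        atomic_thread_fence(memory_order_release);                   \
        atomic_store_explicit(&(fifo)->slot, (hdr),                  \
                              memory_order_relaxed);                 \
    } while (0)

/* Two independent call sites; neither needs its own barrier. */
static void toy_send_eager(toy_fifo_t *fifo, toy_hdr_t *hdr, size_t len)
{
    assert(fifo != NULL && hdr != NULL);
    hdr->tag = 1;
    hdr->len = len;
    TOY_FIFO_WRITE(fifo, hdr);
}

static void toy_send_ack(toy_fifo_t *fifo, toy_hdr_t *hdr)
{
    assert(fifo != NULL && hdr != NULL);
    hdr->tag = 2;
    hdr->len = 0;
    TOY_FIFO_WRITE(fifo, hdr);
}
```

The alternative, a barrier at the end of each `toy_send_*` function, would be equivalent for these two callers, but every future caller would have to remember to add it.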

Member

The write in question here is not the integration of the header into the list but the writes related to the content of the header. This looks good to me.

Looking at the linked code, it appears that the current fifo_write also has a write barrier after the item integration. I can't figure out why we need that one.
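
The distinction drawn here, between the writes that fill the header and the write that makes it visible to the peer, can be sketched with C11 atomics. This is a hypothetical illustration under assumed names (`toy_msg_t`, `toy_slot_t`, `toy_publish`, `toy_poll`), not the smcuda implementation; the release and acquire fences are assumed stand-ins for `opal_atomic_wmb()` and `opal_atomic_rmb()`. It covers only the barrier before the publishing store, not the second barrier questioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message contents ("the content of the header"). */
typedef struct {
    int tag;
    int payload;
} toy_msg_t;

/* Hypothetical shared slot: contents plus a flag that publishes them. */
typedef struct {
    toy_msg_t msg;
    atomic_bool ready;
} toy_slot_t;

/* Producer: fill the contents, then fence, then publish. */
static void toy_publish(toy_slot_t *s, int tag, int payload)
{
    assert(s != NULL);
    s->msg.tag = tag;                          /* content writes */
    s->msg.payload = payload;
    atomic_thread_fence(memory_order_release); /* write barrier */
    atomic_store_explicit(&s->ready, true,     /* publishing write */
                          memory_order_relaxed);
}

/* Consumer: observe the flag, then fence, then read the contents. */
static bool toy_poll(toy_slot_t *s, toy_msg_t *out)
{
    assert(s != NULL && out != NULL);
    if (!atomic_load_explicit(&s->ready, memory_order_relaxed)) {
        return false;                          /* nothing published yet */
    }
    atomic_thread_fence(memory_order_acquire); /* read barrier */
    *out = s->msg;
    return true;
}
```

Without the release fence, a weakly ordered CPU could make the `ready` store visible before the stores to `msg`, and a consumer on another core could then read a partially filled message.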

         sm_fifo_t *_fifo = &(mca_btl_smcuda_component.fifo[peer_smp_rank][FIFO_MAP(my_smp_rank)]);\
         \
         if (retry_pending_sends) { \