
Conversation

@MamziB (Contributor) commented Jul 21, 2022

OSC/UCX: Adding locks to win attach/detach and fixing build warnings

Signed-off-by: Mamzi Bayatpour mbayatpour@nvidia.com

int insert_index = -1, contain_index;
int ret = OMPI_SUCCESS;

ompi_osc_ucx_lock(MPI_LOCK_EXCLUSIVE,
Contributor

Shouldn't we take a shared lock in get_dynamic_win_info to make sure readers get consistent segments?

Contributor Author

@devreal I think the process that calls get_dynamic_win_info should already hold the lock on the target. That's why I did not want to lock it again.

Contributor

You're right. But this won't work: the process that attaches the segment can hold a shared lock on the window, so acquiring an exclusive lock will deadlock. The osc/rdma component has two different locks for that reason.
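For illustration, a minimal sketch of that two-lock idea, using plain pthreads in place of the osc/ucx window locks (the names and structure here are hypothetical, not the actual Open MPI code): the attach path updates its metadata under a separate lock, so it never has to upgrade a window lock the caller may already hold shared.

#include <pthread.h>
#include <stdio.h>

/* Stand-ins for the two locks: the MPI-level window lock (readers/writers)
 * and a separate "state" lock that only guards the attach/detach metadata. */
static pthread_rwlock_t win_lock   = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  state_lock = PTHREAD_MUTEX_INITIALIZER;

static void attach_segment(void)
{
    /* Taking win_lock exclusively here would self-deadlock whenever the
     * caller already holds it shared; the separate state_lock does not
     * interact with the window lock at all. */
    pthread_mutex_lock(&state_lock);
    puts("attach: updating the segment table under state_lock");
    pthread_mutex_unlock(&state_lock);
}

int main(void)
{
    pthread_rwlock_rdlock(&win_lock);   /* caller already holds a shared window lock */
    attach_segment();                   /* still safe: a different lock guards the table */
    pthread_rwlock_unlock(&win_lock);
    return 0;
}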

Contributor Author

@devreal Alright, I'll make the necessary changes. Thanks for the feedback.

Contributor Author

@devreal can you let me know your thoughts about the second commit?

Contributor

Yeah, I think that looks good. One last thing: it's no longer just the atomicity lock, so should we rename it to a state lock? That might avoid some confusion for future readers...

Contributor Author

Thanks for the feedback, I really appreciate it. Yes, I just modified it. Please take a look at the single commit.

int insert = -1;
int ret;

bool lock_needed = ompi_osc_need_acc_lock(module, ompi_comm_rank(module->comm));
Contributor

Shouldn't that be ompi_osc_need_acc_lock(module, target);? And same below for the lock call.

Contributor Author

Oh yes, good catch. It should be locking the target window. I will make the change.


bool lock_needed = ompi_osc_need_acc_lock(module, ompi_comm_rank(module->comm));
if (lock_needed) {
ompi_osc_ucx_lock(MPI_LOCK_SHARED, ompi_comm_rank(module->comm), 0, win);
Contributor

Is this the right lock? It seems like this should be the accumulate lock, like in start_atomicity?

Contributor Author

Since win attach locks at the OSC_UCX_STATE_LOCK_OFFSET offset, I think the process calling get_dynamic_win_info needs to acquire that same lock. Thoughts?

Contributor

I was just wondering: does ompi_osc_ucx_lock take the same lock as MPI_Win_lock? The name sounds like it... In that case we'd still run into a deadlock if the calling process already owns an MPI-level lock on the window.

@MamziB (Contributor Author) commented Jul 25, 2022

Yes. After I fixed the typo in the target value passed to get_dynamic_win_info, I now get an error complaining that I am trying to acquire a lock that has already been acquired (MPI_ERR_RMA_SYNC). It seems MPI_Win_lock ends up locking the same lock as ompi_osc_ucx_lock. In that case, we need to use the ACC lock for both window attach and get_dynamic_win_info.
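As a side note, that symptom matches what an error-checking lock reports when its owner tries to re-acquire it. A tiny pthreads analogy (not the osc/ucx code; the mutex merely stands in for the window lock):

#include <pthread.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&lock, &attr);

    pthread_mutex_lock(&lock);            /* first acquisition succeeds */
    int rc = pthread_mutex_lock(&lock);   /* relock by the owner: reported as an error */
    printf("second lock attempt: %s\n", strerror(rc));  /* EDEADLK, analogous to MPI_ERR_RMA_SYNC above */

    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}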

Contributor Author

@devreal I made the above changes. Please take a look.

@MamziB force-pushed the mamzi/dynamic-win-fixes-2 branch from 2f0ec89 to d9becec on July 25, 2022 at 21:44
@devreal (Contributor) left a comment

Getting there :)

}

static inline int get_dynamic_win_info(uint64_t remote_addr, ompi_osc_ucx_module_t *module,
static inline int get_dynamic_win_info(uint64_t remote_addr, struct ompi_win_t *win,
Contributor

We don't seem to use win anymore, maybe revert back to module here and in CHECK_DYNAMIC_WIN?

Contributor Author

Sure, sounds good. Let me add a new commit.

}

CHECK_DYNAMIC_WIN(remote_addr, module, target, ret);
CHECK_DYNAMIC_WIN(remote_addr, module, target, ret, win);
Contributor

I think there is still an issue: if !module->acc_single_intrinsic we take the state lock and then take it again in get_dynamic_win_info. Maybe check in get_dynamic_win_info whether the state lock was already taken (by passing lock_acquired down from above)?

Contributor Author

Good catch. Yes, we need to check that. I'll add this.
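For illustration, a minimal sketch of that lock_acquired idea, with a plain pthread mutex standing in for the osc/ucx state lock (hypothetical names, not the actual patch): the caller passes down whether it already holds the lock, and the helper only locks when it does not.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

static void get_dynamic_win_info(bool lock_acquired)
{
    if (!lock_acquired) {                   /* only lock if the caller has not already */
        pthread_mutex_lock(&state_lock);
    }
    puts("reading the attached-segment table");
    if (!lock_acquired) {
        pthread_mutex_unlock(&state_lock);
    }
}

int main(void)
{
    bool acc_single_intrinsic = false;      /* mirrors !module->acc_single_intrinsic */
    bool lock_acquired = false;

    if (!acc_single_intrinsic) {            /* the outer path takes the state lock first */
        pthread_mutex_lock(&state_lock);
        lock_acquired = true;
    }
    get_dynamic_win_info(lock_acquired);    /* no double-acquire */
    if (lock_acquired) {
        pthread_mutex_unlock(&state_lock);
    }
    return 0;
}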

@devreal (Contributor) left a comment

One minor comment. Please squash and then it's ready to go 👍

base, len, insert);
}
}
extern bool ompi_osc_need_acc_lock(ompi_osc_ucx_module_t *module, int target)
Contributor

No need for extern here and on the functions below, since they have already been declared extern inline.

Contributor Author

Not ready yet. @devreal Actually, it looks like the new change (using the ACC lock instead of the main application lock) leads to a hang at large scale. The reason is that we must not perform a window attach while other processes are accessing this dynamic window. Currently the ACC lock is only held during get_dynamic_win_info(), which is not enough; we must hold the ACC lock for the entire RMA operation. Working on this change.

Contributor Author

I'm just thinking out loud. Since we only need the lock to read the window info (not the data inside the window), ideally we should not need to hold the ACC lock during the entire RMA operation, but somehow using this extra ACC lock instead of the application lock leads to a race condition.

Contributor

> The reason is that we must not perform a window attach while other processes are accessing this dynamic window.

I'm not sure I understand the issue. Why not? We shouldn't keep the acc lock longer than we need to, i.e., we should release it as soon as we have queried the attached segments.
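A minimal sketch of that scoping, again with pthreads and a plain array standing in for the osc/ucx structures (hypothetical names): the lock is held only while copying the attached-segment info, and released before the copy is used for the transfer.

#include <pthread.h>
#include <string.h>
#include <stdio.h>

#define MAX_SEGMENTS 16

typedef struct { unsigned long base; unsigned long len; } segment_t;

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static segment_t segments[MAX_SEGMENTS];
static int num_segments;

static int query_segments(segment_t *out)
{
    pthread_mutex_lock(&state_lock);            /* lock only for the query ...            */
    int n = num_segments;
    memcpy(out, segments, n * sizeof(segment_t));
    pthread_mutex_unlock(&state_lock);          /* ... released before the actual transfer */
    return n;
}

int main(void)
{
    segment_t snapshot[MAX_SEGMENTS];
    int n = query_segments(snapshot);
    printf("operating on a consistent snapshot of %d segment(s)\n", n);
    return 0;
}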

Contributor Author

I need to investigate further to find out exactly what in the current state of the patch leads to the hang. I don't get the hang if I omit today's changes (adding the acc lock). I will update you.

Contributor Author

Hey @devreal, I fixed the hang issue. Please let me know if you have any more comments. Thanks.

@devreal (Contributor) commented Jul 26, 2022

Looks good to me now, please squash and I will approve

@MamziB force-pushed the mamzi/dynamic-win-fixes-2 branch 2 times, most recently from 6496081 to 723f17a on July 26, 2022 at 20:54
@MamziB (Contributor Author) commented Jul 26, 2022

@devreal Thanks for the feedback. The commit is ready.

Signed-off-by: Mamzi Bayatpour  <mbayatpour@nvidia.com>
@MamziB force-pushed the mamzi/dynamic-win-fixes-2 branch from 723f17a to 56afecd on July 26, 2022 at 20:55
@awlauria merged commit 5c302ac into open-mpi:main on Aug 2, 2022
@awlauria (Contributor) commented Aug 2, 2022

@MamziB please cherry-pick to v5.0.x. Thanks
