Use NEVER_ALONE for chained MPI calls #184

Conversation
@@ -201,61 +202,14 @@ TEST_CASE_METHOD(RemoteMpiTestFixture,
    thisWorld.destroy();
}

TEST_CASE_METHOD(RemoteMpiTestFixture, "Test barrier across hosts", "[mpi]")
With the new scheduling policy, we will always have at least two remote processes. Thus, we can't test a barrier across hosts without using a distributed setting: both remote ranks must call `barrier` to unlock.
I managed to port the other tests in this file.
Ok, nice. We have quite a lot of tests like this where we fake a distributed setting locally, all of which date from before the distributed tests.
Is there a dist test that covers this now? I can't remember. If not, could we add one? Happy to discuss specifics of this offline as I'm not sure we have any Faabric dist tests that cover the MPI implementation.
In future it might be worth trying to port things to dist tests rather than keep the hacky local versions, but in this instance it seems to work ok.
I am introducing distributed tests for MPI (inheriting from `mpi-native` in #186). The order in which we merge this PR and the other one does not really matter.
In this PR I set MPI functions to use the `NEVER_ALONE` scheduling topology hint by default. Preliminary results show that this policy reduces the number of cross-host messages by 20% on our baseline experiment.
I had to update some tests in the `test_remote_mpi_worlds.cpp` file, as they assumed a specific scheduling behaviour.