Conversation

@csegarragonz (Collaborator)

No description provided.

csegarragonz added a commit that referenced this pull request Mar 1, 2024
@csegarragonz csegarragonz changed the title transport: shortcut need to order messages on reception transport(ptp): shortcut need to order messages on reception Mar 1, 2024
csegarragonz added a commit that referenced this pull request Mar 13, 2024
csegarragonz added a commit that referenced this pull request Mar 25, 2024
In this PR we bypass the PTP layer for data-plane (i.e.
message-passing) operations in our MPI implementation. The goal is to
increase throughput and reduce latency, particularly for small MPI
messages, by giving each MPI rank a (thread-local) receiver thread and
a separate port.

Note that, unlike our original PTP-less MPI transport layer, this
approach only requires one additional port per physical core,
irrespective of application size or the number of concurrent MPI
applications.

For the moment this is MPI-specific, but in the future we could use a
similar per-Faaslet recv-socket approach wherever we need low latency
and high throughput.

As a side effect, this also fixes all the (re-)ordering issues in the
PTP layer that were forcing us to run faabric with a very high number
of PTP server threads.

Closes #335
Closes #389
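
As a rough sketch of the idea only (not faabric's actual transport code: the bare cppzmq usage and every name below, e.g. `BASE_MPI_PORT`, `getRankRecvSocket`, `recvMpiMessage`, are illustrative assumptions), each rank could lazily bind a thread-local receive socket to a port derived from its local rank index and block on it directly, bypassing the shared PTP server threads:

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

#include <zmq.hpp>

// Hypothetical base port: one receive port per local rank slot (i.e. per
// physical core). Concurrent MPI applications reuse the same slots, so the
// number of extra ports does not grow with the number of applications.
static constexpr int BASE_MPI_PORT = 8800;

static zmq::context_t& getGlobalContext()
{
    static zmq::context_t ctx(1);
    return ctx;
}

// Each MPI rank runs in its own thread (Faaslet), so a thread_local socket
// gives exactly one receive endpoint per rank without any locking. The
// socket is lazily bound on first use to a port derived from the rank's
// local index.
static zmq::socket_t& getRankRecvSocket(int localRankIdx)
{
    thread_local zmq::socket_t recvSocket(getGlobalContext(),
                                          zmq::socket_type::pull);
    thread_local bool bound = false;
    if (!bound) {
        recvSocket.bind("tcp://*:" +
                        std::to_string(BASE_MPI_PORT + localRankIdx));
        bound = true;
    }
    return recvSocket;
}

// Blocking receive for this rank. Because the socket is dedicated to the
// rank and the transport preserves per-connection ordering, no re-ordering
// (sequence numbers, buffering of out-of-order messages) is needed on
// reception.
static std::vector<uint8_t> recvMpiMessage(int localRankIdx)
{
    zmq::message_t msg;
    auto res =
      getRankRecvSocket(localRankIdx).recv(msg, zmq::recv_flags::none);
    (void)res; // blocking call: errors surface as zmq::error_t exceptions

    std::vector<uint8_t> buffer(msg.size());
    std::memcpy(buffer.data(), msg.data(), msg.size());
    return buffer;
}
```

The key design point, as I read the PR description, is that ordering falls out of the dedicated per-rank connection itself rather than from re-ordering logic (or extra server threads) in the PTP layer; the sender side would correspondingly keep one send socket per destination rank endpoint.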
csegarragonz added a commit that referenced this pull request Mar 25, 2024
csegarragonz added a commit that referenced this pull request Mar 25, 2024
csegarragonz added a commit that referenced this pull request Mar 25, 2024
@csegarragonz csegarragonz deleted the no-order branch March 26, 2024 17:22