Description
We need a page in the distributed docs that outlines how networking works. It should cover:
- What kinds of servers do we have?
- How do they speak to each other, e.g. what is a comm, what is a batched comm, what is a stream?
- When and why do we pool comms, and when is a comm long-lived?
- What happens (or should happen) if a comm breaks? E.g. a broken stream comm removes a worker, while direct comms are usually retried/handled.
- When do we use what kind of comm?
- What kind of RPC protocol are we using?
This list is not exhaustive. This issue can be used to refine what the content should look like.
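
To make the "what happens if a comm breaks" question concrete, here is a rough, stdlib-only sketch of the two policies mentioned in the list: retrying a short-lived direct request versus treating a broken long-lived stream as a lost worker. It does not use distributed's actual classes. `BrokenCommError`, `retry_direct`, `ToyScheduler`, and `make_flaky_op` are hypothetical names made up for illustration, and the retry count and delay are arbitrary assumptions.

```python
import asyncio


class BrokenCommError(Exception):
    """Hypothetical stand-in for a connection that dropped."""


async def retry_direct(op, attempts=3, delay=0.01):
    """Retry a short-lived, direct request a few times before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return await op()
        except BrokenCommError:
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            await asyncio.sleep(delay)


class ToyScheduler:
    """Hypothetical scheduler that only tracks which workers are known."""

    def __init__(self):
        self.workers = {"worker-1", "worker-2"}

    def on_stream_broken(self, worker):
        # A broken long-lived stream is treated as the worker being gone.
        self.workers.discard(worker)


def make_flaky_op(failures):
    """Build a request that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def op():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise BrokenCommError("connection dropped")
        return "pong"

    return op, state


async def demo():
    # Direct comm: two failures are absorbed by the retry loop.
    op, state = make_flaky_op(failures=2)
    reply = await retry_direct(op)

    # Stream comm: a single break removes the worker.
    scheduler = ToyScheduler()
    scheduler.on_stream_broken("worker-1")
    return reply, state["calls"], sorted(scheduler.workers)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The docs page would ideally state which of these two behaviours applies to each kind of comm, and why.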