Document common cluster specific (but not job scheduler specific) quirks with work-arounds if available

I wish the docs had a list of cluster configuration quirks that are not specific to a job scheduler, with possible work-arounds where they exist. For example, I was not aware until a few days ago that some clusters restrict TCP/IP connections between login and compute nodes. Here is a rough list off the top of my head:

* submit_command is not available on the compute nodes, e.g. #333. Possible work-around: https://github.com/dask/dask-jobqueue/issues/333#issuecomment-530263090 (I never tried it myself). This is the case for all the OAR clusters I know about: the submit command is never available on the compute nodes, so in principle I could test this idea.
* TCP/IP restrictions between login and compute nodes, e.g. #354 and #355. Possible work-around: start the main script / notebook on an interactive node, with all the additional pain and limitations this entails; see https://github.com/dask/dask-jobqueue/issues/354#issuecomment-542879534 for the one I know about.
* Non-uniform network interfaces on login and compute nodes. I guess the same work-around as for the TCP/IP restrictions would work, but it is not a great one.
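
To make the three quirks above easier to spot, here is a rough diagnostic sketch that could be run once on a login node and once inside a job on a compute node. It uses only the Python standard library and is not part of dask-jobqueue. The helper names (`submit_command_available`, `list_network_interfaces`, `can_connect`, `report`) are made up for illustration, and the submit command, host and port you pass in depend on your cluster.

```python
import shutil
import socket


def submit_command_available(command):
    """Return True if `command` (e.g. "sbatch" or "oarsub") is on PATH on this node."""
    return shutil.which(command) is not None


def list_network_interfaces():
    """Return the sorted names of the network interfaces visible on this node."""
    return sorted(name for _, name in socket.if_nameindex())


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


def report(submit_command, scheduler_host=None, scheduler_port=None):
    """Print a short summary of the three checks for the current node."""
    print("node:", socket.gethostname())
    print("submit command available:", submit_command_available(submit_command))
    print("network interfaces:", list_network_interfaces())
    if scheduler_host is not None and scheduler_port is not None:
        # Hypothetical use: point this at the address the Dask scheduler listens on.
        print("can reach scheduler:", can_connect(scheduler_host, scheduler_port))
```

Comparing the output of `report(...)` from a login node and from a compute node should show whether the submit command is missing on one side, whether the interface names differ, and whether the compute node can open a TCP connection back to the node running the main script.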

Please add more if you know more off the top of your head.

cc @mrocklin @guillaumeeb @jhamman 


Document common cluster specific (but not job scheduler specific) quirks with work-arounds if available #356
