Description
I connect to dask-scheduler from my local workstation to submit my computations (using a ssh tunnel).
I have had other people tell me a similar thing (third bullet point of #186 (comment)). If you manage to make that work, please let us know (ideally in a separate issue).
Originally posted by @lesteve in #186 (comment)
This helps circumvent problems where the login node of a cluster has very limited computing power and where you are not allowed to submit jobs from interactive nodes.
The idea is that the login node runs dask-scheduler and we submit the computations to it from a local workstation. The steps to do this:
- start ipython on the login node and start the cluster, e.g.
from dask_jobqueue import PBSCluster
cluster = PBSCluster(…)
cluster.scale(100)
- check the address of the cluster
In [2]: cluster
Out[2]: PBSCluster('tcp://192.168.57.5:43704', workers=100, threads=100, memory=250GB)
- take note of the scheduler port (43704), then on the local machine set up an ssh tunnel to the scheduler (and the dashboard)
ssh -N -L 8786:localhost:43704 -L 8787:localhost:8787 login.cluster
- You should be able to access the dashboard on the local machine (localhost:8787)
- Set up the Client on your local machine as follows:
from dask.distributed import Client
client = Client("tcp://localhost:8786")
Now you should be able to submit computations to the cluster from your local machine.
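
Since the scheduler port changes every time the cluster is started, it may help to generate the ssh command from the address rather than editing it by hand. The following is only a minimal sketch: `tunnel_command` and its parameters are hypothetical names introduced here for illustration, and it assumes the address has the `tcp://host:port` form shown in the cluster repr above (it should also be available as `cluster.scheduler_address`) and that the dashboard listens on port 8787.

```python
from urllib.parse import urlsplit


def tunnel_command(scheduler_address, login_host, local_port=8786, dashboard_port=8787):
    """Build the ssh command that forwards the scheduler and dashboard ports.

    Hypothetical helper: it only assembles the command shown in the steps
    above and does not open any connection itself.
    """
    # Extract the port from an address such as 'tcp://192.168.57.5:43704'.
    remote_port = urlsplit(scheduler_address).port
    if remote_port is None:
        raise ValueError(f"no port found in {scheduler_address!r}")
    return [
        "ssh", "-N",
        # local scheduler port -> scheduler port on the login node
        "-L", f"{local_port}:localhost:{remote_port}",
        # dashboard port (assumed to be the same on both ends)
        "-L", f"{dashboard_port}:localhost:{dashboard_port}",
        login_host,
    ]


cmd = tunnel_command("tcp://192.168.57.5:43704", "login.cluster")
print(" ".join(cmd))
```

The list form can be passed to `subprocess.Popen(cmd)` on the local machine to keep the tunnel open in the background while the `Client` is in use.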