
Connect to dask-scheduler from local workstation (maybe update documentation?) #377

Open
@manuel-rhdt


I connect to dask-scheduler from my local workstation to submit my computations (using a ssh tunnel).

I have had other people tell me a similar thing (third bullet point of #186 (comment)). If you manage to make that work, please let us know (ideally in a separate issue).

Originally posted by @lesteve in #186 (comment)

This helps circumvent problems where the login node of a cluster has very limited computing power and where you are not allowed to submit jobs from interactive nodes.

The idea is that the login node runs dask-scheduler and we submit the computations to it from a local workstation. The steps to do this:

  1. Start ipython on the login node and start the cluster, e.g.

         from dask_jobqueue import PBSCluster
         cluster = PBSCluster(…)
         cluster.scale(100)

  2. Check the address of the cluster:

         In [2]: cluster
         Out[2]: PBSCluster('tcp://192.168.57.5:43704', workers=100, threads=100, memory=250GB)

  3. Note the port of the scheduler (43704 here), then on the local machine set up an ssh tunnel to the login node that forwards both the scheduler port and the dashboard port:

         ssh -N -L 8786:localhost:43704 -L 8787:localhost:8787 login.cluster

  4. You should now be able to access the dashboard on the local machine (localhost:8787).

  5. Set up the Client on your local machine as follows:

         from dask.distributed import Client
         client = Client("tcp://localhost:8786")

Now you should be able to submit computations to the cluster from your local machine.
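
Since the scheduler port changes every time the cluster is started, the tunnel command has to be rebuilt each time. Below is a minimal sketch of how that could be automated. The helper names `build_tunnel_command` and `smoke_test` are hypothetical (they are not part of dask or dask-jobqueue), and the sketch assumes the same local ports (8786 and 8787) and the same host alias (`login.cluster`) as in the steps above.

```python
import shlex
from urllib.parse import urlsplit


def build_tunnel_command(scheduler_address, login_host,
                         local_port=8786, dashboard_port=8787):
    """Build the ssh port-forwarding command for a given scheduler address.

    Hypothetical helper: scheduler_address is the string shown in the
    cluster repr on the login node, e.g. 'tcp://192.168.57.5:43704'.
    """
    remote_port = urlsplit(scheduler_address).port
    if remote_port is None:
        raise ValueError(f"no port found in {scheduler_address!r}")
    args = [
        "ssh", "-N",
        # forward the scheduler port to a fixed local port
        "-L", f"{local_port}:localhost:{remote_port}",
        # forward the dashboard (assumed to use the same port on both ends)
        "-L", f"{dashboard_port}:localhost:{dashboard_port}",
        login_host,
    ]
    return shlex.join(args)


def smoke_test(address="tcp://localhost:8786"):
    """Connect through the tunnel and run one trivial task on the cluster.

    Hypothetical helper: only works while the ssh tunnel is open.
    """
    from dask.distributed import Client  # imported lazily, needs distributed

    with Client(address) as client:
        return client.submit(sum, [1, 2, 3]).result()


print(build_tunnel_command("tcp://192.168.57.5:43704", "login.cluster"))
# prints: ssh -N -L 8786:localhost:43704 -L 8787:localhost:8787 login.cluster
```

With the tunnel running in another terminal, calling `smoke_test()` on the local machine should return 6 if the client can reach the scheduler and at least one worker has started.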
