GTSfM uses Dask's SSHCluster module to distribute SfM execution across a cluster of machines. This README is a step-by-step guide to setting up your machines for a successful cluster run.
-
Choose which machine will serve as the scheduler. The data only needs to be on the scheduler node.
-
Create a config file listing the IP addresses of cluster machines (example in gtsfm/configs/cluster.yaml).
- Note that the first worker in the cluster.yaml file must be the scheduler machine where the data is hosted.
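The ordering rule above can be checked before anything is launched. The following is a minimal sketch, not part of GTSfM: `split_scheduler_and_workers` is a hypothetical helper, and it assumes the addresses have already been read from the config file into a plain Python list, so it makes no claim about the file's actual layout. Dask's SSHCluster likewise treats the first entry of its `hosts` list as the scheduler, which is why the order matters.

```python
from typing import List, Tuple


def split_scheduler_and_workers(hosts: List[str]) -> Tuple[str, List[str]]:
    """Split an ordered host list into (scheduler, remaining machines).

    Hypothetical helper: the first address is taken to be the scheduler,
    mirroring the rule that the scheduler must be listed first.
    """
    cleaned = [host.strip() for host in hosts]
    if not cleaned or any(not host for host in cleaned):
        raise ValueError("the host list must contain at least one non-empty address")
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("the host list contains duplicate addresses")
    return cleaned[0], cleaned[1:]


# Placeholder addresses from the documentation-only 192.0.2.0/24 range.
scheduler, others = split_scheduler_and_workers(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
print(f"scheduler (must host the data): {scheduler}")
print(f"other machines: {others}")
```

Rejecting duplicates and blank entries early is a design choice of this sketch; it catches copy-paste mistakes in the address list before any SSH connection is attempted.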
-
Enable passwordless SSH between all the workers (machines) on the cluster.
- Log in individually to each machine listed in the cluster config file.
- For each of the other machines on the cluster, run:
-
ssh-copy-id {username}@{machine_ip_address_of_another_worker}
- If you see the error "/usr/bin/ssh-copy-id: ERROR: No identities found", run ssh-keygen -t rsa first.
- Repeat the two steps above on all machines.
-
- Note that each machine must also be able to ssh into itself without a password, e.g. host1 must be able to ssh into host1.
- For a cluster of N machines, ssh-copy-id must therefore be run N*N times; with 5 machines, that is 5*5=25 times.
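To keep track of which commands still need to be run, the full set of pairs can be enumerated. This is a minimal sketch under two assumptions: the same username is used on every machine, and the host names (`host1` to `host5`) are placeholders. The helper `ssh_copy_id_plan` is hypothetical; it only builds command strings and runs nothing.

```python
from typing import Dict, List


def ssh_copy_id_plan(username: str, hosts: List[str]) -> Dict[str, List[str]]:
    """Map each machine to the ssh-copy-id commands to run while logged in to it.

    Every machine targets every machine, itself included, so a cluster of
    N hosts yields N * N commands in total.
    """
    return {
        source: [f"ssh-copy-id {username}@{target}" for target in hosts]
        for source in hosts
    }


hosts = ["host1", "host2", "host3", "host4", "host5"]  # placeholder names
plan = ssh_copy_id_plan("username", hosts)
total = sum(len(commands) for commands in plan.values())
print(f"{total} commands in total")
for source, commands in plan.items():
    print(f"# while logged in to {source}:")
    for command in commands:
        print(command)
```

Once the keys are copied, running `ssh -o BatchMode=yes username@host true` from each machine is one way to confirm that a login succeeds without a password prompt, since BatchMode makes ssh fail instead of asking for one.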
-
Clone gtsfm and follow the main README to set up the environment on every node in the cluster, at an identical path on each:
-
git clone --recursive https://github.com/borglab/gtsfm.git
cd gtsfm
conda env create -f environment_linux.yml
conda activate gtsfm-v1
-
-
Log in to the scheduler machine again and download the data onto it.
-
Run gtsfm with the --cluster_config flag, for example:
python /home/username/gtsfm/gtsfm/runner/run_scene_optimizer_colmaploader.py --images_dir /home/username/gtsfm/skydio-32/images/ --config_name sift_front_end.yaml --colmap_files_dirpath /home/username/gtsfm/skydio-32/colmap_crane_mast_32imgs/ --cluster_config cluster.yaml
- Always provide absolute paths for all directories
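The absolute-path rule is easy to violate with a stray relative or `~`-prefixed path. The sketch below shows one way to catch that before launching a run. It is not part of GTSfM: `relative_path_args` is a hypothetical helper, and it assumes POSIX-style paths on the cluster machines.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def relative_path_args(path_args: Dict[str, str]) -> List[str]:
    """Return the names of arguments whose values are not absolute POSIX paths.

    A leading "~" is not expanded here, so "~/data" is reported as relative.
    """
    return [
        name
        for name, value in path_args.items()
        if not PurePosixPath(value).is_absolute()
    ]


directory_args = {
    "--images_dir": "/home/username/gtsfm/skydio-32/images/",
    "--colmap_files_dirpath": "skydio-32/colmap_crane_mast_32imgs/",  # deliberately relative
}
offenders = relative_path_args(directory_args)
if offenders:
    print("Use absolute paths for:", ", ".join(offenders))
```

Using `PurePosixPath` keeps the check purely syntactic: it does not touch the filesystem, so it can be run on a local machine for paths that only exist on the cluster.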
-
-
To view the Dask dashboard, forward the dashboard port from the scheduler machine to your local computer:
-
ssh -N -f -L localhost:local_port:localhost:machine_port username@machine_address
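As an illustration of how the placeholders in the forwarding command are filled in, here is a minimal sketch that assembles the command string. `dashboard_tunnel_command` is a hypothetical helper, the address is a placeholder, and the default of 8787 is Dask's default dashboard port; it is an assumption that your run has not configured a different one.

```python
def dashboard_tunnel_command(
    username: str,
    machine_address: str,
    machine_port: int = 8787,  # Dask's default dashboard port (assumed unchanged)
    local_port: int = 8787,
) -> str:
    """Build the ssh command that forwards the remote dashboard port to localhost."""
    for port in (local_port, machine_port):
        if not 0 < port < 65536:
            raise ValueError(f"invalid port number: {port}")
    return (
        f"ssh -N -f -L localhost:{local_port}:localhost:{machine_port} "
        f"{username}@{machine_address}"
    )


# 192.0.2.10 is a placeholder from the documentation-only address range.
print(dashboard_tunnel_command("username", "192.0.2.10"))
```

With the tunnel open, the dashboard should be reachable in a local browser at `http://localhost:<local_port>`. The `-N` flag tells ssh not to run a remote command, and `-f` sends it to the background once the connection is established.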
-
-
The results are generated on the scheduler machine. To download them from the scheduler machine to your local computer:
-
scp -r username@host:machine/results/path /local/computer/directory
-