forked from bsc-wdc/dislib

The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.

tirkarthi/dislib



Barcelona Supercomputing Center

The Distributed Computing Library

Distributed Computing library implemented using PyCOMPSs programming model for HPC.


Website | Documentation | Releases

Introduction

The Distributed Computing library is a project that aims to provide distributed machine learning algorithms ready to use as a library. It is built on top of the PyCOMPSs programming model and is developed by the Workflows and Distributed Computing group of the Barcelona Supercomputing Center. The library is designed to allow easy local development through docker. Once the code is finished, it can be run directly on a supercomputer or in the cloud without any further changes. For more information on which supercomputers and architectures are supported, refer to Availability.


Quickstart

Follow the steps below to get started with dislib.

1. Install Docker and docker-py

Warning: requires docker version >= 17.12.0-ce

  1. Follow these instructions

Be aware that for some distros the docker package has been renamed from docker to docker-ce. Make sure you install the new package.

  2. Add user to docker group to run dislib as non-root user.

  3. Check that docker is correctly installed

docker --version
docker ps # this should be empty as no docker processes are yet running.

  4. Install docker-py

sudo pip3 install docker
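
The version requirement in the warning above can also be checked programmatically. The following is only a minimal sketch, not part of dislib: the helper names are hypothetical, and the sample strings are assumed to follow the usual `docker --version` output format.

```python
import re

# Minimum version taken from the warning above (17.12.0-ce).
MINIMUM_DOCKER = (17, 12, 0)


def parse_docker_version(output):
    """Extract (major, minor, patch) from the text printed by `docker --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def docker_is_recent_enough(output, minimum=MINIMUM_DOCKER):
    """Return True when the reported version is at least `minimum`."""
    return parse_docker_version(output) >= minimum


# Sample string for illustration only; in practice, pass the real output of
# `docker --version` (for example, captured with subprocess.run).
sample = "Docker version 17.12.0-ce, build c97c6d6"
print(parse_docker_version(sample), docker_is_recent_enough(sample))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexicographically, where "17.9" would sort after "17.12".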

2. Install the dislib

Download the latest release.

Extract the tar file from your terminal:

tar -zxvf dislib_v0.1.0.tar.gz

Move it into your desired installation path and link the binary to be executable from anywhere:

sudo mv dislib_v* /opt/dislib
sudo ln -s /opt/dislib/dislib /usr/local/bin/dislib

3. Start dislib in your development directory

Initialize dislib in the directory where your source code will live (you can reinitialize at any time). This allows docker to access your local code and run it inside the container.

Note that the first time, dislib needs to download the docker image from the registry, which may take a while.

# Without a path it operates on the current working directory.
dislib init

# You can also provide a path
dislib init /home/user/replace/path/

4. Run a sample application

First clone dislib repo:

git clone https://github.com/bsc-wdc/dislib.git

Initialize the dislib environment in the examples folder. This will mount the examples directory inside the container. Then execute the desired example:

cd dislib/examples
dislib init
dislib exec clustering_comparison.py

Source file paths are resolved relative to the init directory. Notice the difference if dislib is initialized in the root of the repo:

cd dislib
dislib init
dislib exec examples/clustering_comparison.py
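
To make the path rule concrete, here is a small sketch that computes the argument to pass to dislib exec for a given init directory. It assumes, as stated above, that script paths are resolved relative to the init directory; the function name and the absolute paths are hypothetical.

```python
import posixpath


def exec_argument(init_dir, script_path):
    """Path to pass to `dislib exec`, relative to the directory given to `dislib init`."""
    return posixpath.relpath(script_path, start=init_dir)


# Hypothetical checkout location, used only for illustration.
script = "/home/user/dislib/examples/clustering_comparison.py"

# Initialized in the examples folder.
print(exec_argument("/home/user/dislib/examples", script))

# Initialized in the root of the repo.
print(exec_argument("/home/user/dislib", script))
```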

The log files of the execution can be found at $HOME/.COMPSs.
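
To find the most recent logs quickly, a short script can list the newest entries in that directory. This is a minimal sketch: it makes no assumption about how COMPSs organizes files inside $HOME/.COMPSs and only sorts whatever is there by modification time. The helper name is hypothetical.

```python
from pathlib import Path


def newest_entries(log_dir, limit=5):
    """Return the names of the most recently modified entries in `log_dir`."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        # No execution has produced logs yet, or the path is wrong.
        return []
    entries = sorted(
        log_dir.iterdir(),
        key=lambda path: path.stat().st_mtime,
        reverse=True,  # newest first
    )
    return [path.name for path in entries[:limit]]


# $HOME/.COMPSs is the log location mentioned above.
print(newest_entries(Path.home() / ".COMPSs"))
```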

5. Adding more nodes

Note: adding more nodes is still in beta phase. Any suggestion, issue, or feedback is highly welcome and appreciated.

To add more computing nodes, you can either let docker create more workers for you or manually create and configure a custom node.

For docker just issue the desired number of workers to be added. For example, to add 2 docker workers:

dislib components add worker 2

You can check that both new computing nodes are up with:

dislib components list

If you want to add a custom node, it needs to be reachable through ssh without user. Moreover, dislib will try to copy the working_dir there, so it needs write permissions for the scp.

For example, to add the local machine as worker node:

dislib components add worker '127.0.0.1:6'
  • '127.0.0.1': is the IP used for ssh (can also be a hostname like 'localhost', as long as it can be resolved).
  • '6': desired number of available computing units for the new node.
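
The 'host:units' format described above can be validated before calling the command. The sketch below is illustrative only: the function is hypothetical, and dislib's own parsing may accept or reject different inputs.

```python
def parse_worker_spec(spec):
    """Split a 'host:units' string into (host, computing_units)."""
    # Split on the last colon so the host part is kept whole.
    host, separator, units = spec.rpartition(":")
    if not separator or not host:
        raise ValueError(f"expected 'host:units', got {spec!r}")
    if not units.isdigit() or int(units) < 1:
        raise ValueError(f"computing units must be a positive integer, got {units!r}")
    return host, int(units)


print(parse_worker_spec("127.0.0.1:6"))
print(parse_worker_spec("localhost:6"))
```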

Please be aware that dislib components list will not show your custom nodes: they are not docker processes, so it cannot be verified whether they are up and running.

Availability

Currently, the following supercomputers already have PyCOMPSs installed and ready to use. If you need help configuring your own cluster or supercomputer, drop us an email and we will be pleased to help.

  • Marenostrum 4 - Barcelona Supercomputing Center (BSC)
  • Minotauro - Barcelona Supercomputing Center (BSC)
  • Nord 3 - Barcelona Supercomputing Center (BSC)
  • Cobi - Barcelona Supercomputing Center (BSC)
  • Juron - Jülich Supercomputing Centre (JSC)
  • Jureca - Jülich Supercomputing Centre (JSC)
  • Ultraviolet - The Genome Analysis Center (TGAC)
  • Archer - University of Edinburgh’s Advanced Computing Facility (ACF)

Supported architectures:

Contributing

Contributions are welcome and very much appreciated. We are also open to starting research collaborations or mentoring if you are interested in or need assistance to implement new algorithms. Please refer to our Contribution Guide for more details.

License

GNU GENERAL PUBLIC LICENSE Version 3.0, see LICENSE
