apoorvkh/torchrunx

torchrunx 🔥

Automatically launch functions and initialize distributed PyTorch environments on multiple machines

Installation

pip install torchrunx

Requirements:

  • Operating System: Linux
  • Python >= 3.8.1
  • PyTorch >= 2.0
  • Shared filesystem & passwordless SSH between hosts

Usage

# Simple example
import torchrunx as trx


def distributed_function():
    pass  # placeholder for your distributed code


trx.launch(
    func=distributed_function,
    func_kwargs={},
    hostnames=["node1", "node2"],  # or just: ["localhost"]
    workers_per_host=2,
)
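
The body of distributed_function is left empty above. As a minimal sketch of what a worker might do first, the example below reads the rank information from environment variables. It assumes that each worker process sees the conventional torchrun-style variables RANK, LOCAL_RANK, and WORLD_SIZE; that is an assumption for illustration, not something stated in this README, and describe_worker is a hypothetical helper name.

```python
import os


def describe_worker(environ=None):
    """Read rank information from torchrun-style environment variables.

    ASSUMPTION: RANK, LOCAL_RANK, and WORLD_SIZE are set in each worker.
    Falls back to single-process defaults when they are absent.
    """
    environ = os.environ if environ is None else environ
    return {
        "rank": int(environ.get("RANK", 0)),
        "local_rank": int(environ.get("LOCAL_RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
    }


def distributed_function():
    # Hypothetical worker body: report which worker this process is.
    info = describe_worker()
    print(
        f"worker {info['rank']} of {info['world_size']} "
        f"(local rank {info['local_rank']})"
    )
    return info
```

Passing the environment as an argument keeps the helper easy to check outside a distributed launch; in a real worker it would simply read os.environ.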

In a SLURM allocation

trx.launch(
    # ...
    hostnames=trx.slurm_hosts(),
    workers_per_host=trx.slurm_workers()
)
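
If the same script should run both inside and outside a SLURM allocation, one option is to choose the host list at launch time. The sketch below is only an illustration: in_slurm_allocation and pick_hostnames are hypothetical helpers, not part of torchrunx, and the check relies on SLURM setting the SLURM_JOB_ID environment variable inside a job.

```python
import os


def in_slurm_allocation(environ=None):
    """Return True if SLURM_JOB_ID is present in the environment."""
    environ = os.environ if environ is None else environ
    return "SLURM_JOB_ID" in environ


def pick_hostnames(slurm_hosts_fn, environ=None):
    """Use the SLURM host list inside an allocation, else localhost.

    slurm_hosts_fn is any zero-argument callable returning a list of
    hostnames; it is only called when a SLURM job is detected.
    """
    if in_slurm_allocation(environ):
        return slurm_hosts_fn()
    return ["localhost"]
```

With these helpers, the launch call could pass hostnames=pick_hostnames(trx.slurm_hosts) and fall back to a single-machine run on a workstation.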

Compared to other tools

Contributing

We use the pixi package manager: install pixi, then run pixi shell in this repository. We use ruff for linting and formatting, pyright for static type checking, and pytest for testing. We build for PyPI and conda-forge. Our release pipeline is powered by GitHub Actions.

About

Easily run PyTorch on multiple GPUs & machines
