[2022/08/04] Chinese blog posts completed
[2022/07/23] Chinese blog posts started being published by 闪闪红星闪闪@知乎
[2022/07/20] Repository installation completed
Ubuntu 18.04
CUDA 10.1
Python==3.7.3
PyTorch==1.8.1
conda create -n pytorch-multi-GPU-training-tutorial python=3.7.3
conda activate pytorch-multi-GPU-training-tutorial
pip install https://download.pytorch.org/whl/cu101/torch-1.8.1%2Bcu101-cp37-cp37m-linux_x86_64.whl
pip install https://download.pytorch.org/whl/cu101/torchvision-0.9.1%2Bcu101-cp37-cp37m-linux_x86_64.whl
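After installation, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper name `check_environment` and the layout of its report are assumptions made for illustration. It only reads version information and does not train anything.

```python
import importlib
import sys

# Versions listed in this README, kept here only for comparison.
EXPECTED = {"python": "3.7.3", "torch": "1.8.1", "cuda": "10.1"}


def check_environment():
    """Collect interpreter, PyTorch, and CUDA information (hypothetical helper)."""
    report = {
        "python": sys.version.split()[0],
        "torch": None,   # stays None if PyTorch is not installed
        "cuda": None,    # CUDA version PyTorch was built against, if any
        "gpu_count": 0,  # number of GPUs visible to PyTorch
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return report
    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda
    report["gpu_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    report = check_environment()
    for key, value in report.items():
        expected = EXPECTED.get(key)
        suffix = f" (README lists {expected})" if expected else ""
        print(f"{key}: {value}{suffix}")
```

With the wheels above, the reported PyTorch version is expected to carry a build suffix such as `+cu101`. A `gpu_count` below 2 means the multi-GPU scripts cannot spread work across several devices.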
To run the tutorial, use one of the scripts below:
- Run single-machine-and-multi-GPU-DistributedDataParallel-launch.py
- Run single-machine-and-multi-GPU-DistributedDataParallel-mp.py
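The two scripts correspond to the two usual ways of starting DistributedDataParallel workers on one machine: an external launcher (`torch.distributed.launch`) that starts one process per GPU, or a script that spawns its own workers through `torch.multiprocessing`. The sketch below only builds the command lines. It assumes that the scripts need no additional arguments and that the `-mp.py` script spawns its workers itself; neither is confirmed here, so check each script's argument parser. The helper names `launch_command` and `mp_command` are hypothetical.

```python
import sys

LAUNCH_SCRIPT = "single-machine-and-multi-GPU-DistributedDataParallel-launch.py"
MP_SCRIPT = "single-machine-and-multi-GPU-DistributedDataParallel-mp.py"


def launch_command(num_gpus, python=sys.executable):
    """Command for the launcher variant: torch.distributed.launch starts
    num_gpus processes, one per GPU (assumes no extra script arguments)."""
    return [
        python,
        "-m",
        "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        LAUNCH_SCRIPT,
    ]


def mp_command(python=sys.executable):
    """Command for the multiprocessing variant: the script is run directly
    and is assumed to spawn its own worker processes."""
    return [python, MP_SCRIPT]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is started by accident.
    print(" ".join(launch_command(num_gpus=2)))
    print(" ".join(mp_command()))
```

The lists can be passed to `subprocess.run` from the repository root once the GPU count matches the machine.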
Chinese Tutorial (闪闪红星闪闪@知乎)
The following tutorials are being published:
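- Tutorial | PyTorch Multi-GPU Training - Introduction and Practice
- PyTorch Multi-GPU Training in Practice (1) - Single Machine, Single GPU
- PyTorch Multi-GPU Training in Practice (2) - DP Code Changes
- PyTorch Multi-GPU Training in Practice (3) - Getting Started with DDP
- PyTorch Multi-GPU Training in Practice (4) - Advanced DDP
- PyTorch Multi-GPU Training in Practice (5) - DDP with torch.distributed.launch: Code Changes
- PyTorch Multi-GPU Training in Practice (6) - DDP with torch.multiprocessing: Code Changes
- PyTorch Multi-GPU Training in Practice (7) - Slurm Cluster Installation
- PyTorch Multi-GPU Training in Practice (8) - DDP with Slurm: Code Changes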