Description & Motivation
Revamp the accelerator availability messages printed by the Trainer, from:
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
to:
GPU available: M1, using 1 device
TPU available: v4-8, using 0 devices
IPU available: False, using 0 devices
HPU available: False, using 0 devices
The relevant code is: https://github.com/Lightning-AI/lightning/blob/f14ee9edbc8269054e12daf30b8681d530e73369/src/lightning/pytorch/trainer/setup.py#L145-L171
Pitch
If the accelerator is available, True changes to the actual name of the accelerator used. If it's unavailable, we still show False.
For GPUs, the cuda|mps field is gone, as it should be clear from the device.
I also propose that the GPU field shows the number of devices, instead of a used boolean.
We can get this info via:
# CUDA
import torch
torch.cuda.get_device_name()

# TPU
from torch_xla.experimental import tpu
import torch_xla.core.xla_env_vars as xenv

# note: this needs a try-except, as querying the TPU environment sends a request
try:
    tpu.get_tpu_env()[xenv.ACCELERATOR_TYPE]
except Exception:
    pass  # fall back to the boolean-style output if the request fails
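
As a concrete illustration, here is a minimal sketch of how the proposed GPU line could be assembled from these calls; _gpu_summary and the exact message format are assumptions for illustration, not existing Lightning code:

import torch

def _gpu_summary() -> str:
    # Hypothetical helper: report the device name and count instead of booleans.
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name()  # e.g. "NVIDIA A100-SXM4-40GB"
        count = torch.cuda.device_count()
        return f"GPU available: {name}, using {count} device{'s' if count != 1 else ''}"
    return "GPU available: False, using 0 devices"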
For MPS, HPU, and IPU we would need to find out if we can get this information. In the meantime, we can still fall back to "True" for them.
This could be done by introducing an Accelerator.device_name(device) staticmethod.
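
A minimal sketch of what that staticmethod could look like, assuming it is added to the accelerator classes (the class names below mirror Lightning's accelerators, but the device_name method itself is the proposal, not current API):

import torch

class CUDAAccelerator:
    @staticmethod
    def device_name(device=None) -> str:
        # e.g. "NVIDIA GeForce RTX 3090"
        return torch.cuda.get_device_name(device)

class TPUAccelerator:
    @staticmethod
    def device_name(device=None) -> str:
        # Querying the TPU environment sends a request, so guard it.
        try:
            from torch_xla.experimental import tpu
            import torch_xla.core.xla_env_vars as xenv
            return tpu.get_tpu_env()[xenv.ACCELERATOR_TYPE]  # e.g. "v4-8"
        except Exception:
            return "True"  # fall back to the old boolean-style value

The rank-zero messages in setup.py could then call device_name() on the resolved accelerator class instead of hard-coding True/False.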
Alternatives
One caveat is that this might be misleading with heterogeneous devices, as only rank zero prints this information.
Additional context
No response