Lightning-AI pytorch-lightning Discussions
Discussions
⚡ Low GPU Utilization
Labels: accelerator: cuda, performance · Sakurakdx asked Aug 15, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 When I set num_works> 0, there is a error Producer process has been terminated before all shared CUDA tensors released
Labels: accelerator: cuda
⚡ Tensors must be CUDA and dense on using DDP
Labels: distributed, accelerator: cuda · hadarishav asked Jul 6, 2020 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 extra process when running ddp across multiple GPUs
Labels: strategy: ddp, accelerator: cuda
🤖 Multi gpus resume error
Labels: checkpointing, accelerator: cuda
😎 pytorch_lightning.utilities.exceptions.MisconfigurationException: GPUAccelerator can not run on your system since the accelerator is not available. The following accelerator(s) is available and can be passed into accelerator argument of Trainer: ['cpu'].
Labels: accelerator: cuda
⚡ Cuda DeviceStatsMonitor
Labels: callback: device stats, accelerator: cuda · peterbjorgensen asked Dec 15, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
⚡ Access a registered buffer is very slow
Labels: accelerator: cuda · juliendenize asked Jan 15, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Answered
⚡ Using val_dataloader after training on multiple gpus seems to return the batches gpu-count times
Labels: distributed, accelerator: cuda · daMichaelB asked Oct 27, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
⚡ Trainer cannot find available GPUS
Labels: accelerator: cuda · omrishac asked Aug 3, 2021 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 Can't get multi-gpu to work anymore
Labels: strategy: ddp, accelerator: cuda
⚡ Training is slow on GPU
Labels: accelerator: cuda, performance · mtomic123 asked Sep 28, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 DDP deadlock detected from rank 1 and CUDA error: operation not supported on A10
Labels: distributed, accelerator: cuda
⚡ MNIST demo is not utilizing the GPU
Labels: example, fabric, accelerator: cuda, performance · delip asked Aug 18, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
⚡ How to use lightning with a dataset that is generated on the GPU?
Labels: data handling, accelerator: cuda · turian asked Aug 16, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
⚡ How to register a (repeatedly) sampled random tensor?
Labels: data handling, lightningmodule, accelerator: cuda · RylanSchaeffer asked Aug 10, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Answered
⚡ Why does pytorch lightning cause more GPU memory usage?
Labels: accelerator: cuda, performance, pl · chuzheng88 asked Jul 14, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 How to use all the available GPUs
Labels: accelerator: cuda, trainer: argument
⚡ when and how the trainer or module move the data to gpu?
Labels: data handling, accelerator: cuda, pl · FutureWithoutEnding asked Jul 18, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Answered
⚡ RuntimeError: Expected all tensors to be on the same device
Labels: strategy: dp (removed in pl), strategy: ddp, accelerator: cuda, pl · ddicostanzo asked Jul 15, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Answered
⚡ How to balance the GPU load
Labels: strategy: ddp, accelerator: cuda, performance · YanhaoWu asked May 23, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 DDP - Synchronization on DGX - Use CPUs or GPU-to-GPU interconnect
Labels: accelerator: cuda
🤖 What's the relationship between number of gpu and batch size (global batch size))
Labels: distributed, accelerator: cuda
⚡ one_hot to cuda
Labels: accelerator: cuda · ironv asked Jun 5, 2022 in Lightning Trainer API: Trainer, LightningModule, LightningDataModule · Unanswered
🤖 How to carry out validation loop on one single GPU
Labels: accelerator: cuda, trainer: validate