Issues: Lightning-AI/pytorch-lightning
Remove the optimizer_to_device logic if possible
checkpointing
Related to checkpointing
performance
refactor
#20165
opened Aug 5, 2024 by
awaelchli
training time increase epoch by epoch
bug
Something isn't working
help wanted
Open to be worked on
performance
repro needed
The issue is missing a reproducible example
ver: 2.2.x
#20076
opened Jul 12, 2024 by
Eric-Lin-CVTE
CPU-Memory keeps accumulating during trainer.predict
bug
Something isn't working
performance
trainer: predict
ver: 2.1.x
#19398
opened Feb 2, 2024 by
surajpaib
Investigate Resident Memory Increase during Inference
bug
Something isn't working
help wanted
Open to be worked on
performance
ver: 2.0.x
#18640
opened Sep 26, 2023 by
ZekunZh
CombinedLoader takes a long time when num_workers > 0
bug
#18584
opened Sep 19, 2023 by
johnathanchiu
Memory Leak when instantiating Fabric multiple times
bug
Something isn't working
fabric
lightning.fabric.Fabric
performance
strategy: deepspeed
ver: 2.0.x
#18356
opened Aug 21, 2023 by
vkakerbeck
Investigate FSDP + CPU Offload performance in Trainer
performance
strategy: fsdp
Fully Sharded Data Parallel
ver: 2.1.x
Running out of memory when resuming the training from a checkpoint
bug
Something isn't working
checkpointing
Related to checkpointing
performance
repro needed
The issue is missing a reproducible example
ver: 2.0.x
#18059
opened Jul 11, 2023 by
RJPenic
2x slower training speed with FSDP when switching from lightning 1.9 to 2.0
bug
Something isn't working
performance
strategy: fsdp
Fully Sharded Data Parallel
ver: 2.0.x
#18028
opened Jul 8, 2023 by
anthonyhu
self.log(.., on_epoch=True) runs extremely slow
bug
Something isn't working
logging
Related to the `LoggerConnector` and `log()`
performance
repro needed
The issue is missing a reproducible example
ver: 2.0.x
#17988
opened Jul 4, 2023 by
LinWeizheDragon
Lightning 2.0 CPUAccelerator is extremely slow!
bug
Something isn't working
performance
repro needed
The issue is missing a reproducible example
ver: 2.0.x
waiting on author
Waiting on user action, correction, or update
#17169
opened Mar 22, 2023 by
ricardorei
Leak RAM when training 8 core TPU
accelerator: tpu
Tensor Processing Unit
bug
Something isn't working
help wanted
Open to be worked on
performance
repro needed
The issue is missing a reproducible example
#16876
opened Feb 26, 2023 by
cuong3004
Updating lightning from v1.7.7 to 1.8.0 significantly affects results
accelerator: cuda
Compute Unified Device Architecture GPU
performance
strategy: ddp
DistributedDataParallel
#15682
opened Nov 14, 2022 by
cjsg
Strange Performance issues with PL + FFCV
3rd party
Related to a 3rd-party
data handling
Generic data-related topic
performance
ver: 1.9.x
#14189
opened Aug 12, 2022 by
lebrice
RichProgressBar in v1.6 is slower than v1.5
bug
Training with Large Dataset Causes Infinite Stall
bug
Something isn't working
performance
waiting on author
Waiting on user action, correction, or update
#13126
opened May 22, 2022 by
fishbotics
Support mosaic optimizations as plugins
3rd party
Related to a 3rd-party
feature
Is an improvement or enhancement
performance
Integrate TorchTensorRt in order to increase speed during inference
3rd party
Related to a 3rd-party
feature
Is an improvement or enhancement
performance