Issues: Lightning-AI/pytorch-lightning
Resuming not correct when max_steps corresponds to the end of an epoch
bug
Something isn't working
loops
Related to the Loop API
ver: 2.2.x
Validation runs only for one iteration when restarting from checkpoint mid-epoch, wrongly reporting validation loss
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
#19549
opened Feb 29, 2024 by
pimdh
calling iter twice messes up dataloaders with queues
bug
Something isn't working
data handling
Generic data-related topic
loops
Related to the Loop API
ver: 2.1.x
Potential off by 1 error when resuming training of mid-epoch checkpoint
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
ver: 2.1.x
#19367
opened Jan 29, 2024 by
ivnle
Spurious validation step when restarting with a checkpoint when max_steps is set in the trainer
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
ver: 2.0.x
#18645
opened Sep 26, 2023 by
arnaudstiegler
Incorrect batch progress saved in checkpoint at every_n_train_steps
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
repro needed
The issue is missing a reproducible example
ver: 1.9.x
ver: 2.1.x
#18060
opened Jul 11, 2023 by
shuaitang5
trainer.fit_loop.setup_data() does not refresh train dataset in LightningModule
data handling
#17327
opened Apr 11, 2023 by
LarsKue
Step when validation happens drifts for val_check_interval when gradient accumulation turned on
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
ver: 2.0.x
Global step reset when restoring checkpoints with trainer.validate
checkpointing
Related to checkpointing
feature
Is an improvement or enhancement
loops
Related to the Loop API
pl
Generic label for PyTorch Lightning package
num_training_batches is inf in configure_optimizers
bug
When test, the printed log does not show the full text of the metrics, when named in Chinese
bug
Something isn't working
help wanted
Open to be worked on
loops
Related to the Loop API
trainer: test
#14837
opened Sep 22, 2022 by
heng-yuwen
5 tasks done
TrainingEpochLoop._should_check_val_fx discrepancy between continued run <> restore from ckpt
bug
Something isn't working
checkpointing
Related to checkpointing
help wanted
Open to be worked on
loops
Related to the Loop API
#14579
opened Sep 7, 2022 by
Anner-deJong
check_val_every_n_epoch bug with list of dataloaders
bug
Something isn't working
loops
Related to the Loop API