Description
Bug description
In v1.9, a model's `validation_epoch_end()` gets called before a Callback's `on_validation_epoch_end()`, but in v2.0, a model's `on_validation_epoch_end()` gets called after a Callback's `on_validation_epoch_end()`.
In my use case, which worked under v1.9, there is a Callback that implements `on_validation_epoch_end()` where it reads the validation metric from `trainer.logged_metrics`, which has been updated with the validation metric by the model's `validation_epoch_end()`. It then checks whether there has been an improvement and logs that. In v2.0, this approach no longer works as the invocation order has changed.
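To make the ordering problem concrete, here is a minimal sketch in plain Python. The `Trainer`, `Model`, and `ImprovementCallback` classes below are stand-ins written for illustration, not Lightning's classes, and `run_epochs` and `log_val_metric` are hypothetical helpers. Only the hook name `on_validation_epoch_end` and the `logged_metrics` attribute come from the description above.

```python
# Stand-in classes that mimic the hook order described above.
# They are NOT Lightning's classes; only the hook name and the
# `logged_metrics` attribute are taken from the report.

class Trainer:
    def __init__(self):
        self.logged_metrics = {}


class Model:
    def __init__(self, val_loss):
        self.val_loss = val_loss

    def log_val_metric(self, trainer):
        # Stands in for the model hook that logs the validation metric:
        # validation_epoch_end() in v1.9, on_validation_epoch_end() in v2.0.
        trainer.logged_metrics["val_loss"] = self.val_loss


class ImprovementCallback:
    def __init__(self):
        self.best = float("inf")
        self.improved = []  # one entry per validation epoch

    def on_validation_epoch_end(self, trainer, model):
        current = trainer.logged_metrics.get("val_loss")
        # None means the model has not logged the metric yet.
        is_better = current is not None and current < self.best
        if is_better:
            self.best = current
        self.improved.append(is_better)


def run_epochs(losses, model_first):
    """Hypothetical driver: run one validation epoch per loss value."""
    trainer, callback = Trainer(), ImprovementCallback()
    for loss in losses:
        model = Model(loss)
        if model_first:  # order reported for v1.9
            model.log_val_metric(trainer)
            callback.on_validation_epoch_end(trainer, model)
        else:  # order reported for v2.0
            callback.on_validation_epoch_end(trainer, model)
            model.log_val_metric(trainer)
    return callback.improved


v19_order = run_epochs([0.9, 0.5, 0.7], model_first=True)
v20_order = run_epochs([0.9, 0.5, 0.7], model_first=False)
```

In this simplified model, the callback under the v2.0 order sees either nothing (first epoch) or the value logged in the previous epoch, so its improvement check lags by one epoch.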
One workaround is to do all of this in the model itself and skip the Callback. However, I prefer to do the improvement checking and logging in a Callback, because the member variables this requires would serve no purpose when the model is not being trained (e.g. when it is used only for testing or inference). A callback is also a cleaner and more modular approach.
For my use case, it may be helpful to be able to express the priorities of the callbacks relative to one another and to the model. There may also be other, simpler solutions.
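As a rough illustration of the priority idea, the sketch below orders hook calls by a numeric priority. Everything here is hypothetical: `dispatch`, the `priority` values, and the stand-in `Model` and `ImprovementCallback` classes are not part of Lightning, and this is only one way such a feature could look.

```python
def dispatch(hook_name, participants, *args):
    """Call `hook_name` on each participant, lowest priority value first.

    `participants` is a list of (priority, obj) pairs. `priority` is a
    hypothetical setting, not an existing Lightning option.
    """
    called = []
    # sorted() is stable, so equal priorities keep their registration order.
    for _, obj in sorted(participants, key=lambda pair: pair[0]):
        hook = getattr(obj, hook_name, None)
        if callable(hook):
            hook(*args)
            called.append(type(obj).__name__)
    return called


class Model:
    def on_validation_epoch_end(self, state):
        # Stands in for the model logging its validation metric.
        state["val_loss"] = 0.5


class ImprovementCallback:
    def on_validation_epoch_end(self, state):
        # Records what the callback would read at this point.
        state["seen_by_callback"] = state.get("val_loss")


state = {}
order = dispatch(
    "on_validation_epoch_end",
    [(10, ImprovementCallback()), (0, Model())],  # model given the lower value
    state,
)
```

With the model given a lower priority value than the callback, its hook runs first and the callback reads the metric from the current epoch, regardless of registration order.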
How to reproduce the bug
No response
Error messages and logs
# Error messages and logs here please
Environment
Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):
More info
No response