Open
Description
It would be nice if I could pass a checkpoint folder to the object detection trainer options, which the trainer would use to save and load intermediate weights every N epochs or steps. That way I wouldn't have to retrain the model from the beginning every time.
In the meantime, it would also be nice if the object detection trainer returned normally instead of throwing a cancellation exception when MLContext.CancelExecution gets called.
Together, these two features would let users pause training whenever they want and resume from their previous progress. This would be super useful for deep learning scenarios, where a single training run can take a long time.
Describe the solution you'd like
- In the training options, accept a checkpoint folder used to save and load intermediate weights.
- When training is cancelled, don't throw an error; just return the current result.
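
To make the requested behavior concrete, here is a minimal sketch in Python. It is not ML.NET code and does not reflect any existing trainer API: the `train` function, its parameters (`checkpoint_dir`, `checkpoint_every`, `cancel_event`, `on_epoch`), the JSON checkpoint file, and the toy "weight" update are all hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile
import threading


def train(total_epochs, checkpoint_dir, checkpoint_every, cancel_event, on_epoch=None):
    """Toy training loop (hypothetical API, for illustration only).

    - Resumes from the checkpoint in `checkpoint_dir` if one exists.
    - Saves a checkpoint every `checkpoint_every` epochs.
    - When `cancel_event` is set, returns the current state instead of raising.
    """
    os.makedirs(checkpoint_dir, exist_ok=True)
    path = os.path.join(checkpoint_dir, "checkpoint.json")

    # Start fresh, or pick up where the last saved checkpoint left off.
    state = {"epoch": 0, "weight": 0.0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)

    while state["epoch"] < total_epochs:
        # Cooperative cancellation: stop quietly and hand back what we have.
        if cancel_event.is_set():
            break

        state["weight"] += 0.5  # stand-in for one real epoch of training
        state["epoch"] += 1

        if state["epoch"] % checkpoint_every == 0:
            # Write to a temp file first so a crash never leaves a torn checkpoint.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, path)

        if on_epoch is not None:
            on_epoch(state)

    return state
```

In this sketch, a run that is cancelled between checkpoints returns its in-memory state, while a later call with the same `checkpoint_dir` resumes from the last saved checkpoint rather than from epoch 0. Whether a real trainer should also write a checkpoint at the moment of cancellation is a design choice left open here.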