Stop training on Inf/NaN loss #2070
Conversation
It would be good to have some way for user code to tell that training has stopped prematurely. Not sure what that would look like: return value, callback, flag, etc.?
One way would be to upgrade this to an error. It does depend a little on how complex/comprehensive we think … I found #821, in which all options were discussed, and the idea was to write …
Raising …
The point of the exception, I now see, was that you could throw it from inside the loss function passed to gradient, and still get to …
I missed that.
Ah OK, so you weren't proposing to throw and catch an error. It could just throw a DomainError; that seems the closest Base type.
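A minimal sketch of that idea, assuming a user-defined loss (the `checked_loss`, `model`, `x`, `y` names are illustrative, not from this PR): the loss throws a `DomainError` as soon as its value goes non-finite, so the failure surfaces inside `gradient` rather than after the parameters have already been updated.

```julia
using Flux

# Hypothetical wrapper: refuse to return an Inf/NaN loss.
# `model`, `x`, `y` stand in for whatever the user actually passes.
function checked_loss(model, x, y)
    l = Flux.Losses.mse(model(x), y)
    isfinite(l) || throw(DomainError(l, "loss is not finite"))
    return l
end
```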
LGTM, but if @darsnack or @CarloLucibello wouldn't mind having a once-over, that would be great.
Thanks for taking the issue I raised so seriously; it is so nice to encounter active developers who listen to user requests. Thanks for your work <3
Closes #1981, by altering `train!` so that when it encounters an infinite / NaN loss, it ~~prints a warning~~ throws an error, and stops. It stops before updating the model, because such an update will usually make everything NaN. Not sure it should stop, in fact. Maybe it should skip that datapoint & continue?
PR Checklist