Hi,
I stumbled across an issue where my losses all went to NaN when using binary_cross_entropy_with_logit. I quickly found out that this happens for high logit values: applying sigmoid to them yields exactly 1.0, the affine transformation -1 * y + 1 then produces 0.0, and finally taking the log of 0.0 yields -inf, which turns the loss into NaN (e.g. 0 * -inf = NaN).
The offending line is in candle-nn/src/loss.rs (line 66 at commit 3d1dc06):

let right_side = (target.affine(-1., 1.))? * inp.affine(-1., 1.)?.log()?;
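For illustration, here is a minimal sketch of how the NaN arises; the tensor values and setup are my own, not from the PR:

use candle_core::{Device, Result, Tensor};
use candle_nn::ops::sigmoid;

fn main() -> Result<()> {
    // A large logit saturates sigmoid to exactly 1.0 in f32.
    let inp = sigmoid(&Tensor::new(&[100.0f32], &Device::Cpu)?)?; // 1.0
    let target = Tensor::new(&[1.0f32], &Device::Cpu)?;

    // The right-hand term of the BCE formula, as in loss.rs:
    // (1 - target) * log(1 - sigmoid(x)) = 0 * log(0) = 0 * -inf = NaN
    let right_side = ((target.affine(-1., 1.))? * inp.affine(-1., 1.)?.log()?)?;
    println!("{right_side}"); // NaN
    Ok(())
}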
I looked up how PyTorch handles this in its source and assembled a more or less identical implementation in candle (see PR).
I haven't actually looked up the mathematical definition of the operation or dug deep into examples to see if the proposed implementation is flawed as well, so take the PR with a grain of salt ;)
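For reference, the standard fix (equivalent to what PyTorch does) is to never materialize sigmoid(x) and instead evaluate the loss directly on the logits as max(x, 0) - x * z + log(1 + exp(-|x|)), where the log argument stays in (1, 2] and can never hit zero. A minimal sketch of that trick in candle follows; this is my own rewrite for illustration, not the PR code:

use candle_core::{Result, Tensor};

// loss(x, z) = max(x, 0) - x * z + log(1 + exp(-|x|)), averaged over all elements
fn bce_with_logits_stable(inp: &Tensor, target: &Tensor) -> Result<Tensor> {
    let max_val = inp.relu()?; // max(x, 0)
    let log_term = inp
        .abs()?
        .neg()?
        .exp()? // exp(-|x|), always in (0, 1]
        .affine(1., 1.)? // 1 + exp(-|x|), in (1, 2]
        .log()?; // safe: the argument is never zero
    ((max_val - (inp * target)?)? + log_term)?.mean_all()
}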