
binary_cross_entropy_with_logit numerically unstable for high logits #2561

Open

Description

@BeneSim

Hi,

I stumbled across an issue where all my losses went to NaN when using binary_cross_entropy_with_logit. I quickly found out that this happens for high logit values: applying sigmoid to them yields exactly 1.0, the affine transformation -1 * y + 1 then gives 0.0, and taking the log of 0.0 produces -inf, which turns into NaN as soon as it is multiplied by the zero (1 - target) factor.

let right_side = (target.affine(-1., 1.))? * inp.affine(-1., 1.)?.log()?;
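For illustration, here is a minimal standalone sketch of that failure mode using plain f64 arithmetic rather than candle tensors (so not the actual library code):

```rust
fn main() {
    // Naive BCE-with-logits for a single (logit, target) pair:
    //   loss = -(t * ln(sigmoid(x)) + (1 - t) * ln(1 - sigmoid(x)))
    let naive_bce = |x: f64, t: f64| {
        let p = 1.0 / (1.0 + (-x).exp()); // sigmoid
        -(t * p.ln() + (1.0 - t) * (1.0 - p).ln())
    };

    // Moderate logit: fine.
    println!("{}", naive_bce(2.0, 1.0)); // ~0.1269

    // Large logit: sigmoid saturates to exactly 1.0, so ln(1 - p) = ln(0) = -inf,
    // and (1 - t) * -inf = 0 * -inf = NaN, which poisons the whole loss.
    println!("{}", naive_bce(40.0, 1.0)); // NaN
}
```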

I looked up how PyTorch handles this in its source and assembled a more or less identical approach in candle (see the PR); a sketch of the stable formulation is below.
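For reference, the standard numerically stable rewrite (the same kind of trick PyTorch's implementation is based on) works directly on the logit x instead of on sigmoid(x): loss = max(x, 0) - x * t + ln(1 + exp(-|x|)). A minimal sketch in plain f64, again not the actual PR code:

```rust
fn main() {
    // Numerically stable BCE-with-logits for a single (logit, target) pair,
    // written directly in terms of the logit x:
    //   loss = max(x, 0) - x * t + ln(1 + exp(-|x|))
    // exp is only ever called on a non-positive argument, so nothing overflows,
    // and no log of 0 can appear.
    let stable_bce = |x: f64, t: f64| x.max(0.0) - x * t + (-x.abs()).exp().ln_1p();

    println!("{}", stable_bce(2.0, 1.0));   // ~0.1269, matches the naive formula
    println!("{}", stable_bce(40.0, 1.0));  // ~4.25e-18, finite instead of NaN
    println!("{}", stable_bce(-40.0, 0.0)); // ~4.25e-18
}
```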

I haven't actually looked up the mathematical definition of the operation or dug deep into examples to check whether the proposed implementation is flawed as well, so take the PR with a grain of salt ;)
