
Log loss metric can be Infinity or NaN #2708

Open
@rogancarr


For binary classification (and perhaps multiclass classification) the log loss can be infinite. The log loss reduction can also be negative infinity, as it is a shifting and rescaling of the log loss.
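To make the failure mode concrete, here is the per-example loss as computed by the snippet further down, writing $p$ for the predicted probability and $y$ for the label. The reduction formula is an assumption on my part: it is the usual definition relative to the log loss of the prior, $L_{\text{prior}}$, and any constant scale factor is omitted.

```latex
\ell(y, p) =
\begin{cases}
  -\log_2 p        & \text{if } y > 0 \\
  -\log_2 (1 - p)  & \text{otherwise}
\end{cases}
\qquad
\lim_{p \to 0^{+}} -\log_2 p = +\infty,
\quad
\lim_{p \to 1^{-}} -\log_2 (1 - p) = +\infty

% Assumed definition of the reduction (scale factor omitted):
R = \frac{L_{\text{prior}} - L}{L_{\text{prior}}},
\qquad
L = +\infty \;\Rightarrow\; R = -\infty
```

So a single positive example scored at exactly $p = 0$, or a single negative example scored at exactly $p = 1$, is enough to make the averaged log loss infinite.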

Similarly, the log loss can be NaN. The code handles this case explicitly, but it still seems like a bug.

The culprit for both cases lies in the initial calculations in the ProcessRow() method of the Aggregator for the BinaryClassifierEvaluator.

Double logloss;
if (!Single.IsNaN(prob))
{
    if (_label > 0)
    {
        // REVIEW: Should we bring back the option to use ln instead of log2?
        logloss = -Math.Log(prob, 2);
    }
    else
        logloss = -Math.Log(1.0 - prob, 2);
}
else
    logloss = Double.NaN;

I propose that to guard against infinities we add an epsilon before taking the log.
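A minimal sketch of what such a guard could look like, not the actual evaluator code. The class name `LogLossSketch`, the method `LogLoss`, and the value of `Epsilon` are all hypothetical. It also clamps the probability into `[eps, 1 - eps]` rather than literally adding an epsilon, which is a variant of the proposal that avoids a slightly negative loss when `prob` is exactly 1.

```csharp
using System;

public static class LogLossSketch
{
    // Hypothetical clamp width; this value is an assumption, not taken from ML.NET.
    public const double Epsilon = 1e-15;

    // Per-example log loss in bits, with the probability clamped away from 0 and 1
    // so that the logarithm never sees an argument of exactly zero.
    public static double LogLoss(float prob, bool isPositive, double eps = Epsilon)
    {
        // NaN probabilities still propagate; fixing those belongs upstream.
        if (float.IsNaN(prob))
            return double.NaN;

        double p = Math.Min(Math.Max(prob, eps), 1.0 - eps);
        return isPositive ? -Math.Log(p, 2) : -Math.Log(1.0 - p, 2);
    }
}
```

With this guard, `LogLossSketch.LogLoss(0f, true)` and `LogLossSketch.LogLoss(1f, false)` return large finite values instead of infinity, so the aggregated metric and the log loss reduction stay finite. The cost is that the worst-case loss per example is capped at roughly `-log2(eps)`, so the choice of epsilon affects the reported metric.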

To guard against NaNs, we will need to fix the probability calculations (e.g. in the calibrator(s)).

Labels: P2 (needs to be fixed at some point), bug
