How to deal with forecast rates ≤ 0 in any LL-based test? #226

Open
@mherrmann3

Description

It's no surprise that zero (or even negative) rates cause issues in LL scores, but I believe pyCSEP does not generally check for, handle, or warn about them in LL-based tests if a forecast contains such rates (an example: Akinci's HAZGRIDX in the 5-yr Italy experiment).
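To illustrate why such rates are a problem (a minimal standalone sketch, not pyCSEP code): the Poisson joint log-likelihood sums terms of the form n·log(rate) − rate over all bins, so a single zero-rate bin containing a target event drives the whole score to -inf, and a negative rate yields nan:

```python
import numpy as np

# Hypothetical forecast rates and observed counts per bin; the log(n!) term
# of the Poisson LL is omitted since it does not depend on the forecast.
rates = np.array([0.5, 0.0, 1.2])
counts = np.array([1, 1, 0])

with np.errstate(divide="ignore", invalid="ignore"):
    ll = np.sum(counts * np.log(rates) - rates)

print(ll)  # -inf: the target event in the zero-rate bin contributes 1 * log(0)
```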

This affects every grid-based and catalog-based test except the N-test.

I noticed one exception: rates ≤ 0 are silently omitted in binomial_evaluations using masked arrays (numpy.ma.masked_where). But this is not an optimal treatment, because it gives a model the opportunity to cheat and game the system: in areas where it is unsure or expects low seismicity, it simply forecasts 0; if a target event occurs in such a bin, it won't count in the testing. Apart from that, excluding rates ≤ 0 could trigger a corner case in the T-test when all target event rates are ≤ 0 (see case 3 in #225).
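Roughly, the masked-array omission behaves like the following standalone sketch (a simplification of what happens in binomial_evaluations, not its actual code):

```python
import numpy as np

rates = np.array([0.0, 0.3, 0.7])
counts = np.array([2, 0, 1])  # two target events fall in the zero-rate bin

# Masked operations skip the masked bins entirely, so the two events in the
# zero-rate bin never contribute to the score:
masked_rates = np.ma.masked_where(rates <= 0, rates)
ll = np.ma.sum(counts * np.ma.log(masked_rates) - masked_rates)

print(ll)  # finite value; the zero-rate bin and its events are simply ignored
```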

So the better approach is to replace forecast rates ≤ 0 (in a reproducible way, not randomly), e.g., with

  • the minimum among all forecast rates,
  • an order of magnitude below the minimum,
  • the average among the surrounding bins, or
  • ...

It would be convenient to write a separate function for this treatment (see the sketch after this list) and use it

  • at the top of .core.poisson_evaluations._poisson_likelihood_test(),
  • at the top of .core.poisson_evaluations._t_test_ndarray(),
  • at the top of .utils.calc._compute_likelihood(), and
  • somewhere in the middle of .core.catalog_evaluations.magnitude_test().
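A minimal sketch of what such a helper could look like, implementing the first two options from the list further above (the function name, signature, and argument names are my suggestions, not existing pyCSEP API):

```python
import numpy as np

def impute_nonpositive_rates(rates, method="min"):
    """Replace rates <= 0 with a reproducible positive value (hypothetical helper).

    method="min":        use the minimum positive rate in the forecast
    method="min_decade": use one order of magnitude below that minimum
    """
    rates = np.asarray(rates, dtype=float)
    positive = rates[rates > 0]
    if positive.size == 0:
        raise ValueError("forecast contains no positive rates")
    fill = positive.min()
    if method == "min_decade":
        fill /= 10.0
    out = rates.copy()
    out[out <= 0] = fill
    return out

# The zero-rate bin gets the smallest positive rate (or a tenth of it):
print(impute_nonpositive_rates([0.0, 0.3, 0.7]))                        # [0.3 0.3 0.7]
print(impute_nonpositive_rates([0.0, 0.3, 0.7], method="min_decade"))   # [0.03 0.3 0.7]
```

Averaging over the surrounding bins would need the forecast's spatial indexing, so it is left out of this sketch.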
