How to deal with forecast rates ≤ 0 in any LL-based test? #226

Open
@mherrmann3

Description

It's no surprise that zero (or even negative) rates cause issues in LL scores, but I believe pyCSEP does not generally check for, handle, or warn about them in LL-based tests if a forecast contains such rates (an example: Akinci's HAZGRIDX in the 5-yr Italy experiment).
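To illustrate why such rates are a problem (a minimal standalone sketch, not pyCSEP code): the Poisson joint log-likelihood sums terms of the form n·log(rate) − rate over all bins, so a single zero-rate bin containing a target event drives the whole score to -inf, and a negative rate yields nan:

```python
import numpy as np

# Hypothetical forecast rates and observed counts per bin; the log(n!) term
# of the Poisson LL is omitted since it does not depend on the forecast.
rates = np.array([0.5, 0.0, 1.2])
counts = np.array([1, 1, 0])

with np.errstate(divide="ignore", invalid="ignore"):
    ll = np.sum(counts * np.log(rates) - rates)

print(ll)  # -inf: the target event in the zero-rate bin contributes 1 * log(0)
```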

This affects every grid-based and catalog-based test except the N-test.

I noticed one exception: rates ≤ 0 are silently omitted in binomial_evaluations using masked arrays (numpy.ma.masked_where). But this is not an optimal treatment, because it gives a model the opportunity to cheat and game the system: in areas where it is unsure or expects low seismicity, it simply forecasts 0; if a target event occurs in such a bin, it won't count in the testing. Apart from that, excluding rates ≤ 0 could trigger a corner case in the T-test when all target event rates are ≤ 0 (see case 3 in #225).
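Roughly, the masked-array omission behaves like the following standalone sketch (a simplification of what happens in binomial_evaluations, not its actual code):

```python
import numpy as np

rates = np.array([0.0, 0.3, 0.7])
counts = np.array([2, 0, 1])  # two target events fall in the zero-rate bin

# Masked operations skip the masked bins entirely, so the two events in the
# zero-rate bin never contribute to the score:
masked_rates = np.ma.masked_where(rates <= 0, rates)
ll = np.ma.sum(counts * np.ma.log(masked_rates) - masked_rates)

print(ll)  # finite value; the zero-rate bin and its events are simply ignored
```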

So the better approach is to replace forecast rates ≤ 0 (in a reproducible way, not randomly), e.g., with

  • the minimum among all forecast rates,
  • an order of magnitude below the minimum,
  • the average among the surrounding bins, or
  • ...

It would be convenient to write a separate function for this treatment (see the sketch after this list) and use it

  • at the top of .core.poisson_evaluations._poisson_likelihood_test(),
  • at the top of .core.poisson_evaluations._t_test_ndarray(),
  • at the top of .utils.calc._compute_likelihood(), and
  • somewhere in the middle of .core.catalog_evaluations.magnitude_test().
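A minimal sketch of what such a helper could look like, implementing the first two options from the list further above (the function name, signature, and argument names are my suggestions, not existing pyCSEP API):

```python
import numpy as np

def impute_nonpositive_rates(rates, method="min"):
    """Replace rates <= 0 with a reproducible positive value (hypothetical helper).

    method="min":        use the minimum positive rate in the forecast
    method="min_decade": use one order of magnitude below that minimum
    """
    rates = np.asarray(rates, dtype=float)
    positive = rates[rates > 0]
    if positive.size == 0:
        raise ValueError("forecast contains no positive rates")
    fill = positive.min()
    if method == "min_decade":
        fill /= 10.0
    out = rates.copy()
    out[out <= 0] = fill
    return out

# The zero-rate bin gets the smallest positive rate (or a tenth of it):
print(impute_nonpositive_rates([0.0, 0.3, 0.7]))                        # [0.3 0.3 0.7]
print(impute_nonpositive_rates([0.0, 0.3, 0.7], method="min_decade"))   # [0.03 0.3 0.7]
```

Averaging over the surrounding bins would need the forecast's spatial indexing, so it is left out of this sketch.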
