Description
It's not a surprise that zero (or even negative) rates cause issues in log-likelihood (LL) scores, but as far as I can tell pyCSEP generally does not check for, handle, or warn about them in LL-based tests when forecasts contain such rates (an example: Akinci's HAZGRIDX in the 5-yr Italy experiment).
This affects every grid-based and catalog-based test except the N-test.
I noticed one exception: rates ≤ 0 are silently omitted in `binomial_evaluations` using masked arrays (`numpy.ma.masked_where`). But this is not an optimal treatment, because it gives a model the opportunity to cheat and game the system: in bins where it is unsure or expects low seismicity, it simply forecasts 0; if a target event then occurs in such a bin, it does not count in the testing. Apart from that, excluding rates ≤ 0 could trigger a corner case in the T-test when all target-event rates are ≤ 0 (see case 3 in #225).
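To illustrate the silent omission, here is a minimal sketch with a simplified Poisson-style LL term (the actual scoring in `binomial_evaluations` differs; the rates and counts below are made up):

```python
import numpy as np

# Hypothetical forecast rates with a zero-rate bin, and observed counts
# where a target event falls exactly in that zero-rate bin.
rates = np.array([0.5, 0.0, 1.2])
observed = np.array([1, 1, 0])

# Masking rates <= 0 before taking logs silently drops the zero-rate
# bin -- together with the observed event in it -- from the score.
masked_rates = np.ma.masked_where(rates <= 0, rates)
ll_terms = observed * np.ma.log(masked_rates) - masked_rates

# ll_terms[1] is masked, so the event in bin 1 contributes nothing
# to ll_terms.sum(): forecasting 0 in an uncertain bin goes unpunished.
```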
So a better approach is to replace forecast rates ≤ 0 (in a reproducible way, not randomly), e.g., with
- the minimum among all forecast rates,
- an order of magnitude below the minimum,
- the average among the surrounding bins, or
- ...
It would be convenient to write a separate function for this treatment and use it
- at the top of `.core.poisson_evaluations._poisson_likelihood_test()`,
- at the top of `.core.poisson_evaluations._t_test_ndarray()`,
- at the top of `.utils.calc._compute_likelihood()`, and
- somewhere in the middle of `.core.catalog_evaluations.magnitude_test()`.
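Such a helper might look like the following sketch; the name, signature, and fill strategy are all illustrative (it implements the first two options from the list above: the minimum positive rate, optionally scaled down by some orders of magnitude):

```python
import numpy as np

def fill_nonpositive_rates(rates, orders_below_min=0.0):
    """Replace rates <= 0 reproducibly with the minimum positive rate,
    optionally scaled down by `orders_below_min` orders of magnitude.

    Hypothetical helper -- not part of pyCSEP; name and signature are
    illustrative only.
    """
    rates = np.asarray(rates, dtype=float)
    positive = rates[rates > 0]
    if positive.size == 0:
        # Degenerate forecast: nothing sensible to fill with.
        raise ValueError("forecast contains no positive rates")
    fill_value = positive.min() / (10.0 ** orders_below_min)
    # Deterministic replacement: same input always yields same output.
    return np.where(rates > 0, rates, fill_value)
```

With this in place, `np.log` of the returned array is always finite, so no bins need to be masked or dropped and every target event keeps contributing to the score.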