remove normalization of likelihood on cl-test #117
Conversation
csep/core/poisson_evaluations.py
Outdated
```python
# set seed for the likelihood test
if seed is not None:
    numpy.random.seed(seed)
# def _poisson_likelihood_test(forecast_data, observed_data, num_simulations=1000, random_numbers=None,
```
i think this comment somehow made it into the commit by accident.
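As an aside on the seed handling quoted above, here is a minimal sketch of why the seeding matters for this test; the values are illustrative, not from the PR:

```python
import numpy

# seeding the global PRNG makes the simulated catalogs deterministic,
# so two runs of the likelihood test with the same seed agree exactly
seed = 42

numpy.random.seed(seed)
first_draw = numpy.random.poisson(lam=10.0, size=5)

numpy.random.seed(seed)
second_draw = numpy.random.poisson(lam=10.0, size=5)

assert numpy.array_equal(first_draw, second_draw)
```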
csep/core/poisson_evaluations.py
Outdated
""" | ||
Computes the likelihood-test from CSEP using an efficient simulation based approach. | ||
Args: | ||
forecast_data (numpy.ndarray): nd array where [:, -1] are the magnitude bins. | ||
observed_data (numpy.ndarray): same format as observation. | ||
num_simulations: default number of simulations to use for likelihood based simulations | ||
seed: used for reproducibility of the prng | ||
random_numbers (numpy.ndarray): can supply an explicit list of random numbers, primarily used for software testing | ||
use_observed_counts (bool): if true, will simulate catalogs using the observed events, if false will draw from poisson distribution |
these lines should be indented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indented, Bill.
The lines are properly indented now :)
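For reference, a hedged sketch of how this helper might be exercised, based only on the signature and docstring quoted above; the toy inputs and the three return values are assumptions, not something stated in this thread:

```python
import numpy
from csep.core.poisson_evaluations import _poisson_likelihood_test  # private helper shown in the diff

# toy forecast: 4 spatial cells x 2 magnitude bins of expected rates (illustrative values)
forecast_data = numpy.array([[0.5, 0.1],
                             [1.0, 0.2],
                             [0.8, 0.3],
                             [0.4, 0.1]])
# observed event counts on the same grid
observed_data = numpy.array([[1, 0],
                             [2, 0],
                             [0, 1],
                             [1, 0]])

# keyword names come from the diff and docstring; unpacking into
# (quantile score, observed log-likelihood, simulated log-likelihoods)
# is an assumption about this private API
qs, obs_ll, simulated_ll = _poisson_likelihood_test(
    forecast_data, observed_data,
    num_simulations=1000, seed=0,
    use_observed_counts=True, normalize_likelihood=True)
```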
fixes #78
- greatly simplified logic to choose number of simulated events and handling of simulated likelihood values
- fixed formatting issue of line breaks and tabs vs. spaces
- cleaned up other various formatting issues in poisson_evaluations.py
the ci build/test failed because of a python tabs vs. spaces issue in the poisson_evaluations.py file. @bayonato89 i fetched the branch from your forked copy and fixed the spacing issues in the commit. the build and tests now run without error.

i believe the original implementation of this pull request didn't produce the correct normalization / conditioning for the cl test. i went through the desired behavior for normalizing and conditioning and compared the two implementations below.
implementation from previous commit:

```python
# used for conditional-likelihood, magnitude, and spatial tests to normalize the rate-component of the forecasts
if use_observed_counts:
    scale = n_obs / n_fore
    if normalize_likelihood:
        expected_forecast_count = numpy.sum(forecast_data)
        log_bin_expectations = numpy.log(forecast_data.ravel())
    else:
        expected_forecast_count = int(n_obs)
        log_bin_expectations = numpy.log(forecast_data.ravel() * scale)
else:
    expected_forecast_count = numpy.sum(forecast_data)
    log_bin_expectations = numpy.log(forecast_data.ravel())

# ... code omitted ...

for idx in range(num_simulations):
    if use_observed_counts:
        if normalize_likelihood:
            num_events_to_simulate = int(n_obs)
        else:
            num_events_to_simulate = expected_forecast_count
    else:
        num_events_to_simulate = int(numpy.random.poisson(expected_forecast_count))
```

the error appears to be in this branching: the previous version scaled the rates when normalize_likelihood was false rather than when it was true. the simplified implementation:

```python
expected_forecast_count = numpy.sum(forecast_data)
log_bin_expectations = numpy.log(forecast_data.ravel())

# used for conditional-likelihood, magnitude, and spatial tests to normalize the rate-component of the forecasts
if use_observed_counts and normalize_likelihood:
    scale = n_obs / n_fore
    expected_forecast_count = int(n_obs)
    log_bin_expectations = numpy.log(forecast_data.ravel() * scale)

# ... code omitted ...

for idx in range(num_simulations):
    if use_observed_counts:
        num_events_to_simulate = int(n_obs)
    else:
        num_events_to_simulate = int(numpy.random.poisson(expected_forecast_count))
```

in this implementation, we scale the rates when normalizing the likelihood and simulating using the observed event counts; otherwise the rates aren't scaled. we now either simulate using the number of observed events or numbers drawn from a poisson distribution. the simplified version matches our desired behavior.
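a quick numeric sketch of what the simplified branch does when conditioning the forecast (toy numbers, not from the PR):

```python
import numpy

# toy forecast with two bins; the catalog has 5 observed events (illustrative)
forecast_data = numpy.array([6.0, 4.0])  # expected rates per bin
n_fore = numpy.sum(forecast_data)        # 10.0 total forecasted events
n_obs = 5

use_observed_counts = True
normalize_likelihood = True

expected_forecast_count = numpy.sum(forecast_data)
log_bin_expectations = numpy.log(forecast_data.ravel())

# conditional-likelihood path: condition the rates on the observed count
if use_observed_counts and normalize_likelihood:
    scale = n_obs / n_fore                # 5 / 10 = 0.5
    expected_forecast_count = int(n_obs)  # simulate exactly 5 events
    log_bin_expectations = numpy.log(forecast_data.ravel() * scale)

print(expected_forecast_count)           # 5
print(numpy.exp(log_bin_expectations))   # [3. 2.] -> scaled rates sum to n_obs
```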
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #117      +/-   ##
==========================================
- Coverage   57.23%   57.19%   -0.05%
==========================================
  Files          19       19
  Lines        3136     3133       -3
  Branches      452      452
==========================================
- Hits         1795     1792       -3
  Misses       1231     1231
  Partials      110      110
```
Continue to review full report at Codecov.
@bayonato89 i'm merging this into master. nice job on finding this issue!