Description
Refactor request
The approach to calculating logL is inconsistent across lencomp, agecomp, and gen_sizecomp. gen_sizecomp calculates logL across females and males at the same time and includes ALL bins, even those in the tails. It has no tail compression feature, so it depends on the small added constant to allow the logL calculation for bins with no fish. This difference is a major impediment to merging the logL calcs across the three comp methods.
Inconsistencies include:
a. whether or not a very tiny constant is added to expected values even when the user has not specified an added constant
b. whether or not tail compression is used
Expected behavior
Potential solution:
If there are zeroes in the comp data, then adding a small constant is needed, but only because of the log() term in:
SzFreq_like(SzFreq_LikeComponent(f, k)) -= SzFreq_sampleN(iobs) * SzFreq_obs(iobs)(z1, z2) * log(SzFreq_exp(iobs)(z1, z2));
If a very tiny constant is added only inside that log() term, the NaN is avoided:
SzFreq_like(SzFreq_LikeComponent(f, k)) -= SzFreq_sampleN(iobs) * SzFreq_obs(iobs)(z1, z2) * log(SzFreq_exp(iobs)(z1, z2) + 1.0e-15);
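To illustrate the effect of the constant, here is a minimal standalone sketch in plain C++ (not ADMB, and not the SS3 source). The function name `comp_nll` and its arguments are hypothetical stand-ins for the `SzFreq_*` quantities above; it assumes one observation whose observed and expected proportions are stored in plain vectors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the SzFreq_like update of a single observation.
// obs and expd are observed and expected proportions by bin, sampleN is the
// sample size, and eps is the tiny constant added inside log().
// Passing eps = 0.0 reproduces the term without the constant.
double comp_nll(const std::vector<double>& obs,
                const std::vector<double>& expd,
                double sampleN,
                double eps) {
    assert(obs.size() == expd.size());
    double nll = 0.0;
    for (std::size_t z = 0; z < obs.size(); ++z) {
        // With obs[z] == 0 and expd[z] == 0 and eps == 0 this evaluates
        // 0 * log(0) = 0 * (-inf), which is NaN under IEEE arithmetic.
        nll -= sampleN * obs[z] * std::log(expd[z] + eps);
    }
    return nll;
}
```

With an empty first bin in both vectors, calling `comp_nll(obs, expd, 100.0, 0.0)` yields NaN, while `comp_nll(obs, expd, 100.0, 1.0e-15)` stays finite because the empty bin contributes zero times a finite log value.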
This approach is already used in agecomp logL by this line:
exp_a(f, i) /= (1.0e-15 + sum(exp_a(f, i)));
Note that this adjustment is in addition to the user-specified constant added to both the obs and expected.
The agecomp approach is a bit more accurate because it rescales to sum to 1.0 after adding the tiny constant, whereas the proposed gen_sizecomp approach adds the constant at the final logL calculation, so the proportions are off from summing to 1.0 by a minuscule amount. In exchange, it avoids another divide computation.
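The size of that discrepancy can be checked with a small standalone sketch in plain C++ (not ADMB). Both helper names are hypothetical: `normalize_with_eps` mirrors the form of the quoted agecomp line, and `effective_sum_log_stage` assumes the constant is added to every bin at the log() step, as in the proposed gen_sizecomp change.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper with the same form as the quoted agecomp line:
// each element is divided by (eps + sum of the vector).
std::vector<double> normalize_with_eps(std::vector<double> expd,
                                       double eps = 1.0e-15) {
    const double total = std::accumulate(expd.begin(), expd.end(), 0.0);
    for (double& e : expd) {
        e /= (eps + total);  // an all-zero vector gives 0, not 0/0
    }
    return expd;
}

// Hypothetical helper: the sum of the values actually passed to log()
// when eps is added to every bin of already-normalized proportions.
double effective_sum_log_stage(const std::vector<double>& expd,
                               double eps = 1.0e-15) {
    double total = 0.0;
    for (double e : expd) {
        total += e + eps;
    }
    return total;
}
```

Under these assumptions the values passed to log() sum to roughly 1 + (number of bins) * 1.0e-15 rather than exactly 1.0, which is far below the precision that matters for the likelihood.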
Also note that this approach is not used for length comp observations; they rely on the user-specified constant.
comments please: @kellijohnson-NOAA @iantaylor-NOAA @chantelwetzel-noaa @puntae