IMPRO-1814. Add sig phase mask #1356
Conversation
- The acceptance test for snow-fraction has masked data as inputs (entire mask is false), which results in a masked output (masked where divide-by-zero occurs, no infilling).
- Masks are now handled such that any input masks are preserved (mismatches are additive) and divide-by-zero points are found and filled in as intended.
- This changes the KGO!
- So long as the input snow-fraction field has valid data at all points, it will be used, even if it is a masked array.
- Added KGO (which has a mask, all-False).
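The mask handling described in the commits above can be illustrated with a short numpy sketch (the array values and variable names are illustrative only, not the plugin's actual code): input masks are combined additively, and zero-precipitation points that would divide by zero are flagged for infilling.

```python
import numpy as np
import numpy.ma as ma

# Illustrative rain/snow rate fields; the [0, 0] point has no precipitation.
rain = ma.masked_array([[0.0, 1.0], [2.0, 3.0]], mask=[[False, False], [False, True]])
snow = ma.masked_array([[0.0, 1.0], [2.0, 3.0]], mask=[[False, False], [True, False]])

# Preserve input masks additively: a point masked on either input is masked on the output.
combined_mask = ma.getmaskarray(rain) | ma.getmaskarray(snow)

with np.errstate(divide="ignore", invalid="ignore"):
    snow_fraction = snow.data / (rain.data + snow.data)

# Divide-by-zero (no-precipitation) points come out as NaN and are flagged for infilling.
needs_infill = ~np.isfinite(snow_fraction) & ~combined_mask
print(combined_mask)  # [[False False] [ True  True]]
print(needs_infill)   # [[ True False] [False False]]
```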
Codecov Report
@@ Coverage Diff @@
## master #1356 +/- ##
==========================================
+ Coverage 96.10% 96.18% +0.08%
==========================================
Files 84 85 +1
Lines 7212 7285 +73
==========================================
+ Hits 6931 7007 +76
+ Misses 281 278 -3
Continue to review full report at Codecov.
Thanks for this Stephen. I think I am mostly confused about what you envisage as the behaviour for masked snow/rain/sleet inputs. Other than that I have just a few comments.
...er_tests/psychrometric_calculations/test_significant_phase_mask/test_SignificantPhaseMask.py
- Improves doc-string
- Simplifies handling of masks
- Improves test
- Removes handling of masked data in snow-fraction plugin as there is no use-case for this.
- Snow-fraction acceptance test now checks that inputs can be in either order.
- Simplifies test for masked data in significant_phase_mask.py
- Removes handling of masked data in snow-fraction plugin as there is no use-case for this.
- Snow-fraction acceptance test now checks that inputs can be in either order.
- Simplifies test for masked data in significant_phase_mask.py
- Adds model-id-attr option to allow this attribute to be passed to the output cubes
Thanks for the changes Stephen. It is clearer to my mind what is going on now. Two tiny comments.
- Updates doc-string
- Suppresses expected warning messages
model_id_attr (str):
    Name of the attribute used to identify the source model for
    blending.
"""
Why is this needed? Unless this mask is blended between models, it doesn't need this attribute.
I'm not completely convinced that it is, but we will eventually be blending engluk rain/sleet/snow data with enukx data and the blending will need this attribute to know what to do. Strictly, we don't need it on the snow-fraction or phase-mask data, but it does correctly describe them, I think.
We'll be blending the actual diagnostic data - but those fields should have a model ID attribute already present on the probability of precipitation field, which is where they should inherit that metadata from.
I don't think this actively hurts, but in general I'd rather keep the metadata minimal. This attribute is explicitly not in the IMPROVER standard for level 3 products (see @bjwheltor 's latest metadata audit), and we only keep it after level 1 because the blending plugin needs some way to distinguish the inputs. (When we use EC data we may need to change this anyway, since as I understand it they use a coordinate rather than an attribute to identify the input model.)
model_id_attr (str):
    Name of the attribute used to identify the source model for
    blending.
Why is this needed?
See previous answer.
with np.errstate(divide="ignore", invalid="ignore"):
    snow_fraction = self.snow.data / (self.rain.data + self.snow.data)
You expect this to produce np.nan where there is no precipitation. How does the SignificantPhaseMask plugin handle this?
It doesn't have to. The snow-fraction plugin handles these by interpolating to the NaN points from nearby valid points.
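As a rough sketch of that infilling idea (this is not the plugin's actual implementation; the use of scipy.interpolate.griddata with nearest-neighbour interpolation is an assumption made for illustration): NaN points produced by the zero-precipitation division are replaced with values from nearby valid points.

```python
import numpy as np
from scipy.interpolate import griddata

def infill_nans(field):
    """Fill NaN points from nearby valid points (nearest-neighbour); illustrative only."""
    yy, xx = np.indices(field.shape)
    valid = np.isfinite(field)
    filled = field.copy()
    filled[~valid] = griddata(
        (yy[valid], xx[valid]),    # coordinates of valid points
        field[valid],              # values at valid points
        (yy[~valid], xx[~valid]),  # coordinates of NaN points to fill
        method="nearest",
    )
    return filled

# A zero-precipitation point produces NaN in the raw snow fraction; infill it.
snow_fraction = np.array([[np.nan, 0.2], [0.4, 1.0]])
print(infill_nans(snow_fraction))  # the NaN is replaced by a nearby valid value
```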
In which case, is there a test for the snow fraction plugin to ensure it does that please?
On discussion - yes there is, it just wasn't immediately clear from the docstring. Stephen is going to update the docstring to specify that it tests a point with no precip - and the number that comes out is not masked or nan.
Done
NotImplementedError, match="SignificantPhaseMask cannot handle masked data"
):
    SignificantPhaseMask()(input_cube, "snow")
Your snow fraction field has no mask, but will contain np.nan where there is no precipitation. You need a test for this.
I don't think I do. See previous answer.
...er_tests/psychrometric_calculations/test_significant_phase_mask/test_SignificantPhaseMask.py
- Updates doc-string
- Moves expected data to be a global variable (as it is used twice)
- Extends test doc-string to make it more explicit.
Just would like to be sure there are no np.nans or masks, so could you add snow fraction unit tests to check this please?
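A hedged sketch of the kind of check being requested (the helper name and the way the result is obtained are assumptions; the repository's real tests would construct cubes and call the plugin directly):

```python
import numpy as np
import numpy.ma as ma

def check_no_nans_or_mask(result_data):
    """Illustrative assertions: the snow-fraction output must be fully valid."""
    assert not ma.is_masked(result_data), "output should not be masked"
    assert np.isfinite(result_data).all(), "output should contain no NaNs"

# Passes for a fully valid field; in a real unit test `result_data` would be the
# .data of the cube returned by the SnowFraction plugin for inputs that include
# a zero-precipitation point.
check_no_nans_or_mask(np.array([0.0, 0.5, 1.0]))
```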
- Updates doc-string
Complexity increasing per file
==============================
- improver_tests/acceptance/test_phase_mask.py 2
- improver/cli/phase_mask.py 1
- improver/psychrometric_calculations/significant_phase_mask.py 5
- improver_tests/snow_fraction/test_SnowFraction.py 4
- improver_tests/psychrometric_calculations/test_significant_phase_mask/test_SignificantPhaseMask.py 6
See the complete overview on Codacy
* Adds SignificantPhaseMask plugin and tests
* Adds SignificantPhaseMask CLI and tests
* SnowFraction handles masked-data
  - The acceptance test for snow-fraction has masked data as inputs (entire mask is false) which results in a masked output (masked where divide-by-zero occurs, no infilling)
  - Masks now handled such that any input masks are preserved (mismatches are additive) and divide-by-zero points are found and filled in as intended.
  - This changes the KGO!
* SignificantPhaseMask now handles masked arrays if all(False)
  - So long as the input snow-fraction field has valid data at all points, it will be used, even if it is a masked-array
  - Added KGO (which has a mask - all-False)
* Addresses Codacy errors
* Attempts to fix Sphinx error
* Reduces duplication in tests with more parametrization.
* Attempts to fix Sphinx error
* Black
* Adds test for invalid phase name request.
* First review
  - Improves doc-string
  - Simplifies handling of masks
  - Improves test
* Black again.
* First review
  - Removes handling of masked data in snow-fraction plugin as there is no use-case for this.
  - Snow-fraction acceptance test now checks that inputs can be in either order.
  - Simplifies test for masked data in significant_phase_mask.py
* First review
  - Removes handling of masked data in snow-fraction plugin as there is no use-case for this.
  - Snow-fraction acceptance test now checks that inputs can be in either order.
  - Simplifies test for masked data in significant_phase_mask.py
  - Adds model-id-attr option to allow this attribute to be passed to the output cubes
* Black
* First review
  - Updates doc-string
  - Suppresses expected warning messages
* Second review
  - Updates doc-string
  - Moves expected data to be a global variable (as it is used twice)
  - Extends test doc-string to make it more explicit.
* Second review
  - Updates doc-string
IMPRO-1814
Updates the SnowFraction plugin to handle masked input data, since divide-by-zero behaves differently for masked arrays (it returns a masked point rather than np.nan).
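A small numpy demonstration of that behaviour difference (illustrative values only): dividing plain arrays where both numerator and denominator are zero yields np.nan, whereas masked arrays return a masked point instead.

```python
import numpy as np
import numpy.ma as ma

rain = np.array([0.0, 1.0])   # first point has no precipitation at all
snow = np.array([0.0, 1.0])

with np.errstate(divide="ignore", invalid="ignore"):
    plain = snow / (rain + snow)                                   # 0/0 -> nan
    masked = ma.masked_array(snow) / ma.masked_array(rain + snow)  # 0/0 -> masked

print(plain)   # [nan 0.5]
print(masked)  # [-- 0.5]: the zero-precipitation point comes back masked, not NaN
```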
Adds SignificantPhaseMask plugin and CLI (phase-mask) to convert a snow-fraction field into a mask for one of rain, sleet or snow.
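As a hedged sketch of the general idea (the threshold values and category logic are assumptions made for illustration, not the plugin's actual criteria): a snow-fraction field can be turned into a binary mask for a requested phase by thresholding.

```python
import numpy as np

def phase_mask(snow_fraction, phase, lower=0.01, upper=0.99):
    """Illustrative phase mask from a snow-fraction field.

    The lower/upper bounds are hypothetical; the real SignificantPhaseMask
    plugin defines its own criteria for rain, sleet and snow.
    """
    if phase == "rain":
        return (snow_fraction <= lower).astype(np.int8)
    if phase == "sleet":
        return ((snow_fraction > lower) & (snow_fraction < upper)).astype(np.int8)
    if phase == "snow":
        return (snow_fraction >= upper).astype(np.int8)
    raise ValueError(f"Unknown phase: {phase}")

snow_fraction = np.array([0.0, 0.5, 1.0])
print(phase_mask(snow_fraction, "sleet"))  # [0 1 0]
```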
Testing: