Feature/save intermediate ml diag data#200
Conversation
| @@ -3,116 +3,75 @@ | |||
| import numpy as np | |||
There was a problem hiding this comment.
Github isn't showing this diff by default since it is large. This file originally contained all the plotting functions for metrics and diagnostics; the main changes are
- some of the "metrics" plots got moved out
- plotting functions take the same single common dataset input
|
sample format of saved metrics netcdf: |
nbren12
left a comment
There was a problem hiding this comment.
Thanks Anna, the main method is now nice and clean. I think some attention should be placed on removing the use of global constants in the metrics.py file. I think my suggested refactors will make this code both more reusable and robust to future changes in variables names and conventions.
| @@ -126,3 +130,46 @@ def net_heating_from_dataset(ds: xr.Dataset, suffix: str = None) -> xr.DataArray | |||
| ds["PRATEsfc" + suffix], | |||
There was a problem hiding this comment.
hard code here. I expect this name will change in future versions.
|
|
||
| # create and save metrics dataset | ||
| # metrics: r2 global values, r2 pressure level profiles, MSE at locations | ||
| ds_metrics = create_metrics_dataset(ds) |
There was a problem hiding this comment.
This high level structure is very good.
|
Thanks for the review @nbren12 , ready for re-review. |
nbren12
left a comment
There was a problem hiding this comment.
Great! Thanks for all the changes. I have some very minor comments below which you don't necessarily have to address if you don't want to.
| ds_test = ds.sel(dataset=DATASET_NAME_FV3_TARGET) | ||
| ds_hires = ds.sel(dataset=DATASET_NAME_SHIELD_HIRES) | ||
|
|
||
| ds_metrics = create_metrics_dataset(ds_pred, ds_test, ds_hires) |
There was a problem hiding this comment.
Great! this structure is pretty clear.
| return ds_metrics | ||
|
|
||
|
|
||
| def plot_metrics(ds_metrics, output_dir, dpi_figures): |
There was a problem hiding this comment.
I would consider moving these plotting routines to another module to separate the plotting and computation code even more.
Refactor for offline ML diagnostics workflow.