Essential diagnostics #190

Open
dougiesquire opened this issue Jul 17, 2024 · 70 comments
Labels
all_configurations cice6 Related to CICE6 help wanted Extra attention is needed mom6 Related to MOM6 priority:high ww3 Related to WW3

Comments

@dougiesquire
Collaborator

The set of diagnostics output by access-om3 configurations is still mostly just inherited from CESM. There are possibly many diagnostics that we want that aren't requested, and others that are requested that we don't want. It would be helpful to compile a set of "essential" diagnostics that should always be requested in configurations so that key metrics/analyses can be calculated.

I'm probably being naive in thinking that it's possible to come up with a single list of essential diagnostics. If so, we still need to know which diagnostics are needed to calculate particular metrics so that we can ensure everything we need is saved in upcoming test runs.

@dougiesquire dougiesquire added help wanted Extra attention is needed ww3 Related to WW3 mom6 Related to MOM6 cice6 Related to CICE6 all_configurations labels Jul 17, 2024
@aekiss
Contributor

aekiss commented Jul 17, 2024

We should start with the CICE and MOM diagnostics used in ACCESS-OM2.

@dougiesquire
Collaborator Author

Is there a single list of diagnostics for ACCESS-OM2?

@aekiss
Contributor

aekiss commented Jul 17, 2024

The default configs are pretty consistent

@aekiss
Contributor

aekiss commented Jul 17, 2024

It would be good if we could use make_diag_table to generate diag_table, as this makes it easier to create consistent filenames, e.g. when creating one file per variable.

I'm looking into this - see COSIMA/make_diag_table#5

@dougiesquire
Collaborator Author

It would be good if we could use make_diag_table to generate diag_table, as this makes it easier to create consistent filenames, e.g. when creating one file per variable.

I'm looking into this - see COSIMA/make_diag_table#5

I was just doing the same thing. I'll leave it with you :)

@minghangli-uni
Contributor

Essential diagnostics are needed as discussed in TWG.

  1. Zonal average temperature and salinity (i.e. depth/latitude maps) (Fig. 12 Kiss et al. 2020) [1993 - 2017]
  2. Time series of global average temperature, salinity and sea surface temperature. (Fig. 3 in Kiss et al. 2020), and sea surface height.
  3. Zonally integrated overturning in density / latitude space (Fig. 7 Kiss et al. 2020) [1993 - 2017]
  4. Time series of Drake Passage zonal transport. (Fig. 4 in Kiss et al. 2020)

@dougiesquire
Collaborator Author

@minghang is working through which of the MOM5 diagnostics output in the ACCESS-OM2 default configs are available in MOM6, and what they are called there.

@aekiss
Contributor

aekiss commented Jul 19, 2024

I've confirmed that make_diag_table is compatible with MOM6 - see COSIMA/make_diag_table#5

@minghangli-uni
Contributor

Great to know! Thanks @aekiss

@aekiss
Contributor

aekiss commented Jul 19, 2024

To handle many output files I used these settings in input.nml:

 &diag_manager_nml
    debug_diag_manager = .true.
    issue_oor_warnings = .true.
    flush_nc_files = .true.
    max_axes          = 400
    max_files         = 200
    max_num_axis_sets = 200
 /

This also issues warnings in access-om3.err for any unavailable diagnostics, so a test run with an OM2 diag_table will reveal which diagnostics need to be renamed.

@aekiss
Contributor

aekiss commented Jul 19, 2024

We'll also need to consider which ones to vertically remap onto a non-native grid - see https://mom6.readthedocs.io/en/main/api/generated/pages/Diagnostics.html#native-diagnostics

@minghangli-uni
Contributor

Ref #178: we don't have a proper isopycnal coordinate at the moment.

However, we don't have a proper 0.25deg density coordinate, as noted in ACCESS-NRI/access-om3-configs#40.

@minghangli-uni
Contributor

minghangli-uni commented Jul 22, 2024

The tables below map MOM5 diagnostics to their MOM6 equivalents. They are still being updated.

The MOM5 description, unit, method and packing were generated by matching information from MOM_diags.txt and diag_table. For diagnostics with missing description, unit, method and packing, these details are not present in the MOM_diags.txt or in mom5. I am not sure how these diagnostics are traced. Any thoughts? @aekiss


Static 2D Grid Data

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| area_t | area_t | Tracer cell area |  | none | real*4 |
| area_u | area_u | Surface area of x-direction flow (U) cells |  | none | real*4 |
| area_v | N/A | Surface area of y-direction flow (V) cells |  | none | real*4 |
| drag_coeff | drag_coeff | Dimensionless bottom drag coefficient | dimensionless | none | real*4 |
| dxt | dxt | Ocean dxt on t-cells | m | none | real*4 |
| dxu | dxu | Ocean dxu on u-cells | m | none | real*4 |
| dyt | dyt | Ocean dyt on t-cells | m | none | real*4 |
| dyu | dyu | Ocean dyu on u-cells | m | none | real*4 |
| N/A | geolat_c | Ucell latitude | degrees_N | none | real*4 |
| N/A | geolat_t | Tracer latitude | degrees_N | none | real*4 |
| N/A | geolon_c | UV longitude | degrees_E | none | real*4 |
| N/A | geolon_t | Tracer longitude | degrees_E | none | real*4 |
| geolat_c | N/A | Latitude of corner (Bu) points | degrees_north | none | real*4 |
| geolat | N/A | Latitude of tracer (T) points | degrees_north | none | real*4 |
| geolat_v | N/A | Latitude of meridional velocity (Cv) points | degrees_north | none | real*4 |
| geolon_c | N/A | Longitude of corner (Bu) points | degrees_east | none | real*4 |
| geolon | N/A | Longitude of tracer (T) points | degrees_east | none | real*4 |
| geolon_v | N/A | Longitude of meridional velocity (Cv) points | degrees_east | none | real*4 |
| depth_ocean | ht | Ocean depth on t-cells | m | none | real*4 |
| depth_ocean | hu | Ocean depth on u-cells | m | none | real*4 |
|  | kmt | Number of depth levels on t-grid | dimensionless | none | real*4 |
|  | kmu | Number of depth levels on u-grid | dimensionless | none | real*4 |

Monthly 3D Fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| agessc | age_global | Sea water age since surface contact | yr | average | real*4 |
|  | buoyfreq2_wt | Squared buoyancy frequency at T-cell bottom | 1/s² | average | real*4 |
| difvho | diff_cbt_t | Total vertical diffusion of temperature (w/o neutral) | m²/s | average | real*4 |
|  | dzt | T-cell thickness | m | average | real*4 |
| rhopot0 | pot_rho_0 | Potential density referenced to 0 dbar | kg/m³ | average | real*4 |
| rhopot2 | pot_rho_2 | Potential density referenced to 2000 dbar | kg/m³ | average | real*4 |
| thetao | pot_temp | Potential temperature | °C | average | real*4 |
| so | salt | Practical Salinity | psu | average | real*4 |
|  | temp_xflux_adv | Temperature x flux (advection) |  |  |  |
|  | temp_yflux_adv | Temperature y flux (advection) |  |  |  |
|  | temp | Conservative temperature | °C | average | real*4 |
| umo | tx_trans | T-cell i-mass transport | kg/s | average | real*4 |
|  | ty_trans_gm | T-cell mass j-transport from GM | kg/s | average | real*4 |
|  | ty_trans_nrho_submeso | T-cell j-mass transport from submesoscale param on neutral rho | kg/s | average | real*4 |
|  | ty_trans_rho_gm | T-cell j-mass transport from GM on potential density | kg/s | average | real*4 |
|  | ty_trans_rho | T-cell j-mass transport on potential density | kg/s | average | real*4 |
|  | ty_trans_submeso | T-cell mass j-transport from submesoscale param | kg/s | average | real*4 |
| vmo | ty_trans | T-cell j-mass transport | kg/s | average | real*4 |
| uo | u | i-current | m/sec | average | real*4 |
| vo | v | j-current | m/sec | average | real*4 |
|  | vert_pv | Vertical piece of Ertel PV: (f+zeta)*N² | 1/sec³ | average | real*4 |
| wd | wt | Dia-surface velocity T-points | m/sec | average | real*4 |

Monthly 3D Squared Fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| uo | u | i-current | m/sec | pow02 | real*4 |
| vo | v | j-current | m/sec | pow02 | real*4 |

monthly 2d fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| KHTH_t | agm | GM diffusivity at surface | m^2/sec | average | real*4 |
| KHTR_h | aredi | neutral diffusivity at k=1 | m^2/sec | average | real*4 |
| bmf_u | bmf_u | Bottom u-stress via bottom drag | N/m^2 | average | real*4 |
| bmf_v | bmf_v | Bottom v-stress via bottom drag | N/m^2 | average | real*4 |
|  | ekman_we | Ekman vertical velocity averaged to wt-point | m/s | average | real*4 |
|  | eta_nonbouss | surface height including steric contribution | meter | average | real*4 |
| latent_evap | evap_heat | latent heat flux into ocean (<0 cools ocean) | W/m^2 | average | real*4 |
| evap | evap | mass flux from evaporation/condensation (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |
| heat_content_fprec | fprec_melt_heat | heat flux to melt frozen precip (<0 cools ocean) | W/m^2 | average | real*4 |
| fprec | fprec | snow falling onto ocean (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |
|  | frazil_3d_int_z | Vertical sum of ocn frazil heat flux over time step | W/m^2 | average | real*4 |
| lprec | lprec | liquid precip (including ice melt/form) into ocean (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |
| LW | lw_heat | longwave flux into ocean (<0 cools ocean) | W/m^2 | average | real*4 |
|  | melt | water flux transferred with sea ice form/melt (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |
| seaice_melt_heat | mh_flux | heat into ocean due to melting ice (>0 heats ocean) | (W/m^2) | average | real*4 |
| MLD_003 | mld | mixed layer depth determined by density criteria | m | average | real*4 |
|  | net_sfc_heating | surface ocean heat flux coming through coupler and mass transfer | Watts/m^2 | average | real*4 |
| pbo | pbot_t | bottom pressure on T cells' // trim ( model_type ) | dbar | average | real*4 |
|  | pme_net | precip-evap into ocean (total w/ restore + normalize) | (kg/m^3)*(m/sec) | average | real*4 |
|  | pme_river | mass flux of precip-evap+river via sbc (liquid frozen |  | average | real*4 |
| N/A | river | mass flux of river (runoff + calving) entering ocean | (kg/m^3)*(m/sec) | average | real*4 |
| ficeberg | N/A | Frozen runoff (calving) and iceberg melt into ocean | (kg/m^3)*(m/sec) | average | real*4 |
| friver | runoff | Liquid runoff (rivers) into ocean | (kg/m^3)*(m/sec) | average | real*4 |
| N/A | sea_level_sq | square of effective sea level (eta_t + patm/(rho0*g)) on T cells | m^2 | average | real*4 |
| e_D | sea_level | effective sea level (eta_t + patm/(rho0*g)) on T cells | meter | average | real*4 |
| hfsso | sens_heat | sensible heat into ocean (<0 cools ocean) | W/m^2 | average | real*4 |
| net_heat_coupler | sfc_hflux_coupler | surface heat flux coming through coupler | Watts/m^2 | average | real*4 |
| heat_content_lrunoff | sfc_hflux_from_runoff | heat flux (relative to 0C) from liquid river runoff | Watts/m^2 | average | real*4 |
| Heat_PmE | sfc_hflux_pme | heat flux (relative to 0C) from pme transfer of water across ocean surface | Watts/m^2 | average | real*4 |
| salt_flux_in | sfc_salt_flux_coupler | sfc_salt_flux_coupler: flux from the coupler | kg m-2 s-1 | average | real*4 |
|  | sfc_salt_flux_ice |  | kg m-2 s-1 | average | real*4 |
|  | sfc_salt_flux_restore |  | kg m-2 s-1 | average | real*4 |
| sfdsi | N/A | Net salt flux into ocean at surface (restoring + sea-ice) | kg m-2 s-1 | average | real*4 |
| tos | surface_pot_temp | Sea Surface Temperature | degC | average | real*4 |
| sos | surface_salt | Sea Surface Salinity | psu | average | real*4 |
| SW_pen | swflx | shortwave flux into ocean (>0 heats ocean) | W/m^2 | average | real*4 |
| taux | tau_x | i-directed wind stress forcing u-velocity | N/m^2 | average | real*4 |
| tauy | tau_y | j-directed wind stress forcing v-velocity | N/m^2 | average | real*4 |
|  | temp_int_rhodz | vertical sum of Conservative temperature * rho_dzt | deg_C*(kg/m^3)*m | average | real*4 |
|  | temp_xflux_adv_int_z | z-integral of cp*rho*dyt*u*temp | W | average | real*4 |
|  | temp_xflux_gm_int_z | z-integral cp*gm_xflux*dyt*rho_dzt*temp | W | average | real*4 |
|  | temp_xflux_ndiffuse_int_z | z-integral cp*ndiffuse_xflux*dyt*rho_dzt*temp | W | average | real*4 |
|  | temp_xflux_submeso_int_z | z-integral cp*submeso_xflux*dyt*rho_dzt*temp | W | average | real*4 |
|  | temp_yflux_adv_int_z | z-integral of cp*rho*dxt*v*temp | W | average | real*4 |
|  | temp_yflux_gm_int_z | z-integral cp*gm_yflux*dyt*rho_dzt*temp | W | average | real*4 |
|  | temp_yflux_ndiffuse_int_z | z-integral cp*ndiffuse_yflux*dxt*rho_dzt*temp | W | average | real*4 |
|  | temp_yflux_submeso_int_z | z-integral cp*submeso_yflux*dxt*rho_dzt*temp | W | average | real*4 |
| umo_2d | tx_trans_int_z | T-cell i-mass transport vertically summed | trim ( transport_dims ) | average | real*4 |
| vmo_2d | ty_trans_int_z | T-cell j-mass transport vertically summed | trim ( transport_dims ) | average | real*4 |
| seaice_melt | wfiform | water out of ocean due to ice form (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |
| seaice_melt | wfimelt | water into ocean due to ice melt (>0 enters ocean) | (kg/m^3)*(m/sec) | average | real*4 |

monthly 2d fields with different reduction methods

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| MLD_003 | mld | mixed layer depth determined by density criteria | m | max | real*4 |
| tos | surface_pot_temp | Sea Surface Temperature | degC | max | real*4 |

daily 2d fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
|  | bottom_temp | Conservative temperature | degC | average | real*4 |
|  | frazil_3d_int_z | Vertical sum of ocn frazil heat flux over time step | W/m^2 | average | real*4 |
| MLD_003 | mld | mixed layer depth determined by density criteria | m | average | real*4 |
|  | pme_river | mass flux of precip-evap+river via sbc (liquid frozen |  | average | real*4 |
| e_D | sea_level | effective sea level (eta_t + patm/(rho0*g)) on T cells | meter | average | real*4 |
| net_heat_coupler | sfc_hflux_coupler | surface heat flux coming through coupler | Watts/m^2 | average | real*4 |
| heat_content_lrunoff | sfc_hflux_from_runoff | heat flux (relative to 0C) from liquid river runoff | Watts/m^2 | average | real*4 |
| Heat_PmE | sfc_hflux_pme | heat flux (relative to 0C) from pme transfer of water across ocean surface | Watts/m^2 | average | real*4 |
| tos | surface_pot_temp |  |  |  |  |
| sos | surface_salt |  |  |  |  |
| SSU | usurf | i-surface current | m/sec | average | real*4 |
| SSV | vsurf | j-surface current | m/sec | average | real*4 |

daily 2d maximum fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
|  | bottom_temp |  |  |  |  |
| e_D | sea_level | effective sea level (eta_t + patm/(rho0*g)) on T cells | meter | max | real*4 |
| tos | surface_pot_temp | Sea Surface Temperature | deg_C | max | real*4 |

daily 2d minimum fields

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
| tos | surface_pot_temp | Sea Surface Temperature | deg_C | min | real*4 |

daily scalar snapshots

| Diagnostic (MOM6) | Diagnostic (MOM5) | Description | Unit | Method | Packing |
| --- | --- | --- | --- | --- | --- |
|  | eta_global | global ave eta_t plus patm_t/(g*rho0) | meter | none | real*8 |
| KE | ke_tot | Globally integrated ocean kinetic energy | 10^15 Joules | none | real*8 |
|  | pe_tot | Globally integrated ocean potential energy | 10^15 Joules | none | real*8 |
| rhoinsitu | rhoave | global mean ocean in-situ density from ocean_density_mod | kg/m^3 | none | real*8 |
| soga | salt_global_ave | Global mean salt in liquid seawater | psu | none | real*8 |
| sosga | salt_surface_ave | Global mass weighted mean surface salt in liquid seawater | psu | none | real*8 |
| thetaoga | temp_global_ave | Global mean temp in liquid seawater | deg_C | none | real*8 |
| tosga | temp_surface_ave | Global mass weighted mean surface temp in liquid seawater | deg_C | none | real*8 |
|  | total_net_sfc_heating | total ocean surface flux from coupler and mass transfer | Watts/1e15 | none | real*8 |
|  | total_ocean_evap_heat | total latent heat flux into ocean (<0 cools ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_evap | total evaporative ocean mass flux (>0 enters ocean) | (kg/sec)/1e15 | none | real*8 |
|  | total_ocean_fprec_melt_heat | total heat flux to melt frozen precip (<0 cools ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_fprec | total snow falling onto ocean (>0 enters ocean) | (kg/sec)/1e15 | none | real*8 |
|  | total_ocean_heat | Total heat in the liquid ocean referenced to 0degC | Joule/1e25 | none | real*8 |
| total_net_heat_coupler | total_ocean_hflux_coupler | total surface heat flux passed through coupler | Watts/1e15 | none | real*8 |
|  | total_ocean_hflux_evap | total ocean heat flux from evap transferring water across surface | Watts/1e15 | none | real*8 |
|  | total_ocean_hflux_prec | total ocean heat flux from precip transferring water across surface | Watts/1e15 | none | real*8 |
|  | total_ocean_lprec | total liquid precip into ocean (>0 enters ocean) | (kg/sec)/1e15 | none | real*8 |
|  | total_ocean_lw_heat | total longwave flux into ocean (<0 cools ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_melt | total liquid water melted from sea ice (>0 enters ocean) | (kg/sec)/1e15 | none | real*8 |
|  | total_ocean_mh_flux | total heat flux into ocean from melting ice (>0 heats ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_pme_river | total ocean precip-evap+river via sbc (liquid frozen |  | none | real*8 |
|  | total_ocean_river_heat | total heat flux into ocean from liquid+solid runoff (<0 cools ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_river | total liquid river water and calving ice entering ocean | kg/sec/1e15 | none | real*8 |
|  | total_ocean_runoff_heat | total ocean heat flux from liquid river runoff | Watts/1e15 | none | real*8 |
|  | total_ocean_runoff | total liquid river runoff (>0 water enters ocean) | (kg/sec)/1e15 | none | real*8 |
|  | total_ocean_salt | total mass of salt in liquid seawater | kg/1e18 | none | real*8 |
|  | total_ocean_sens_heat | total sensible heat into ocean (<0 cools ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_sfc_salt_flux_coupler |  |  |  |  |
|  | total_ocean_swflx_vis | total visible shortwave into ocean (>0 heats ocean) | Watts/1e15 | none | real*8 |
|  | total_ocean_swflx | total shortwave flux into ocean (>0 heats ocean) | Watts/1e15 | none | real*8 |

@aekiss
Contributor

aekiss commented Jul 22, 2024

Awesome, thanks @minghangli-uni this is super helpful!

@aekiss
Contributor

aekiss commented Jul 22, 2024

For diagnostics with missing description, unit, method and packing, these details are not present in the MOM_diags.txt or in mom5. I am not sure how these diagnostics are traced. Any thoughts? @aekiss

Some diagnostic names are generated algorithmically, so they don't show up if you search for them in the code. The description and unit should be available from output .nc files, e.g. in /g/data/cj50/access-om2/raw-output/access-om2-01/01deg_jra55v140_iaf/output000/ocean, but it might be easier to use the cookbook tools (cc.querying.get_variables), since the cookbook database has indexed this metadata - see https://cosima-recipes.readthedocs.io/en/latest/Tutorials/Using_Explorer_tools.html#COSIMA-Cookbook-solution

@minghangli-uni
Contributor

minghangli-uni commented Jul 22, 2024

Thank you @aekiss !

But I am still not sure about the exact process (algorithmically) of how these diagnostics were generated. While the descriptions and units can be tracked from the output nc files, these are essentially the final output files. Are we expected to have similar diagnostics in MOM6 as well? If so, we may end up tracking similar processes as MOM5.

@minghangli-uni
Contributor

Below is a sample of the diag_table generated by diag_table_source.yaml and make_diag_table.py with minor tweaks:

  • Static grids are included in a single netcdf file.
  • Scalar fields are included in a single netcdf file.
  • 2D and 3D fields (daily or monthly) are saved in separate files.
  • We need to discuss which fields require remapped grids.

If everyone is happy with this format, I will generate the first draft.


ACCESS-OM3-025
1900 1 1 0 0 0

static 2d grid data

"access-om3.mom6.h.static", -1, "months", 1, "days", "time"
"ocean_model", "area_t", "area_t", "access-om3.mom6.h.static", "all", "none", "none", 2
"ocean_model", "area_u", "area_u", "access-om3.mom6.h.static", "all", "none", "none", 2

monthly 3d fields

"access-om3.mom6.h.3d.agessc.1.monthly.mean.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "agessc", "agessc", "access-om3.mom6.h.3d.agessc.1.monthly.mean.ym%4yr%2mo", "all", "mean", "none", 2

"access-om3.mom6.h.3d.difvho.1.monthly.mean.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "difvho", "difvho", "access-om3.mom6.h.3d.difvho.1.monthly.mean.ym%4yr%2mo", "all", "mean", "none", 2

monthly 3d squared fields

"access-om3.mom6.h.3d.uo.1.monthly.pow02.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "uo", "uo", "access-om3.mom6.h.3d.uo.1.monthly.pow02.ym%4yr%2mo", "all", "pow02", "none", 2

"access-om3.mom6.h.3d.vo.1.monthly.pow02.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "vo", "vo", "access-om3.mom6.h.3d.vo.1.monthly.pow02.ym%4yr%2mo", "all", "pow02", "none", 2

monthly 2d fields

"access-om3.mom6.h.2d.KHTH_t.1.monthly.mean.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "KHTH_t", "KHTH_t", "access-om3.mom6.h.2d.KHTH_t.1.monthly.mean.ym%4yr%2mo", "all", "mean", "none", 2

"access-om3.mom6.h.2d.KHTR_h.1.monthly.mean.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "KHTR_h", "KHTR_h", "access-om3.mom6.h.2d.KHTR_h.1.monthly.mean.ym%4yr%2mo", "all", "mean", "none", 2

monthly 2d fields with different reduction methods

"access-om3.mom6.h.2d.mlotst.1.monthly.max.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "mlotst", "mlotst", "access-om3.mom6.h.2d.mlotst.1.monthly.max.ym%4yr%2mo", "all", "max", "none", 2

"access-om3.mom6.h.2d.tos.1.monthly.min.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
"ocean_model", "tos", "tos", "access-om3.mom6.h.2d.tos.1.monthly.min.ym%4yr%2mo", "all", "min", "none", 2

daily 2d fields

"access-om3.mom6.h.2d.tob.1.daily.mean.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "tob", "tob", "access-om3.mom6.h.2d.tob.1.daily.mean.ym%4yr%2mo", "all", "mean", "none", 2

"access-om3.mom6.h.2d.mlotst.1.daily.mean.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "mlotst", "mlotst", "access-om3.mom6.h.2d.mlotst.1.daily.mean.ym%4yr%2mo", "all", "mean", "none", 2

daily 2d maximum fields

"access-om3.mom6.h.2d.tob.1.daily.max.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "tob", "tob", "access-om3.mom6.h.2d.tob.1.daily.max.ym%4yr%2mo", "all", "max", "none", 2

"access-om3.mom6.h.2d.e_D.1.daily.max.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "e_D", "e_D", "access-om3.mom6.h.2d.e_D.1.daily.max.ym%4yr%2mo", "all", "max", "none", 2

daily 2d minimum fields

"access-om3.mom6.h.2d.tos.1.daily.min.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "tos", "tos", "access-om3.mom6.h.2d.tos.1.daily.min.ym%4yr%2mo", "all", "min", "none", 2

daily scalar snapshots

"access-om3.mom6.h.scalar.1.daily.ym%4yr%2mo", 1, "days", 1, "days", "time", 1, "years"
"ocean_model", "soga", "soga", "access-om3.mom6.h.scalar.1.daily.ym%4yr%2mo", "all", "none", "none", 1
"ocean_model", "sosga", "sosga", "access-om3.mom6.h.scalar.1.daily.ym%4yr%2mo", "all", "none", "none", 1

monthly 2d snapshots

monthly 3d snapshots
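
For checking a generated table against the MOM5-to-MOM6 mapping, a small parser can list which fields are requested and with which reduction. This is only a sketch, not part of make_diag_table.py: the function name `parse_diag_table` is hypothetical, and it assumes the field-line layout used in the sample above (module, field, output name, file, sampling, reduction, region, packing) and that file lines are the ones whose second entry is an integer.

```python
import csv


def parse_diag_table(text):
    """Return (files, fields) from diag_table text.

    files:  file names declared on file lines
    fields: (field_name, file_name, reduction) tuples from field lines
    """
    files, fields = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # skipinitialspace lets csv handle quoted entries after ", "
        row = [entry.strip() for entry in next(csv.reader([line], skipinitialspace=True))]
        if len(row) < 6:
            continue  # title, base date, or a section label
        if row[1].lstrip("-").isdigit():
            # File line: name, output frequency, frequency units, ...
            files.append(row[0])
        elif len(row) >= 8:
            # Field line: module, field, output name, file, sampling, reduction, region, packing
            fields.append((row[1], row[3], row[5]))
    return files, fields


def undeclared_files(text):
    """Files referenced by field lines but never declared on a file line."""
    files, fields = parse_diag_table(text)
    return sorted({file_name for _, file_name, _ in fields} - set(files))
```

Running `undeclared_files` on a draft table is a cheap consistency check before a test run; it says nothing about whether MOM6 actually provides the requested diagnostics.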

@dougiesquire
Collaborator Author

A couple of comments/questions, using the following as an example:

"access-om3.mom6.h.3d.agessc.1.monthly.mean.ym%4yr%2mo", 1, "months", 1, "days", "time", 1, "years"
  • I think 1.monthly should be 1monthly (assuming that the 1 is the number of months)
  • Why do we need ym? Can this be dropped?
  • I think the date should reflect the time period of the data in the file. For example, this file includes 1 year of data so the time stamp should be %4yr only

Altogether, this would mean the above changes to:

"access-om3.mom6.h.3d.agessc.1monthly.mean.%4yr", 1, "months", 1, "days", "time", 1, "years"

or if we want to be more concise we could shorten the frequency to be consistent with CMIP6 vocab (though this only really saves a few characters)

"access-om3.mom6.h.3d.agessc.1mon.mean.%4yr", 1, "months", 1, "days", "time", 1, "years"

Thoughts @minghangli-uni, @aekiss?

@minghangli-uni
Contributor

@dougiesquire Thanks for the comments.

I am more inclined to this format, "access-om3.mom6.h.3d.agessc.1mon.mean.%4yr", 1, "months", 1, "days", "time", 1, "years"
It looks concise and doesn’t seem to lose any information.

@aekiss
Contributor

aekiss commented Jul 26, 2024

ym is just an aesthetic workaround because FMS prepends _ prior to the date, and it looks ugly having that preceded by another separator - see COSIMA/access-om2#185 (comment).

In our case
access-om3.mom6.h.3d.agessc.1mon.mean.%4yr
will produce filenames like
access-om3.mom6.h.3d.agessc.1mon.mean._1900.nc
and the ._ may offend those of a sensitive disposition.

@aekiss
Contributor

aekiss commented Jul 26, 2024

I'm not sure that .y_1900 looks a whole lot better than ._1900 though.

@aekiss
Contributor

aekiss commented Jul 26, 2024

Other than that, I'm happy with

"access-om3.mom6.h.3d.agessc.1mon.mean.%4yr", 1, "months", 1, "days", "time", 1, "years"

although it would be even better if the nuopc prefix was shortened somehow to omit h and maybe access- too (ditto for cice6 and ww3 outputs).

@aekiss
Contributor

aekiss commented Jul 26, 2024

@minghangli-uni I'm happy to look over your diag_table_source.yaml when you've finished.

@minghangli-uni
Contributor

I am thinking of adding some code to make_diag_table.py to include a description for each of the diagnostics. This might make the table a bit messy, but it should clarify the diag_table. It's not urgent, though.

@aekiss
Contributor

aekiss commented Jul 31, 2024

I didn't realise FMS will replace other delimiters with underscores. That's weirdly controlling.

It also ignores anything following date notation, so you have no choice but to put the date at the end.

@dougiesquire
Collaborator Author

dougiesquire commented Aug 1, 2024

Okay, my goal is to make a call and move forward on this today. We can always change things later.

I think I've come round to the idea of post-processing so that everything is consistent

THIS HAS BEEN UPDATED BELOW

MOM6

Output as
access-om3.mom6.<n_dimension>.<variable_name/descriptor>.<frequency>.<time_cell_method><datestamp>.nc
and post-process away the annoying underscores

E.g.

  • access-om3.mom6.3d.agessc.1day.mean_1900_01.nc
    is post-processed to
    access-om3.mom6.3d.agessc.1day.mean.1900-01.nc
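
A minimal sketch of that rename step, assuming the FMS datestamp is always the trailing `_YYYY`, `_YYYY_MM` or `_YYYY_MM_DD` immediately before `.nc`. The function name `tidy_mom6_name` is hypothetical. Only the trailing datestamp is touched, so underscores inside variable names are left alone.

```python
import re

# Trailing FMS-style datestamp: _YYYY, _YYYY_MM or _YYYY_MM_DD before ".nc"
_FMS_DATE = re.compile(r"_(\d{4})(?:_(\d{2}))?(?:_(\d{2}))?\.nc$")


def tidy_mom6_name(filename):
    """Rewrite '<stem>_1900_01.nc' as '<stem>.1900-01.nc'.

    Names without a trailing datestamp (e.g. static files) are returned unchanged.
    """
    match = _FMS_DATE.search(filename)
    if match is None:
        return filename
    date = "-".join(part for part in match.groups() if part is not None)
    return f"{filename[:match.start()]}.{date}.nc"


print(tidy_mom6_name("access-om3.mom6.3d.agessc.1day.mean_1900_01.nc"))
```

The actual renaming on disk (for example with `pathlib.Path.rename`) is left out here so the mapping can be checked on its own first.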

What should we do for multi-variable files that have a range of time cell-methods? Should we just always drop the <time_cell_method> when there are multiple variables as was done for OM2?

CICE6

Output as
access-om3.cice.h.<datestamp>.nc (current cap behaviour)
and post-process to remove .h, concatenate daily files into monthly files, and add <frequency> and <time_cell_method> info. (It may be more robust to change some of these in the cap, rather than post-process.)

E.g.

  • access-om3.cice.h.1900-01.nc
    is post-processed to
    access-om3.cice.1mon.mean.1900-01.nc
  • access-om3.cice.h.1900-01-*.nc
    is post-processed to
    access-om3.cice.1day.mean.1900-01.nc
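
The daily-to-monthly grouping could be planned along these lines. This is a sketch only: `plan_monthly_concat` is a hypothetical helper, it assumes daily files are named `<prefix>.cice.h.YYYY-MM-DD.nc` as in the example above, and it assumes the daily output is a time mean. It only works out which files belong together; the concatenation itself would be done with a separate NetCDF tool.

```python
import re
from collections import defaultdict

# Daily CICE history file: <prefix>.cice.h.YYYY-MM-DD.nc
_DAILY = re.compile(r"^(?P<prefix>.+)\.cice\.h\.(?P<month>\d{4}-\d{2})-\d{2}\.nc$")


def plan_monthly_concat(filenames):
    """Map each monthly target name to the sorted daily files it would hold."""
    plan = defaultdict(list)
    for name in filenames:
        match = _DAILY.match(name)
        if match is None:
            continue  # monthly files and anything else are ignored
        target = f"{match['prefix']}.cice.1day.mean.{match['month']}.nc"
        plan[target].append(name)
    return {target: sorted(days) for target, days in plan.items()}
```

Sorting the daily names is enough to get time order here because the datestamp is zero-padded.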

WW3

I'm not really across these outputs, but with nuopc I think the outputs currently look like
access-om3.ww3.hi.YYYY-MM-DD-SSSSS.nc

I don't see why these couldn't be post-processed to a format like
access-om3.ww3.1day.mean.YYYY-MM.nc
but perhaps @ezhilsabareesh8 or @anton-seaice can comment?

@anton-seaice
Contributor

anton-seaice commented Aug 1, 2024

CICE6

Output as access-om3.cice.h.<datestamp>.nc (current cap behaviour) and post-process to remove .h, concatenate daily files into monthly files, and add <frequency> and <time_cell_method> info. (It may be more robust to change some of these in the cap, rather than post-process.)

E.g.

* `access-om3.cice.h.1900-01.nc`
  is post-processed to
  `access-om3.cice.1mon.mean.1900-01.nc`

* `access-om3.cice.h.1900-01-*.nc`
  is post-processed to
  `access-om3.cice.1day.mean.1900-01.nc`

We can add the frequency & cell method through the namelist option hist_suffix. To remove the .h we need to patch this one line in the driver, which means we only need post-processing to concatenate daily data into monthly files.

(i.e.
hist_suffix = "1day.mean", "1mon.mean", "x" , "x", "x"
where
histfreq = "d", "m", "x", "x", "x"
)

@marc-white

MOM6

What should we do for multi-variable files that have a range of time cell-methods? Should we just always drop the <time_cell_method> when there are multiple variables as was done for OM2?

Is it better to drop the time method completely, or add a special value for this circumstance (e.g. 'mixed')? Either is workable, but not having a missing filename parameter might reduce confusion later on.

@dougiesquire
Collaborator Author

Is it better to drop the time method completely, or add a special value for this circumstance (e.g. 'mixed')? Either is workable, but not having a missing filename parameter might reduce confusion later on.

Good question. I wonder if @aekiss has opinions on this based on experiences with ACCESS-OM2?

@aekiss
Contributor

aekiss commented Aug 1, 2024

Looks good, thanks @dougiesquire!

We haven't hit any problems dropping the time method in ACCESS-OM2.

For OM3 I guess it depends on whether we plan to have processing scripts that expect a consistent number of filename components (e.g. `filename.split('.')`) so we don't need to handle special cases.

@aekiss
Contributor

aekiss commented Aug 1, 2024

access-om3.mom6.3d.agessc.1day.mean_1900_01.nc
is post-processed to
access-om3.mom6.3d.agessc.1day.mean.1900-01.nc

We need to be a bit careful with this because MOM variable names can also include _.

@aekiss
Contributor

aekiss commented Aug 1, 2024

<n_dimension>.<descriptor> for multi-variable output all of the same dimensionality (e.g. 1d.scalar)

My suggestion here doesn't make sense - should be 0d.scalar if we're only considering spatial dimensionality.

ACCESS-OM2 just used scalar, but maybe we want 0d.scalar to keep a consistent number of components, e.g. in case we do filename.split('.') and want a list of a consistent length.

@dougiesquire
Collaborator Author

I personally think it's good to have some flexibility in the naming scheme. I'm sort of viewing everything between the <component> and <frequency> fields as optional and unconstrained. For example, the CICE and WW3 output has nothing between these fields, e.g. access-om3.cice.1day.mean.1900-01.nc. So I think we could drop <n_dimension> in the case of multi-variable files.

As for the <time_cell_method> with multi-variable files, let's include this for consistency across the components. In many cases (e.g. access-om2 1deg_jra55_ryf) the cell method for all variables will be the same. But when it's not we can use mix.
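
As a sketch of that rule (the helper name `time_cell_method_label` is hypothetical), the filename field for a multi-variable file could be derived from the reduction methods of its variables like this:

```python
def time_cell_method_label(methods):
    """Return the shared time cell method, or 'mix' if the variables differ."""
    unique = set(methods)
    if not unique:
        raise ValueError("at least one time cell method is required")
    return unique.pop() if len(unique) == 1 else "mix"
```

For example, a file holding only time means would be labelled `mean`, while one combining `mean` and `max` variables would be labelled `mix`.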

I'll wait for any disagreements and then write this all up into a single comment.

@aekiss
Contributor

aekiss commented Aug 1, 2024

OK I'm happy with that

@minghangli-uni
Contributor

minghangli-uni commented Aug 1, 2024

Thank you @dougiesquire .

I prefer to keep the <component> field for all model components (mom6, cice6, ww3). Something like this access-om3.cice6.cice.1mon.mean.1900-01.nc. It might be easier for users to directly recognise the diagnostics from the component they are working with.

For the scalar ncfiles, I agree to keep <time_cell_method> with multiple variable files when all variables share the same time method, otherwise mix.

@dougiesquire
Collaborator Author

Something like this access-om3.cice6.cice.1mon.mean.1900-01.nc

@minghangli-uni what's the purpose of having both cice6 and cice?

@minghangli-uni
Contributor

I am following the format below, where cice6 is the component and cice is the variable_info. Would that be more consistent?

access-om3.<component>.<variable_info>.<frequency>_<time_cell_method>_<date_stamp>.nc

@aekiss
Contributor

aekiss commented Aug 1, 2024

cice6.cice seems redundant. I'm happy to omit <n_dimension>.<variable_name/descriptor> for cice and ww3 even though it means the number of filename components differs between mom6 and cice6/ww3.

@aekiss
Contributor

aekiss commented Aug 1, 2024

Yet another wrinkle: MOM6 can produce diagnostic output that is remapped onto multiple vertical grids. These need to be distinguished in the filename, because the variable names are unchanged.
It's also possible to output spatially coarsened data.

@dougiesquire
Collaborator Author

> Yet another wrinkle: MOM6 can produce diagnostic output that is remapped onto multiple vertical grids. These need to be distinguished in the filename, because the variable names are unchanged.
> It's also possible to output spatially coarsened data.

Good points @aekiss.

In both of these cases, the diagnostic is included in the diag_table using a modified version of the module_name entry. For example, one might specify diagnostics on a configured density coordinate using an "ocean_model_rho" module_name. Similarly, down-sampled output can be specified by appending _d2 to the module_name (note that only a down-sampling level of 2 is currently supported).

In both of these cases, we can capture the info in the filename by adding the text appended to the module name as optional additional fields:

access-om3.mom6.<n_dimension>.<variable_name/descriptor>.<optional:vertical_coordinate>.<optional:d2>.<frequency>.<time_cell_method>_<datestamp>.nc

Example:
access-om3.mom6.3d.agessc.rho.d2.1day.mean_1900_01.nc
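A minimal sketch of how the text appended to the module_name could be turned into the optional filename fields. It assumes the unmodified module_name is "ocean_model" and that suffixes are joined with underscores; the function name `module_name_to_fields` is hypothetical and not part of MOM6 or FMS.

```python
def module_name_to_fields(module_name, base="ocean_model"):
    """Return the optional filename fields implied by a diag_table module_name.

    Assumes (not confirmed here) that every MOM6 module_name is `base`
    followed by zero or more "_<suffix>" parts, e.g. "_rho" or "_d2".
    """
    if not module_name.startswith(base):
        raise ValueError(f"unexpected module_name: {module_name!r}")
    suffix = module_name[len(base):]
    if suffix and not suffix.startswith("_"):
        raise ValueError(f"unexpected module_name: {module_name!r}")
    # "" -> [], "_rho" -> ["rho"], "_rho_d2" -> ["rho", "d2"]
    return [part for part in suffix.split("_") if part]


fields = module_name_to_fields("ocean_model_rho_d2")
print(".".join(["access-om3", "mom6", "3d", "agessc", *fields]))
```

The fields keep the order in which they appear in the module_name, so the vertical coordinate comes before d2 as in the example above.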

Are people happy with this?

@aekiss
Contributor

aekiss commented Aug 5, 2024

Sounds good. My only concern is that having two optional fields will make it ambiguous to parse if only one of these is used. Should we require the vertical coordinate field, e.g. using nat for native coords instead of omitting it entirely?

@dougiesquire
Collaborator Author

> My only concern is that having two optional fields will make it ambiguous to parse if only one of these is used. Should we require the vertical coordinate field, e.g. using nat for native coords instead of omitting it entirely?

Yes whoops, good point.

I'll let this sit with people for a bit and then (re)write up everything above into a new post.

@aekiss
Contributor

aekiss commented Aug 5, 2024

alternatively, and a little more compactly, could make d2 (downscaled) or d1 (full resolution) mandatory

@dougiesquire
Collaborator Author

> alternatively, and a little more compactly, could make d2 (downscaled) or d1 (full resolution) mandatory

Unfortunately the vertical coordinate only makes sense for 3D output, and the downsampling only makes sense for output that is at least 2D. So they might both have to be optional?

The d2 field is almost a flag: if it's there it's downsampled, if it isn't it's not, so parsing shouldn't be too difficult (though hopefully we won't be relying on the filename for anything important!)

@dougiesquire
Collaborator Author

dougiesquire commented Aug 7, 2024

Okay here's what I think we've arrived at:

The general scheme:
access-om3.<model>.<optional:model_specific_fields>.<frequency>.<time_cell_method>.<datestamp>.nc
is followed for all models. Where

  • <frequency> follows CMIP6 vocab
  • <datestamp> uses - separator
  • Multi-variable files with more than one <time_cell_method> use <time_cell_method> = mix
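A minimal sketch of the mix rule for multi-variable files, assuming the cell method of each variable in the file is already known as a string. The helper name `file_time_cell_method` is hypothetical.

```python
def file_time_cell_method(methods):
    """Pick the <time_cell_method> field for a file.

    `methods` holds the time cell method of each variable in the file.
    If they all agree, that method is used; otherwise the field is "mix".
    """
    unique = set(methods)
    if not unique:
        raise ValueError("at least one variable is required")
    return unique.pop() if len(unique) == 1 else "mix"


print(file_time_cell_method(["mean", "mean"]))
print(file_time_cell_method(["mean", "snap"]))
```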

MOM6

Model-specific fields: <n_dimension>.<variable_name/descriptor>.<vertical_coordinate>.<d2>

  • <n_dimension> is the number of spatial dimensions, only included for files containing a single variable, e.g. 0d for scalars
  • <vertical_coordinate> is only included for non-native coordinates
  • <d2> is only included for down-sampled output
  • The annoying underscores in the FMS datestamp will be fixed up as a post-processing step
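A minimal sketch of the datestamp fix-up, assuming FMS writes names that end in `_YYYY`, `_YYYY_MM` or `_YYYY_MM_DD` before `.nc` (as in the earlier `mean_1900_01.nc` example). The function name `fix_fms_datestamp` is hypothetical, and the sketch only computes the new name; it does not touch any files.

```python
import re

# Trailing FMS-style datestamp: "_1900", "_1900_01" or "_1900_01_15" before ".nc".
_FMS_DATESTAMP = re.compile(r"_(\d{4}(?:_\d{2}){0,2})\.nc$")


def fix_fms_datestamp(filename):
    """Return `filename` with the datestamp in the agreed form, e.g. ".1900-01.nc"."""
    def _repl(match):
        return "." + match.group(1).replace("_", "-") + ".nc"

    # Names without a trailing underscore datestamp are returned unchanged.
    return _FMS_DATESTAMP.sub(_repl, filename)


print(fix_fms_datestamp("access-om3.mom6.3d.agessc.rho.d2.1day.mean_1900_01.nc"))
```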

E.g.

  • 3d variable on native vertical coords: access-om3.mom6.3d.agessc.1day.mean.1900-01.nc
  • 3d variable on rho coords: access-om3.mom6.3d.agessc.rho.1day.mean.1900-01.nc
  • 3d variable on native vertical coords and downsampled: access-om3.mom6.3d.agessc.d2.1day.mean.1900-01.nc
  • 3d variable on rho coords and downsampled: access-om3.mom6.3d.agessc.rho.d2.1day.mean.1900-01.nc
  • Multi-variable with more than one <time_cell_method>: access-om3.mom6.scalar.1day.mix.1900-01.nc
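A minimal sketch that assembles MOM6 filenames from the fields listed above and reproduces the examples. The function name `mom6_filename` and its argument names are hypothetical; the field order is taken from the scheme in this comment.

```python
def mom6_filename(descriptor, frequency, time_cell_method, datestamp,
                  n_dimension=None, vertical_coordinate=None, downsampled=False):
    """Build an access-om3 MOM6 filename from its fields.

    n_dimension is only given for single-variable files, vertical_coordinate
    only for non-native coordinates, and downsampled only for d2 output.
    """
    fields = ["access-om3", "mom6"]
    if n_dimension is not None:
        fields.append(n_dimension)
    fields.append(descriptor)
    if vertical_coordinate is not None:
        fields.append(vertical_coordinate)
    if downsampled:
        fields.append("d2")
    fields += [frequency, time_cell_method, datestamp, "nc"]
    return ".".join(fields)


print(mom6_filename("agessc", "1day", "mean", "1900-01",
                    n_dimension="3d", vertical_coordinate="rho", downsampled=True))
print(mom6_filename("scalar", "1day", "mix", "1900-01"))
```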

It’s not ideal, but I think there are rules we can follow to reliably parse info.
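One possible set of parsing rules, as a sketch: the last three fields before `.nc` are always frequency, time cell method and datestamp; a leading field of the form `<digit>d` is the number of dimensions; a trailing `d2` is the down-sampling flag; anything left after the descriptor is the vertical coordinate. The function name `parse_mom6_filename` is hypothetical, and the rules assume no descriptor or vertical coordinate is itself called `d2` and that no field contains a dot.

```python
import re


def parse_mom6_filename(filename):
    """Split an access-om3 MOM6 filename into its fields (returned as a dict)."""
    parts = filename.split(".")
    if len(parts) < 7 or parts[:2] != ["access-om3", "mom6"] or parts[-1] != "nc":
        raise ValueError(f"not an access-om3 MOM6 filename: {filename!r}")
    # The three fields before ".nc" are mandatory and in fixed positions.
    *middle, frequency, time_cell_method, datestamp = parts[2:-1]

    n_dimension = None
    if len(middle) >= 2 and re.fullmatch(r"\dd", middle[0]):
        n_dimension = middle.pop(0)

    downsampled = len(middle) >= 2 and middle[-1] == "d2"
    if downsampled:
        middle.pop()

    if len(middle) > 2:
        raise ValueError(f"too many fields in {filename!r}")
    descriptor = middle[0]
    vertical_coordinate = middle[1] if len(middle) == 2 else None

    return {
        "n_dimension": n_dimension,
        "descriptor": descriptor,
        "vertical_coordinate": vertical_coordinate,
        "downsampled": downsampled,
        "frequency": frequency,
        "time_cell_method": time_cell_method,
        "datestamp": datestamp,
    }


print(parse_mom6_filename("access-om3.mom6.3d.agessc.d2.1day.mean.1900-01.nc"))
```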

CICE6

No model-specific fields (yet).

E.g.

  • access-om3.cice.1mon.mean.1900-01.nc
  • access-om3.cice.1day.mean.1900-01.nc

WW3

No model-specific fields (yet).

  • Currently outputs access-om3.ww3.hi.YYYY-MM-DD-SSSSS.nc so will need to post-process or patch the NUOPC cap

E.g.

  • access-om3.ww3.1day.snap.1900-01.nc

Speak now or forever hold your peace

@ezhilsabareesh8
Contributor

ezhilsabareesh8 commented Aug 7, 2024

> I don't see why these couldn't be post-processed to a format like
> access-om3.ww3.1day.mean.YYYY-MM.nc
> but perhaps @ezhilsabareesh8 or @anton-seaice can comment?

@dougiesquire WW3 cannot do time averages in its history output (refer here); the output access-om3.ww3.hi.YYYY-MM-DD-SSSSS.nc is a snapshot. The mean output can be post-processed to the format access-om3.ww3.1day.mean.YYYY-MM.nc once we adopt these changes to the wave cap by NorESM.

Regarding changing access-om3.ww3.hi.YYYY-MM-DD-SSSSS.nc, I think it is difficult to change the output format. The default output of WW3 is a binary file; however, when the history option is present in the wave_modelio of nuopc.runconfig, WW3 allows the user to write a gridded netcdf output file using the w3iogoncd function at a user-defined frequency. All these gridded output files are snapshots. However, the default format of the WW3 time string is YYYY-MM-DD-SSSSS (refer here). Once the history alarm triggers the w3iogoncd function, WW3 calls the set_user_timestring function from the w3timemd module (refer here) and converts the current time to YYYY-MM-DD-SSSSS format.
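For anyone post-processing these files, a minimal sketch of reading the YYYY-MM-DD-SSSSS stamp back out of a history filename, where SSSSS is taken to be seconds since the start of the day. The function name `ww3_history_time` is hypothetical, and the sketch uses Python's standard (proleptic Gregorian) `datetime`, so it would not handle other model calendars or year 0000.

```python
import re
from datetime import datetime, timedelta

_WW3_HISTORY = re.compile(
    r"^access-om3\.ww3\.hi\.(\d{4})-(\d{2})-(\d{2})-(\d{5})\.nc$"
)


def ww3_history_time(filename):
    """Return the snapshot time encoded in a WW3 history filename."""
    match = _WW3_HISTORY.match(filename)
    if match is None:
        raise ValueError(f"not a WW3 history filename: {filename!r}")
    year, month, day, seconds = (int(group) for group in match.groups())
    # SSSSS is interpreted as seconds elapsed since 00:00 on that day.
    return datetime(year, month, day) + timedelta(seconds=seconds)


print(ww3_history_time("access-om3.ww3.hi.1900-01-02-43200.nc"))
```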

@aidanheerdegen

> I think I've come round to the idea of post-processing so that everything is consistent

Is this still the case? I really don't think this is a good idea if so.

Every post-processing step is another step in the provenance chain. If we can't capture it we're weakening the connections between model run and output data. If we do capture it, then it's just more complexity.

As @dougiesquire pointed out above, any post-processing step also has the potential for corrupting files. A simple mv isn't so bad, but if we're doing other post-processing actions that modify data or meta-data at all there is the potential for errors that can go undetected for a long period of time.
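If a rename step were used anyway, one way to guard against silent corruption and keep a provenance record is to checksum the file before and after the move. This is only a sketch under that assumption: `sha256sum` and `rename_with_check` are hypothetical helpers, not part of any existing ACCESS-OM3 tooling.

```python
import hashlib
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rename_with_check(src, dst):
    """Rename src to dst, refusing to overwrite and verifying the contents.

    Returns a small record that could be written to a provenance log.
    """
    src, dst = Path(src), Path(dst)
    if dst.exists():
        raise FileExistsError(dst)
    before = sha256sum(src)
    src.rename(dst)
    after = sha256sum(dst)
    if before != after:
        raise RuntimeError(f"checksum changed while renaming {src} to {dst}")
    return {"source": str(src), "destination": str(dst), "sha256": after}
```

The returned record is the extra link in the provenance chain: it ties the original model output name to the renamed file.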

If it's as simple as some output filename formatting, I'd definitely prefer to enable it in code.

7 participants