Rethinking (variables in) h2 files for GDD-generating workflow #3319

@samsrabin

The GDD-Generating phase of the RXCROPMATURITY test requests two variables, GDDACCUM and GDDHARV, on its h2 tape(s). From ctsm5.3.062:

# Add stuff specific to GDD-Generating run
logger.info("RXCROPMATURITY log: modify user_nl files: generate GDDs")
self._append_to_user_nl_clm(
    [
        "stream_fldFileName_cultivar_gdds = ''",
        "generate_crop_gdds = .true.",
        "use_mxmat = .false.",
        " ",
        "! (h2) Daily outputs for GDD generation and figure-making",
        "hist_fincl3 = 'GDDACCUM', 'GDDHARV'",
        "hist_nhtfrq(3) = -24",
        "hist_mfilt(3) = 365",
        "hist_type1d_pertape(3) = 'PFTS'",
        "hist_dov2xy(3) = .false.",
    ]
)

Those variables are used to generate the crop growing degree-day (GDD) requirement files used in the second part of the test. That happens via a Python script that, before accumulated and instantaneous history files were split up in ctsm5.3.062, was looking for a file with .h2. in its name. From the tag before that:

clm_gdd_var = "GDDACCUM"
my_vars = [clm_gdd_var, "GDDHARV"]
patterns = [f"*h2.{this_year-1}-01*.nc", f"*h2.{this_year-1}-01*.nc.base"]

Both of those variables are time-averaged by default:

this%gddaccum_patch(begp:endp) = spval
call hist_addfld1d (fname='GDDACCUM', units='ddays', &
     avgflag='A', long_name='Accumulated growing degree days past planting date for crop', &
     ptr_patch=this%gddaccum_patch, default='inactive')

call hist_addfld1d (fname='GDDHARV', units='ddays', &
     avgflag='A', long_name='Growing degree days (gdd) needed to harvest', &
     ptr_patch=this%gddmaturity_patch, default='inactive')

I thus expected that, for ctsm5.3.062, the Python script would need to be changed to look for .h2a. in the filenames. However, during that work (PR #2445), @slevis-lmwg noticed that this didn't solve the problem: only .h2i. worked.
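As a rough illustration of why the old pattern stops matching and why the tape suffix matters, here is a minimal sketch of the filename lookup. The helper names and the example filenames are hypothetical; only the pattern shape is taken from the snippet quoted above.

```python
from fnmatch import fnmatch


def gdd_file_patterns(this_year, tape="h2i"):
    """Build glob patterns for the previous year's outputs on the given tape.

    `tape` is a hypothetical parameter: "h2" mimics the pre-split pattern,
    "h2a"/"h2i" the accumulated/instantaneous files after the split.
    """
    return [
        f"*{tape}.{this_year - 1}-01*.nc",
        f"*{tape}.{this_year - 1}-01*.nc.base",
    ]


def find_matches(files, patterns):
    """Return the files matching at least one of the patterns."""
    return [f for f in files if any(fnmatch(f, p) for p in patterns)]


# Hypothetical filenames, for illustration only
files = [
    "mycase.clm2.h2i.1999-01-01-00000.nc",
    "mycase.clm2.h2a.1999-01-01-00000.nc",
    "mycase.clm2.h2i.2000-01-01-00000.nc",
]

old_matches = find_matches(files, gdd_file_patterns(2000, tape="h2"))
h2i_matches = find_matches(files, gdd_file_patterns(2000, tape="h2i"))
```

With these made-up names, the pre-split pattern finds nothing, while the h2i pattern finds only the 1999 instantaneous file.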

This was pretty confusing until I had a look at the user_nl_clm file. I noticed that, far above the hist_fincl3 = 'GDDACCUM', 'GDDHARV' line requested by RXCROPMATURITY, there was hist_avgflag_pertape(3) = 'I'. This is because the RXCROPMATURITY tests in aux_clm use the cropMonthOutput testmod, which inherits from crop, which has this in its user_nl_clm:

! Instantaneous crop variables (including per-sowing/per-harvest axes), per PFT.
! Note that, under normal circumstances, these should only be saved annually.
! That's needed for the mxsowings and mxharvests axes to make sense.
! However, for testing purposes, it makes sense to save more frequently.
hist_fincl3 = 'SDATES', 'SDATES_PERHARV', 'SYEARS_PERHARV', 'HDATES', 'GRAINC_TO_FOOD_PERHARV', 'GRAINC_TO_FOOD_ANN', 'GRAINN_TO_FOOD_PERHARV', 'GRAINN_TO_FOOD_ANN', 'GRAINC_TO_SEED_PERHARV', 'GRAINC_TO_SEED_ANN', 'GRAINN_TO_SEED_PERHARV', 'GRAINN_TO_SEED_ANN', 'HDATES', 'GDDHARV_PERHARV', 'GDDACCUM_PERHARV', 'HUI_PERHARV', 'SOWING_REASON_PERHARV', 'HARVEST_REASON_PERHARV', 'SWINDOW_STARTS', 'SWINDOW_ENDS', 'GDD20_BASELINE', 'GDD20_SEASON_START', 'GDD20_SEASON_END'
hist_nhtfrq = -24,-8,-24
hist_mfilt = 1,1,1
hist_type1d_pertape(3) = 'PFTS'
hist_avgflag_pertape(3) = 'I'
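To make the interaction concrete, here is a deliberately simplified sketch of how a per-tape averaging flag can override each field's default and so change which file the fields land in. This is not CTSM's actual logic; the function names and the suffix mapping are assumptions for illustration.

```python
def effective_avgflag(field_default, tape_override=None):
    """Return the averaging flag a field ends up with on a given tape.

    Assumption: a per-tape flag, when set, replaces the field's default.
    """
    return tape_override if tape_override else field_default


def tape_suffix(tape_index, avgflag):
    """Hypothetical mapping: 'I' goes to the instantaneous file, anything else to the accumulated one."""
    return f"h{tape_index}{'i' if avgflag == 'I' else 'a'}"


# Defaults from the hist_addfld1d calls quoted earlier
field_defaults = {"GDDACCUM": "A", "GDDHARV": "A"}

without_override = {
    name: tape_suffix(2, effective_avgflag(flag)) for name, flag in field_defaults.items()
}
with_override = {
    name: tape_suffix(2, effective_avgflag(flag, "I")) for name, flag in field_defaults.items()
}
```

Under these assumptions, both fields would go to h2a without the inherited line and to h2i with it, which is consistent with what was observed.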

Initially, I thought I would just add another line to the RXCROPMATURITY GDD-Generating phase user_nl_clm changing it back:

hist_avgflag_pertape(3) = 'A'

Then I realized that it might actually be better for GDDACCUM to be instantaneous in the GDD-generating workflow, to represent the accumulated growing degree-days from planting to the end of the output timestep (rather than the average of that value for all model timesteps in the output timestep). It doesn't make much difference for this test, since the outputs are daily, and anyway we don't care about the scientific correctness for a pass/fail test. But we do care about the correctness of the real workflow, even though the difference would probably be small there too.
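A small worked example of the distinction, using made-up numbers: a crop starts the day with 100 ddays accumulated and gains a constant 0.25 ddays on each of 48 model timesteps. The function name and all values are hypothetical, and averaging the end-of-timestep values is an assumption about how the time average is formed.

```python
def daily_gddaccum(start, increment, steps_per_day):
    """Compare instantaneous vs. time-averaged accumulated GDD for one output day.

    Assumes a constant per-timestep increment, purely for illustration.
    """
    value = start
    values = []
    for _ in range(steps_per_day):
        value += increment
        values.append(value)
    instantaneous = values[-1]            # value at the end of the output timestep
    averaged = sum(values) / len(values)  # mean over all model timesteps in the day
    return instantaneous, averaged


inst, avg = daily_gddaccum(start=100.0, increment=0.25, steps_per_day=48)
```

Here `inst` is 112.0 and `avg` is 106.125: the averaged value lags the true end-of-day accumulation by roughly half a day's worth of GDD.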

THEN I realized that GDDHARV (growing degree-days required for the crop to reach maturity) is in the exact same situation. It's useful for this workflow to have it synced with GDDACCUM, and thus it should be instantaneous too.

(I went on a long sojourn where I thought maybe these should be instantaneous by default, but I realized it's really just this uncommon workflow where instantaneous is useful.)

So here's what I think I should do:

  1. Test setting hist_avgflag_pertape(3) = 'I' vs. 'A' in real GDD-generating runs (not just in this test). Does it break things? It should make just a slight difference. Result: the difference is indeed slight; see "Investigate: Does forcing instantaneous h2 files break GDD generation?" (#3320).

If it's a big difference, that bears more investigation. If it's only a slight difference:

  1. Change the default GDD requirement files for CTSM to use the I results, because I'm pretty sure I didn't set hist_avgflag_pertape(3) = 'I' before. (I also never generated GDD requirements based on CRU-JRA! So this will serve that purpose as well.)
  2. Add an explicit hist_avgflag_pertape(3) = 'I' in user_nl_clm for the GDD-Generating phase of RXCROPMATURITY tests (to avoid relying on the inheritance from the crop testdef), and add comments in various places explaining it.
  3. Update the documentation of the GDD-generating workflow to include that line as well.
  4. Try deleting hist_avgflag_pertape(3) = 'I' from the crop testdef's user_nl_clm (after "Fix string replacements in lreprstruct test" (#3314) is merged).
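Step 2 above could look something like the following sketch, which reuses the namelist lines from the RXCROPMATURITY snippet and adds the explicit flag. The variable names and the wording of the added comment line are placeholders, not the final change.

```python
# Namelist lines for the GDD-Generating phase, with the averaging flag made explicit
gdd_generating_nl_lines = [
    "stream_fldFileName_cultivar_gdds = ''",
    "generate_crop_gdds = .true.",
    "use_mxmat = .false.",
    " ",
    "! (h2) Daily outputs for GDD generation and figure-making",
    "! Instantaneous, so GDDACCUM and GDDHARV refer to the end of the output timestep",
    "hist_fincl3 = 'GDDACCUM', 'GDDHARV'",
    "hist_nhtfrq(3) = -24",
    "hist_mfilt(3) = 365",
    "hist_type1d_pertape(3) = 'PFTS'",
    "hist_dov2xy(3) = .false.",
    # Set explicitly rather than relying on inheritance from the crop testdef
    "hist_avgflag_pertape(3) = 'I'",
]

# Text that would be appended to user_nl_clm
user_nl_text = "\n".join(gdd_generating_nl_lines)
```

Setting the flag here means the phase no longer depends on which testmod the test happens to inherit from.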
