B1850 compsets are failing cesm2_3_alpha17d and cesm2_3_alpha17e #2520

Closed
ekluzek opened this issue May 3, 2024 · 6 comments · Fixed by #2501

@ekluzek

ekluzek commented May 3, 2024

Brief summary of bug

@fischer-ncar found that tests such as SMS_Ld7.f09_g17.B1850.derecho_intel.allactive-defaultio are failing in cesm2_3_alpha17d and cesm2_3_alpha17e. The finidat files being used require use_init_interp=TRUE, and for some reason it is not being set.

General bug information

CTSM version you are using: ctsm5.2.0

Does this bug cause significantly incorrect results in the model's science? No

Configurations affected: f09 @ B1850

Details of bug

Important details of your setup / configuration so we can reproduce the bug

In cesm2_3_alpha17d and cesm2_3_alpha17e

Important output or errors that show the problem

dec1768.hsn.de.hpc.ucar.edu 393:  check_dim_size ERROR: mismatch of input dimension        50591
dec1768.hsn.de.hpc.ucar.edu 393:   with expected value        48292  for variable landunit
dec1768.hsn.de.hpc.ucar.edu 393: Did you mean to set use_init_interp = .true. in user_nl_clm?
dec1768.hsn.de.hpc.ucar.edu 393: (Setting use_init_interp = .true. is needed when doing a
dec1768.hsn.de.hpc.ucar.edu 393: transient run using an initial conditions file from a non-transient run,
dec1768.hsn.de.hpc.ucar.edu 393: or a non-transient run using an initial conditions file from a transient run,
dec1768.hsn.de.hpc.ucar.edu 393: or when running a resolution or configuration that differs from the initial conditions.)
dec1768.hsn.de.hpc.ucar.edu 393:  ERROR:
dec1768.hsn.de.hpc.ucar.edu 393:  ERROR in /glade/u/home/fischer/code/cesm2_3_alpha17e/components/clm/src/main/nc
dec1768.hsn.de.hpc.ucar.edu 393:  dio_pio.F90.in at line 409
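
When triaging many failed tests, it can help to pull the mismatched dimension out of each log automatically. The following is a minimal sketch, not part of CTSM or CIME: the function name `find_dim_mismatches` and the regular expression are assumptions based only on the message format shown in this log.

```python
import re

# Matches the two-line check_dim_size message as it appears in the log above.
# The optional middle group skips the "<host> <rank>:" prefix on the second line.
DIM_ERROR = re.compile(
    r"check_dim_size ERROR: mismatch of input dimension\s+(\d+)\s+"
    r"(?:\S+\s+\d+:\s+)?with expected value\s+(\d+)\s+for variable\s+(\w+)"
)

def find_dim_mismatches(log_text):
    """Return (variable, found, expected) for each check_dim_size error (hypothetical helper)."""
    return [
        (var, int(found), int(expected))
        for found, expected, var in DIM_ERROR.findall(log_text)
    ]

sample = """\
dec1768.hsn.de.hpc.ucar.edu 393:  check_dim_size ERROR: mismatch of input dimension        50591
dec1768.hsn.de.hpc.ucar.edu 393:   with expected value        48292  for variable landunit
"""
mismatches = find_dim_mismatches(sample)
print(mismatches)
```

A mismatch on landunit, as here, indicates that the initial conditions file was produced for a different configuration than the one being run, which is the case the error message itself points to.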
@ekluzek ekluzek added the bug something is working incorrectly label May 3, 2024
@ekluzek ekluzek added this to the cesm2_3_beta17 milestone May 3, 2024
@ekluzek ekluzek self-assigned this May 3, 2024
@ekluzek

ekluzek commented May 3, 2024

The B1850 compset used in the test is: 1850_CAM60_CLM50%BGC-CROP_CICE_POP2%ECO_MOSART_CISM2%GRIS-NOEVOLVE_WW3_SESP_BGC%BDRD

user_nl_clm

 hist_dov2xy    = .true.
 hist_ndens     = 1
 hist_nhtfrq    =-24
 hist_mfilt     = 1

It also does something similar for MOSART and RTM, which shouldn't matter. Other user_nl_* settings shouldn't matter. The CLM_ XML settings also seem to be fine.

There are no f09 tests for 1850Clm50BgcCropG in aux_clm, also no tests for it in ctsm_sci. The tests that are in aux_clm:

ERP_D_Ld5.f10_f10_mg37.I1850Clm50BgcCropG.derecho_gnu.clm-glcMEC_changeFlags
ERP_P256x2_D_Ld5.f19_g17_gris4.I1850Clm50BgcCropG.derecho_intel.clm-glcMEC_increase
ERS_D_Ld12.f10_f10_mg37.I1850Clm50BgcCropG.derecho_intel.clm-glcMEC_spunup_inc_dec_bgc

(but none of those would necessarily catch a problem for f09)

Tests that I DO THINK should have caught this:

LII2FINIDATAREAS_D_P256x2_Ld1.f09_g17.I1850Clm50BgcCrop.derecho_intel.clm-default
SMS_Ld2_D_PS.f09_g17.I1850Clm50BgcCropCmip6.derecho_intel.clm-basic_interp (explicitly turns interp on)
SMS_Ld3.f09_g17.I1850Clm50BgcCropCru.derecho_intel.clm-default (failed in build though)
SSP_Ld4.f09_g17.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP (failed in build though)

The first two passed, and the last two failed in the build step. The second explicitly turns interp on, so the only one that could have caught it is the first one.

The unexpected differences in the lnd_in namelist that I see are:

<  finidat = '/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/initdata_map/clmi.B1850Clm50BgcCrop.0161-01-01.0.9x1.25_gx1v7_simyr1850_c200729.nc'
---
>  finidat = '/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/initdata_esmf/ctsm5.2/clmi.I1850Clm50BgcCrop-ciso.1366-01-01.0.9x1.25_gx1v7_simyr1850_c240223.nc'
20c21
<  glc_do_dynglacier = .true.
---
>  glc_do_dynglacier = .false.

So the B1850 case is using an incorrect finidat file, and needs to be updated to use the ctsm5.2.0 one. I think that setting might be in CESM rather than in CTSM, but I'll need to track that down to be sure. I'm also perplexed by the dynglacier setting.
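
A comparison like the one above can be scripted when several cases need checking. This is a minimal sketch under stated assumptions: it only handles simple one-line `key = value` entries, the helper names `parse_namelist` and `diff_namelists` are hypothetical, and the file names in the sample are placeholders rather than real inputdata paths.

```python
import re

def parse_namelist(text):
    """Collect simple one-line `key = value` entries from namelist text into a dict."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(.+?)\s*$", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings

def diff_namelists(left, right):
    """Return {key: (left_value, right_value)} for every key whose value differs."""
    a, b = parse_namelist(left), parse_namelist(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }

# Placeholder values for illustration only.
failing_case = """
&clm_inparm
 finidat = 'old.nc'
 glc_do_dynglacier = .true.
 hist_ndens = 1
/
"""
passing_case = """
&clm_inparm
 finidat = 'new.nc'
 glc_do_dynglacier = .false.
 hist_ndens = 1
/
"""
differences = diff_namelists(failing_case, passing_case)
for key, (old, new) in differences.items():
    print(f"{key}: {old} -> {new}")
```

Settings that are identical in both files (hist_ndens in the sample) are left out, so only the candidates for the failure remain.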

@ekluzek

ekluzek commented May 4, 2024

The glc_do_dynglacier setting is expected because the compset has active CISM. It looks like there are a bunch of settings in namelist_defaults that need to have the

use_init_interp=".true."

attribute added in.
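
One way to find candidate entries is to list the finidat elements that lack the attribute. This is only a sketch: the XML layout in `SAMPLE`, the `hgrid` and `lnd_tuning_mode` attribute names, and the file names are assumptions for illustration, not a copy of the real namelist_defaults file, and an entry showing up in the list does not by itself mean it needs the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a namelist_defaults file; the structure is assumed.
SAMPLE = """\
<namelist_defaults>
  <finidat hgrid="0.9x1.25" lnd_tuning_mode="clm5_0_cam6.0" use_init_interp=".true.">a.nc</finidat>
  <finidat hgrid="1.9x2.5" lnd_tuning_mode="clm5_0_cam6.0">b.nc</finidat>
</namelist_defaults>
"""

def finidat_without_interp(xml_text):
    """List (hgrid, lnd_tuning_mode, file) for finidat entries lacking use_init_interp=".true."."""
    root = ET.fromstring(xml_text)
    return [
        (elem.get("hgrid"), elem.get("lnd_tuning_mode"), (elem.text or "").strip())
        for elem in root.iter("finidat")
        if elem.get("use_init_interp") != ".true."
    ]

missing = finidat_without_interp(SAMPLE)
print(missing)
```

Each listed entry would still need to be reviewed by hand against the file it points to.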

In terms of testing, it turns out that very few of the tests being run can work without use_init_interp=TRUE. The only finidat files that allow that are for f09 and f19, as well as a few SE grids (ne0np4.*.ne30x8 and ne120np4.pg3). I think those are only being tested for clm6_0 physics. The ctsm_sci test list should probably include more tests at clm5_0 because of this sort of thing.

@ekluzek

ekluzek commented May 4, 2024

Oh, the reason the finidat files do NOT match is the LND_TUNING_MODE. That's part of the problem here: we should have more tests for f09 and f19 that set LND_TUNING_MODE to CAM6 (and for a few different CLM physics options). So I think everything makes sense to me again...

@ekluzek

ekluzek commented May 6, 2024

An important note: the B and F compsets are testing with clm5_0 and NOT clm5_1, and clm5_1 is what will be used for the coupled simulations in this tag. As such, these failures do NOT necessarily need to hold up the cesm2_3_beta17 tag.

@wwieder we've had some email about this with @briandobbins and @fischer-ncar. Brian is also going to ask @dlawrenncar.

@ekluzek ekluzek modified the milestones: cesm2_3_beta17, cesm2_3_beta18 May 6, 2024
@ekluzek ekluzek added the priority: Immediate Highest priority, something that was unexpected label May 6, 2024
@ekluzek

ekluzek commented May 6, 2024

Manually examining the CAM testlist I see the following...

<test compset="F2000climo" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%SP CAM60
<test compset="F2010climo" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%SP CAM60
<test compset="FHIST" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%BGC-CROP CAM60
<test compset="FHIST_BGC" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%SP CAM60
<test compset="FHIST_BDRD" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%BGC-CROP CAM60 (with CESM_BDRD) which is already turned on for CLM IHist compsets
    (IHist compsets already run with diagnostic CO2 so fulfil this one)
<test compset="F1850" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s"> CLM50%SP CAM60
<test compset="FHIST" grid="f19_f19_mg17" name="SMS_Ln9" testmods="cam/outfrq9s_nochem"> CLM50%SP CAM60
WACCM supported compsets
Looks like WACCM doesn't add anything surprising here that isn't taken into account from above...
    <alias>FWHIST</alias>
    <alias>FWHIST_BGC</alias>
    <alias>FWsc2010climo</alias>
    <alias>FWsc2000climo</alias>
    <alias>FWsc1850</alias>
    <alias>FWscHIST</alias>
    <alias>FW1850</alias>
<test compset="FWHIST_BGC" grid="f09_f09_mg17" name="ERP_Ld3" testmods="cam/reduced_hist1d"> Same as FHIST_BGC for CLM
<test compset="FWHIST" grid="f09_f09_mg17" name="ERP_Ld3" testmods="cam/outfrq1d"> Same as FHIST_BGC for CLM
<test compset="FWsc2010climo" grid="f19_f19_mg17" name="SMS_D_Ln9" testmods="cam/outfrq9s"> Same as F2010climo for CLM
<test compset="FWsc2000climo" grid="f10_f10_mg37" name="ERP_Ld3" testmods="cam/outfrq1d_14dec"> Same as F2000climo for CLM
<test compset="FWsc1850" grid="f09_f09_mg17" name="SMS_D_Ln9" testmods="cam/outfrq9s"> Same as F1850 for CLM
<test compset="FWscHIST" grid="f09_f09_mg17" name="SMS_D_Ln9" testmods="cam/outfrq9s"> Same as FHIST_BGC for CLM
<test compset="FW1850" grid="f09_f09_mg17" name="SMS_Ld1" testmods="cam/outfrq1d"> Same as F1850 for CLM

Hence, for CAM testing right now we need I1850Clm50Sp (for CAM6), I2000Clm50Sp (for CAM6), I2010Clm50Sp (for CAM6), IHistClm50BgcCrop (for CAM6), and IHistClm50BgcCrop (for CAM6). The science-supported resolutions are f09 and f19. Also note that WACCM uses *_NCPL of 288, which shouldn't be a problem in our testing, but is something we might need to ensure our testing covers.
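
The manual pass over the CAM testlist could be partly automated by grouping compsets by grid. The sketch below is an assumption-laden illustration: it uses a regular expression rather than an XML parser because the pasted entries are unclosed tags, the helper name `compsets_by_grid` is hypothetical, and the sample holds only three of the entries quoted above.

```python
import re
from collections import defaultdict

# Extracts the compset, grid, and test name attributes from a <test ...> entry.
TEST_ATTRS = re.compile(r'<test compset="([^"]+)" grid="([^"]+)" name="([^"]+)"')

def compsets_by_grid(listing):
    """Group the compset aliases found in a testlist excerpt by grid."""
    grouped = defaultdict(set)
    for compset, grid, _name in TEST_ATTRS.findall(listing):
        grouped[grid].add(compset)
    return {grid: sorted(names) for grid, names in grouped.items()}

listing = """
<test compset="F2000climo" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s">
<test compset="F1850" grid="f09_f09_mg17" name="ERP_Ln9" testmods="cam/outfrq9s">
<test compset="FHIST" grid="f19_f19_mg17" name="SMS_Ln9" testmods="cam/outfrq9s_nochem">
"""
result = compsets_by_grid(listing)
for grid, compsets in result.items():
    print(grid, compsets)
```

Mapping each CAM compset alias to the matching CLM I compset still has to be done by hand, as in the annotations above.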

Also note that CAM FHist compsets start at a variety of different years. We don't necessarily need to start at all these different years, but we do need to make sure they will work.

	<value  compset="HIST_CAM">1979-01-01</value>
	<value  compset="HIST_CAM60%WCTS_CLM50%BGC-CROP">1950-01-01</value>
	<value  compset="HIST_CAM40%WX">2000-01-01</value>
	<value  compset="HIST_CAM60%WCMD">2005-01-01</value>
	<value  compset="HIST_CAM60%WCMD%SDYN" grid="a%1.9x2.5">1980-01-01</value>
	<value  compset="HIST_CAM60%WCSC">1850-01-01</value>
	<value  compset="HIST_CAM60%CCTS[12]">2010-01-01</value>
	<value  compset="HIST_CAM60%GEOSCHEM">2015-01-01</value>
	<value  compset="HIST_CAM60%CCTS[12]" grid="a%ne0np4CONUS">2013-01-01</value>
	<value  compset="HIST_CAM60%CVBSX">1995-01-01</value>
	<value  compset="HIST_CAM60%CFIRE">1995-01-01</value>

Since HIST compsets have to use use_init_interp=.true., these later startup years should be fine as long as I1850 and IHist work. So the above compset list is still OK. When #2498 comes in, we'll need to reassess this a bit.

Once CAM is updated to use CLM60 physics in its testlist, we should update tests to work with clm6_0_cam7.0. Also, I assume that ne30np4.pg3 will be a required science-supported resolution for CAM.

@ekluzek

ekluzek commented May 6, 2024

For CESM testing, CLM51%BGC-CROP and CLM50%BGC-CROP (for cam6.0) in 1850 and HIST compsets are important. So IHistClm50BgcCropG, IHistClm51BgcCropG, I1850Clm50BgcCropG, and I1850Clm51BgcCropG at f09, f19, ne30pg3_t061, and ne30pg3_t232 cover current testing. Updating to clm6_0_cam7.0 when CESM does is also needed.
