Draft
41 commits
d88b8f1
Add some additional timers around decomp level activity
ekluzek Jun 26, 2025
6cf4b8e
Start adding control for decomp_init self tests and add ability for s…
ekluzek Jun 26, 2025
ae5435b
Merge remote-tracking branch 'escomp/b4b-dev' into add_decomp_init_se…
ekluzek Jun 26, 2025
a27de36
Add namelist controls for self testing
ekluzek Jun 27, 2025
2fd081b
Changes to exit early when self test namelist option used for_testing…
ekluzek Jul 2, 2025
c90d475
Merge branch 'b4b-dev' into add_decomp_init_self_tests
ekluzek Jul 9, 2025
6eaadd4
Bring in the share branch with the memory logger from John Dennis
ekluzek Jul 11, 2025
3a94e8e
Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_study_…
ekluzek Jul 11, 2025
f18e4b0
Update proc_status_vm to use shr_sys_abort, and iulog from shr_log, a…
ekluzek Jul 11, 2025
71de4c9
Turn off restarts and history and add some timer options as well as t…
ekluzek Jul 11, 2025
efd2129
Fix proc_status_vm from the changes I made, it's now reporting properly
ekluzek Jul 14, 2025
d9e212b
Fix proc_status_vm from the changes I made, it's now reporting properly
ekluzek Jul 14, 2025
213ff9c
Add calls for evaluating memory
ekluzek Jul 14, 2025
0abc15c
Put memory stuff only under masterproc to only report on a single tas…
ekluzek Jul 15, 2025
c1bfd83
Add a PE layout for mpas13p75
ekluzek Jul 29, 2025
3a32519
Start adding timers to lnd_set_decomp_and_domain_from_readmesh
ekluzek Jul 29, 2025
33c8f75
Merge remote-tracking branch 'escomp/b4b-dev' into decomp_init_study_…
ekluzek Jul 29, 2025
d531303
Turn off RTM rather than increase the ROF coupling frequency
ekluzek Jul 30, 2025
402584e
Turn off urban HAC completely and minimize urban in gridcells
ekluzek Jul 30, 2025
31ad846
Add a testmod for mpasa3p75 grid
ekluzek Jul 30, 2025
be81f3a
Add decomp initialization test and test list for ultra high resolutio…
ekluzek Jul 30, 2025
95ab014
Fix syntax and correct 3p75 resolution grid name for test
ekluzek Jul 30, 2025
6ee9077
Still need to set NCPL_ROF
ekluzek Jul 30, 2025
f03a0eb
Fix name of mpasa3p75 testmod in test
ekluzek Jul 30, 2025
ed4c49e
Fix syntax error
ekluzek Jul 30, 2025
b5ab98c
Remove the mpasa15 test from expected fails
ekluzek Jul 30, 2025
8914b12
Add timers for clm_initialize2 that cover the whole subroutine
ekluzek Jul 31, 2025
f294b31
Add another timer within part3, and also turn off some of the history…
ekluzek Jul 31, 2025
1bd2408
Balance check doesn't take time, so adjust the timers again for part3
ekluzek Jul 31, 2025
60bd85e
Add memory checking calls through the lnd_set_decomp_and_domain_from_…
ekluzek Aug 1, 2025
8c5debb
Remove one of the memory checks as it wasn't needed
ekluzek Aug 1, 2025
3f5cff5
Add some timers for clmInstInit
ekluzek Aug 1, 2025
b30d9e0
Combine timers for part3/4/5 as they are all small
ekluzek Aug 1, 2025
373b84c
Add timers for urbantv Init and InitVertical
ekluzek Aug 1, 2025
4f7de29
Add a timer around just the strdata_init
ekluzek Aug 1, 2025
7f03d77
Make an internal subroutine for deallocation inside of lnd_set_decomp…
ekluzek Aug 1, 2025
02f894e
Add release of the ESMF objects in the lnd_set_decomp_and_domain_from…
ekluzek Aug 1, 2025
57b04cd
ESMF tells me that some of these objects are used later and can not b…
ekluzek Aug 1, 2025
8cf101a
Turn on removing all ESMF garbage for the things deleted, and add not…
ekluzek Aug 1, 2025
ccdd13c
Fix XML name for RTM_MODE
ekluzek Aug 1, 2025
03722cd
Call shr_malloc_trim so that memory is released by the OS after the d…
ekluzek Aug 4, 2025
7 changes: 5 additions & 2 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,11 @@ fxDONOTUSEurl = https://github.com/ESCOMP/CDEPS.git

[submodule "share"]
path = share
url = https://github.com/ESCOMP/CESM_share
fxtag = share1.1.9
#url = https://github.com/ESCOMP/CESM_share
url = https://github.com/ekluzek/CESM_share
#fxtag = share1.1.9
#fxtag = add_jdennis_procstatus_module
fxtag = 9973692556da54f9562935be43c1d43b0607d24b
fxrequired = ToplevelRequired
# Standard Fork to compare to with "git fleximod test" to ensure personal forks aren't committed
fxDONOTUSEurl = https://github.com/ESCOMP/CESM_share
Expand Down
11 changes: 11 additions & 0 deletions bld/namelist_files/namelist_definition_ctsm.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1242,12 +1242,23 @@ Whether to use subgrid fluxes for snow
Whether snow on the vegetation canopy affects the radiation/albedo calculations
</entry>

<entry id="for_testing_exit_after_self_tests" type="logical" category="default_settings"
group="clm_inparm" >
Whether to exit early after the initialization self tests are run. This is typically only used in automated tests.
</entry>

<entry id="for_testing_run_ncdiopio_tests" type="logical" category="default_settings"
group="clm_inparm" >
Whether to run some tests of ncdio_pio as part of the model run. This is
typically only used in automated tests.
</entry>

<entry id="for_testing_run_decomp_init_tests" type="logical" category="default_settings"
group="clm_inparm" >
Whether to run some tests of decompInit (which sets up the gridcell-to-MPI-task decomposition) as part of the model run. This is
typically only used in automated tests.
</entry>

<entry id="for_testing_use_second_grain_pool" type="logical" category="default_settings"
group="clm_inparm" >
If true, allocate memory for and use a second crop grain pool. This is
Expand Down
38 changes: 38 additions & 0 deletions cime_config/config_pes.xml
Original file line number Diff line number Diff line change
Expand Up @@ -2092,6 +2092,44 @@
</rootpe>
</pes>
</mach>
</grid>
<grid name="l%mpasa3p75">
<mach name="derecho">
<pes pesize="any" compset="any">
<comment>none</comment>
<ntasks>
<ntasks_atm>-1</ntasks_atm>
<ntasks_lnd>-80</ntasks_lnd>
<ntasks_rof>-80</ntasks_rof>
<ntasks_ice>-80</ntasks_ice>
<ntasks_ocn>-80</ntasks_ocn>
<ntasks_glc>-80</ntasks_glc>
<ntasks_wav>-80</ntasks_wav>
<ntasks_cpl>-80</ntasks_cpl>
</ntasks>
<nthrds>
<nthrds_atm>1</nthrds_atm>
<nthrds_lnd>1</nthrds_lnd>
<nthrds_rof>1</nthrds_rof>
<nthrds_ice>1</nthrds_ice>
<nthrds_ocn>1</nthrds_ocn>
<nthrds_glc>1</nthrds_glc>
<nthrds_wav>1</nthrds_wav>
<nthrds_cpl>1</nthrds_cpl>
</nthrds>
<rootpe>
<rootpe_atm>0</rootpe_atm>
<rootpe_lnd>-1</rootpe_lnd>
<rootpe_rof>-1</rootpe_rof>
<rootpe_ice>-1</rootpe_ice>
<rootpe_ocn>-1</rootpe_ocn>
<rootpe_glc>-1</rootpe_glc>
<rootpe_wav>-1</rootpe_wav>
<rootpe_cpl>-1</rootpe_cpl>
</rootpe>
</pes>
</mach>
</grid>
<grid name="l%0.125nldas2">
<mach name="any">
Expand Down
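The negative values in the `l%mpasa3p75` layout above are node counts rather than task counts. A minimal Python sketch of that conversion follows. It assumes CIME's usual convention that a negative `ntasks` or `rootpe` value is multiplied by the number of MPI tasks per node, and it assumes 128 tasks per node on derecho (not stated in this diff). The helper names `to_tasks` and `resolve_layout` are hypothetical.

```python
# Sketch: expand node-based (negative) PE-layout values into task counts.
# TASKS_PER_NODE = 128 is an assumption for derecho, not taken from this diff.
TASKS_PER_NODE = 128


def to_tasks(value, tasks_per_node=TASKS_PER_NODE):
    """Negative values mean whole nodes; non-negative values are already tasks."""
    return -value * tasks_per_node if value < 0 else value


def resolve_layout(ntasks, rootpe, tasks_per_node=TASKS_PER_NODE):
    """Return {component: (first_task, n_tasks)} for a node/task layout."""
    return {
        comp: (to_tasks(rootpe[comp], tasks_per_node), to_tasks(n, tasks_per_node))
        for comp, n in ntasks.items()
    }


# Values from the l%mpasa3p75 entry: atm on one node, all others on 80 nodes.
ntasks = {"atm": -1, "lnd": -80, "rof": -80, "ice": -80,
          "ocn": -80, "glc": -80, "wav": -80, "cpl": -80}
rootpe = {"atm": 0, "lnd": -1, "rof": -1, "ice": -1,
          "ocn": -1, "glc": -1, "wav": -1, "cpl": -1}

layout = resolve_layout(ntasks, rootpe)
# Highest task index in use, plus one, gives the total task count.
total_tasks = max(first + count for first, count in layout.values())
print(layout["atm"], layout["lnd"], total_tasks)
```

Under these assumptions the atmosphere component gets the first node to itself and every other component shares the next 80 nodes, for 81 nodes in total.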
7 changes: 0 additions & 7 deletions cime_config/testdefs/ExpectedTestFails.xml
Original file line number Diff line number Diff line change
Expand Up @@ -363,11 +363,4 @@

<!-- decomp_init test list-->

<test name="SMS_Ln1_PL.mpasa15_mpasa15.I2000Clm45Sp.derecho_intel.clm-run_self_tests">
<phase name="RUN">
<status>FAIL</status>
<issue>#3316</issue>
</phase>
</test>

</expectedFails>
10 changes: 10 additions & 0 deletions cime_config/testdefs/testlist_clm.xml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
matrixcn: Tests exercising the matrix-CN capability
aux_clm_mpi_serial: aux_clm tests using mpi-serial. Useful for redoing tests that failed due to https://github.com/ESCOMP/CTSM/issues/2916, after having replaced libraries/mpi-serial with a fresh copy.
decomp_init: Initialization tests specifically for examining the PE layout decomposition initialization
uhr_decomp_init: Initialization tests at Ultra High Resolution - specifically for examining the PE layout decomposition initialization
-->
<testlist version="2.0">
<test name="ERI_D_Ld9" grid="f10_f10_mg37" compset="I1850Clm60Bgc" testmods="clm/default">
Expand Down Expand Up @@ -4209,6 +4210,15 @@
<option name="comment">Initialization test for mpasa15 with "Large" layout</option>
</options>
</test>
<test name="SMS_Ln1" grid="mpasa3p75_mpasa3p75_mt13" compset="I2000Clm45Sp" testmods="clm-run_self_tests--clm-mpasa3p75">
<machines>
<machine name="derecho" compiler="intel" category="uhr_decomp_init"/>
</machines>
<options>
<option name="wallclock">0:15:00</option>
<option name="comment">Initialization test for mpasa3p75 default layout</option>
</options>
</test>

<test name="LGRAIN2_Ly1_P128x1" grid="f10_f10_mg37" compset="I1850Clm50BgcCrop" testmods="clm/ciso--clm/cropMonthOutput">
<machines>
Expand Down
6 changes: 6 additions & 0 deletions cime_config/testdefs/testmods_dirs/clm/mpasa3p75/user_nl_clm
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
! Settings currently required to run on the mpasa3p75 grid:
! use urbantv files at that resolution and a redistribution mapping

stream_fldfilename_urbantv = '/glade/derecho/scratch/bdobbins/ko/tbuildmax.nc'
stream_meshfile_urbantv = '/glade/derecho/scratch/bdobbins/ko/mesh.nc'
urbantvmapalgo = 'redist'
9 changes: 7 additions & 2 deletions cime_config/testdefs/testmods_dirs/clm/run_self_tests/README
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
The purpose of this testmod directory is to trigger the runtime
self-tests. This runs a suite of unit/integration tests.
The purpose of this testmod directory is to trigger runtime
initialization self-tests. This runs a set of unit/integration tests
that apply at initialization.

We use cold start so that we can get through initialization faster,
since how we initialize the model is unimportant for these self-tests.
We also exit as early as possible to minimize the time spent.

Other self-tests need to be exercised during model time stepping;
those are run outside of this testmod.
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,25 @@
# We use this testmod in a _Ln1 test; this requires forcing the ROF coupling frequency to the same frequency as DATM
./xmlchange ROF_NCPL='$ATM_NCPL'

# Turn off ROF model when used with compsets that have them
./xmlchange RTM_MODE='NULL'

# Turn MEGAN off to run faster
./xmlchange CLM_BLDNML_OPTS='--no-megan' --append

# Use fast structure and NWP configuration for speed
./xmlchange CLM_STRUCTURE="fast"
./xmlchange CLM_CONFIGURATION="nwp"

# Turn cpl history off
./xmlchange HIST_OPTION="never"
# Restarts aren't allowed for these tests; also turn off CPL history
# Change env_test.xml first, then the standard file, so xmlchange won't complain about a mismatch
./xmlchange --force REST_OPTION="never" --file env_test.xml
./xmlchange --force HIST_OPTION="never" --file env_test.xml
./xmlchange REST_OPTION="never"
./xmlchange HIST_OPTION="never"

# Timer settings
./xmlchange TIMER_DETAIL="2"
./xmlchange SAVE_TIMING="TRUE"
./xmlchange CHECK_TIMING="TRUE"
./xmlchange ESMF_PROFILING_LEVEL="10"
Original file line number Diff line number Diff line change
@@ -1 +1,9 @@
for_testing_run_ncdiopio_tests = .true.
for_testing_run_ncdiopio_tests = .false.
for_testing_run_decomp_init_tests = .true.
for_testing_exit_after_self_tests = .true.

! Turn off history, restarts, and output
hist_empty_htapes = .true.
use_noio = .true.
urban_hac = 'OFF'
toosmall_urban = 98.0d00 ! Minimize urban in gridcells
2 changes: 1 addition & 1 deletion share
Submodule share updated 1 files
+265 −0 src/proc_status_vm.F90
27 changes: 26 additions & 1 deletion src/cpl/nuopc/lnd_comp_nuopc.F90
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ module lnd_comp_nuopc
use clm_varctl , only : single_column, clm_varctl_set, iulog
use clm_varctl , only : nsrStartup, nsrContinue, nsrBranch
use clm_varctl , only : FL => fname_len
use clm_varctl , only : for_testing_exit_after_self_tests
use clm_time_manager , only : set_timemgr_init, advance_timestep
use clm_time_manager , only : update_rad_dtime
use clm_time_manager , only : get_nstep, get_step_size
Expand Down Expand Up @@ -80,6 +81,7 @@ module lnd_comp_nuopc

logical :: glc_present
logical :: rof_prognostic
logical :: atm_present
logical :: atm_prognostic
integer, parameter :: dbug = 0
character(*),parameter :: modName = "(lnd_comp_nuopc)"
Expand Down Expand Up @@ -284,6 +286,11 @@ subroutine InitializeAdvertise(gcomp, importState, exportState, clock, rc)
else
atm_prognostic = .true.
end if
if (trim(atm_model) == 'satm') then
atm_present = .false.
else
atm_present = .true.
end if
call NUOPC_CompAttributeGet(gcomp, name='GLC_model', value=glc_model, rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
if (trim(glc_model) == 'sglc') then
Expand All @@ -310,6 +317,9 @@ subroutine InitializeAdvertise(gcomp, importState, exportState, clock, rc)
write(iulog,'(a )')' rof component = '//trim(rof_model)
write(iulog,'(a )')' glc component = '//trim(glc_model)
write(iulog,'(a,L2)')' atm_prognostic = ',atm_prognostic
if (.not. atm_present) then
write(iulog,'(a,L2)')' atm_present = ',atm_present
end if
write(iulog,'(a,L2)')' rof_prognostic = ',rof_prognostic
write(iulog,'(a,L2)')' glc_present = ',glc_present
if (glc_present) then
Expand All @@ -328,7 +338,8 @@ subroutine InitializeAdvertise(gcomp, importState, exportState, clock, rc)
call control_setNL("lnd_in"//trim(inst_suffix))


call advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, rof_prognostic, atm_prognostic, rc)
call advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, rof_prognostic, &
atm_prognostic, atm_present, rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return

!----------------------------------------------------------------------------
Expand Down Expand Up @@ -492,6 +503,12 @@ subroutine InitializeRealize(gcomp, importState, exportState, clock, rc)
else
single_column = .false.
end if
if ( for_testing_exit_after_self_tests) then
! *******************
! *** RETURN HERE ***
! *******************
RETURN
end if

!----------------------------------------------------------------------------
! Reset shr logging to my log file
Expand Down Expand Up @@ -771,6 +788,9 @@ subroutine ModelAdvance(gcomp, rc)
if (single_column .and. .not. scol_valid) then
RETURN
end if
if (for_testing_exit_after_self_tests) then
RETURN
end if

!$ call omp_set_num_threads(nthrds)

Expand Down Expand Up @@ -1002,6 +1022,7 @@ subroutine ModelSetRunClock(gcomp, rc)
rc = ESMF_SUCCESS
call ESMF_LogWrite(subname//' called', ESMF_LOGMSG_INFO)
if (.not. scol_valid) return
if (for_testing_exit_after_self_tests) return

! query the Component for its clocks
call NUOPC_ModelGet(gcomp, driverClock=dclock, modelClock=mclock, rc=rc)
Expand Down Expand Up @@ -1285,6 +1306,7 @@ subroutine clm_orbital_update(clock, logunit, mastertask, eccen, obliqr, lambm0
end subroutine clm_orbital_update

subroutine CheckImport(gcomp, rc)
use clm_varctl, only : for_testing_exit_after_self_tests
type(ESMF_GridComp) :: gcomp
integer, intent(out) :: rc
character(len=*) , parameter :: subname = "("//__FILE__//":CheckImport)"
Expand Down Expand Up @@ -1313,6 +1335,9 @@ subroutine CheckImport(gcomp, rc)
if (single_column .and. .not. scol_valid) then
RETURN
end if
if (for_testing_exit_after_self_tests) then
RETURN
end if
! The remainder of this should be equivalent to the NUOPC internal routine
! from NUOPC_ModelBase.F90

Expand Down
19 changes: 14 additions & 5 deletions src/cpl/nuopc/lnd_import_export.F90
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,8 @@ module lnd_import_export
contains
!===============================================================================

subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, rof_prognostic, atm_prognostic, rc)
subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, rof_prognostic, &
atm_prognostic, atm_present, rc)

use shr_carma_mod , only : shr_carma_readnl
use shr_ndep_mod , only : shr_ndep_readnl
Expand All @@ -173,6 +174,7 @@ subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, r
logical , intent(in) :: cism_evolve
logical , intent(in) :: rof_prognostic
logical , intent(in) :: atm_prognostic
logical , intent(in) :: atm_present
integer , intent(out) :: rc

! local variables
Expand Down Expand Up @@ -210,7 +212,9 @@ subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, r

! Need to determine if there is no land for single column before the advertise call is done

if (atm_prognostic .or. force_send_to_atm) then
if (.not. atm_present)then
send_to_atm = .false.
else if (atm_prognostic .or. force_send_to_atm) then
send_to_atm = .true.
else
send_to_atm = .false.
Expand Down Expand Up @@ -253,12 +257,11 @@ subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, r
if (shr_megan_mechcomps_n .ne. megan_nflds) call shr_sys_abort('ERROR: megan field count mismatch')

! CARMA volumetric soil water from land
call shr_carma_readnl('drv_flds_in', carma_fields)

! export to atm
call fldlist_add(fldsFrLnd_num, fldsFrlnd, trim(flds_scalar_name))
call fldlist_add(fldsFrLnd_num, fldsFrlnd, 'Sl_lfrin')
if (send_to_atm) then
call fldlist_add(fldsFrLnd_num, fldsFrlnd, 'Sl_lfrin')
call fldlist_add(fldsFrLnd_num, fldsFrlnd, Sl_t )
call fldlist_add(fldsFrLnd_num, fldsFrlnd, Sl_tref )
call fldlist_add(fldsFrLnd_num, fldsFrlnd, Sl_qref )
Expand Down Expand Up @@ -339,6 +342,9 @@ subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, r

call fldlist_add(fldsToLnd_num, fldsToLnd, trim(flds_scalar_name))

!!!!!!!!!!!!!!!!!!!!!!!!!!! new if section !!!!!!!!!!!!!!!!!!!!!!!!!!
if ( atm_present ) then

! from atm
call fldlist_add(fldsToLnd_num, fldsToLnd, Sa_z )
call fldlist_add(fldsToLnd_num, fldsToLnd, Sa_topo )
Expand Down Expand Up @@ -389,6 +395,9 @@ subroutine advertise_fields(gcomp, flds_scalar_name, glc_present, cism_evolve, r
call fldlist_add(fldsToLnd_num, fldsToLnd, Sa_co2diag)
end if

end if ! atm_present
!!!!!!!!!!!!!!!!!!!!!!!!!!! new if section !!!!!!!!!!!!!!!!!!!!!!!!!!

if (rof_prognostic) then
! from river
call fldlist_add(fldsToLnd_num, fldsToLnd, Flrr_flood )
Expand Down Expand Up @@ -773,14 +782,14 @@ subroutine export_fields( gcomp, bounds, glc_present, rof_prognostic, &
! output to mediator
! -----------------------

if (send_to_atm) then
call state_setexport_1d(exportState, Sl_lfrin, ldomain%frac(begg:), init_spval=.false., rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return

! -----------------------
! output to atm
! -----------------------

if (send_to_atm) then
call state_setexport_1d(exportState, Sl_t , lnd2atm_inst%t_rad_grc(begg:), &
init_spval=.true., rc=rc)
if (ChkErr(rc,__LINE__,u_FILE_u)) return
Expand Down
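The new `atm_present` argument takes precedence over the existing `atm_prognostic` and `force_send_to_atm` checks when `advertise_fields` sets `send_to_atm`. A small Python sketch of that decision, mirroring the if / else-if / else chain in the diff (the function name `decide_send_to_atm` is hypothetical):

```python
# Sketch of the send_to_atm decision in advertise_fields after this change.
from itertools import product


def decide_send_to_atm(atm_present, atm_prognostic, force_send_to_atm):
    """A stub atmosphere (atm_present False) never receives land fields."""
    if not atm_present:
        return False
    return atm_prognostic or force_send_to_atm


# Print the full truth table for the three logical inputs.
for present, prognostic, force in product([False, True], repeat=3):
    result = decide_send_to_atm(present, prognostic, force)
    print(f"present={present!s:5} prognostic={prognostic!s:5} "
          f"force={force!s:5} -> send_to_atm={result}")
```

The only rows that change relative to the old logic are those where `atm_present` is false but `atm_prognostic` or `force_send_to_atm` is true.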
7 changes: 7 additions & 0 deletions src/cpl/share_esmf/UrbanTimeVarType.F90
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ module UrbanTimeVarType
use clm_varcon , only : spval
use LandunitType , only : lun
use GridcellType , only : grc
use perf_mod , only : t_startf, t_stopf
!
implicit none
private
Expand Down Expand Up @@ -143,6 +144,8 @@ subroutine urbantv_init(this, bounds, NLFilename)
stream_meshfile_urbantv, &
urbantv_tintalgo

call t_startf("urbantv_init")

! Default values for namelist
stream_year_first_urbantv = 1 ! first year in stream to use
stream_year_last_urbantv = 1 ! last year in stream to use
Expand Down Expand Up @@ -195,6 +198,7 @@ subroutine urbantv_init(this, bounds, NLFilename)
endif

! Initialize the cdeps data type this%sdat_urbantv
call t_startf("str_data_init")
call shr_strdata_init_from_inline(this%sdat_urbantv, &
my_task = iam, &
logunit = iulog, &
Expand All @@ -219,6 +223,9 @@ subroutine urbantv_init(this, bounds, NLFilename)
if (ESMF_LogFoundError(rcToCheck=rc, msg=ESMF_LOGERR_PASSTHRU, line=__LINE__, file=__FILE__)) then
call ESMF_Finalize(endflag=ESMF_END_ABORT)
end if
call t_stopf("str_data_init")

call t_stopf("urbantv_init")

end subroutine urbantv_init

Expand Down
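The timers added to `urbantv_init` are nested: `str_data_init` starts and stops inside `urbantv_init`, so the outer time always includes the inner one. The Python sketch below illustrates that pairing with a context manager. It is an analogy only, not the `perf_mod` API; the `timer` helper and the `timings` dictionary are hypothetical.

```python
# Sketch: nested start/stop timers, analogous to paired t_startf/t_stopf calls.
import time
from contextlib import contextmanager

timings = {}  # accumulated wall-clock seconds per timer name


@contextmanager
def timer(name):
    """Accumulate elapsed time under `name`; always stops, even on an early exit."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start


with timer("urbantv_init"):
    time.sleep(0.005)          # stand-in for namelist reading
    with timer("str_data_init"):
        time.sleep(0.01)       # stand-in for the stream data initialization

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

In Fortran the start and stop calls are not tied together by the language, so every exit path from the subroutine has to reach the matching `t_stopf`.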