Description
Brief summary of bug
The FATES test ERP_D_P32x2_Ld3.f19_g17.I2000Clm50FatesCru.cheyenne_intel.clm-FatesColdDef fails in the run phase because it hits the wallclock limit when run with ctsm5.1.dev095 and later.
General bug information
CTSM version you are using: ctsm5.1.dev095
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected: ctsm-fates
Details of bug
Failed case can be found here on Cheyenne:
/glade/u/home/glemieux/scratch/ctsm-tests/tests_dev096-baselinegen/ERP_D_P32x2_Ld3.f19_g17.I2000Clm50FatesCru.cheyenne_intel.clm-FatesColdDef.GC.dev096-baselinegen_int
Looking at the logs, the case appears to get stuck somewhere in initialization and then ends up hitting the wallclock limit. The cesm.log ends with the following:
28:MOSART decomp info proc = 28 begr = 226801 endr = 234900 numr = 8100
29:MOSART decomp info proc = 29 begr = 234901 endr = 243000 numr = 8100
30:MOSART decomp info proc = 30 begr = 243001 endr = 251100 numr = 8100
31:MOSART decomp info proc = 31 begr = 251101 endr = 259200 numr = 8100
/glade/u/apps/ch/opt/mpt/2.22/bin/omplace: line 1: 58410 Terminated dplace -p $placefile "$@"
MPT: Received signal 15
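For anyone triaging similar hangs, here is a minimal sketch of how to pull the last progress line written before the job was killed out of a log tail like the one quoted above. The function name `last_progress_before_signal` and the matching rules are my own assumptions for illustration, not part of CTSM or CIME tooling. It only relies on the `MPT: Received signal N` and `Terminated` strings seen in this log.

```python
import re

# Pattern for the MPT termination message seen at the end of this cesm.log.
SIGNAL_RE = re.compile(r"MPT: Received signal (\d+)")


def last_progress_before_signal(log_lines):
    """Return (last non-termination line, signal number).

    Hypothetical helper: returns (None, None) if no signal message is found.
    """
    last_progress = None
    for raw in log_lines:
        line = raw.strip()
        if not line:
            continue
        match = SIGNAL_RE.search(line)
        if match:
            return last_progress, int(match.group(1))
        # Skip the shell's "Terminated" notice; it is not model output.
        if "Terminated" in line:
            continue
        last_progress = line
    return None, None


if __name__ == "__main__":
    tail = [
        "30:MOSART decomp info proc = 30 begr = 243001 endr = 251100 numr = 8100",
        "31:MOSART decomp info proc = 31 begr = 251101 endr = 259200 numr = 8100",
        '/glade/u/apps/ch/opt/mpt/2.22/bin/omplace: line 1: 58410 Terminated dplace -p $placefile "$@"',
        "MPT: Received signal 15",
    ]
    print(last_progress_before_signal(tail))
```

For this log, the last progress line is the MOSART decomposition output for proc 31, which is consistent with the hang occurring after MOSART decomposition is reported during initialization.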
Important details of your setup / configuration so we can reproduce the bug
This was tested with FATES tag sci.1.57.0_api.23.0.0. I also tested this with dev096 and dev094. The case runs successfully in a reasonable amount of time with dev094.
This was first seen while dealing with NGEET/fates#861 (comment) and could be avoided by implementing #1762. As noted in that FATES issue, the problem persists for this threading configuration with debug mode both off and on.