
restart reproducibility issue for global fv3 runs #272

Closed
junwang-noaa opened this issue Nov 12, 2020 · 16 comments · Fixed by #304
Labels
bug Something isn't working

Comments

@junwang-noaa
Collaborator

Description

The global fv3 runs (including the regression test cases) do not reproduce in restart cases when the model restarts from fh=24 hr (e.g. the control runs for 48 hr, the restart runs from 24 hr to 48 hr, and the results at fh=48 hr from control and restart are different). The code does reproduce when the restart starts within 24 hr and the results are compared at fh<=24 hr (e.g. control 24 hr, restart 12 hr->24 hr, results at fh=24 hr are identical), but results do not reproduce when compared at a forecast time beyond 24 hr (e.g. control runs 36 hr, restart runs from 12 hr to 36 hr, results from control and restart are identical at fh=24 hr but not at fh=36 hr).

To Reproduce:

The issue can be reproduced on all the supported platforms, including Hera, Orion, and WCOSS.

  1. Check out the code and run the fv3 control regression test for 48 hr with the restart interval set to 24 in model_configure.
  2. Copy the run directory and remove all the output files (dynf*, phyf*, logf*). Copy the 24 hr restart files from step 1 into the input directory of the copied run directory and rename them so the date is removed from the file names.
  3. Change the following namelist variables:
    warm_start=.true.
    nggps_ic=.false.
    external_ic=.false.
    mountain=.true.
    make_nh=.false.
    na_init=0
    Note: if the cold-start run uses NSST spin up (nstf_name(2)=1), nstf_name(2) must be turned off for the restart run (nstf_name(2)=0), e.g. nstf_name=2,0,1,0,5.
  4. Submit the job, then compare the output files with those created in step 1. A sketch of the namelist changes is given below.
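
For reference, here is a minimal sketch of what the restart-run changes in step 3 could look like. The namelist group names (&fv_core_nml for the dycore flags, &gfs_physics_nml for nstf_name) and the model_configure entry reflect the usual FV3/UFS layout and should be verified against the files in the actual run directory:

    ! input.nml of the restart run -- only the changed entries are shown
    &fv_core_nml
      warm_start  = .true.    ! start from the warm-start restart files
      nggps_ic    = .false.
      external_ic = .false.
      mountain    = .true.
      make_nh     = .false.
      na_init     = 0         ! no adiabatic initialization on restart
    /
    &gfs_physics_nml
      nstf_name   = 2,0,1,0,5 ! second entry 0: NSST spin up off for the restart
    /

    # model_configure of both runs: write restart files every 24 hours
    restart_interval:        24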
@junwang-noaa added the bug label Nov 12, 2020
@climbfuji
Collaborator

This must be something recent! The FV3_GSD_v0 suite passes exactly this test with the code in https://github.com/NOAA-GSL/ufs-weather-model (default branch is gsd/develop); it is run every time we make a commit (see tests/rt_ccpp_gsd.conf). The last commit we merged from ufs-community/ufs-weather-model into this branch is from October 1:

commit 208f36dfa7e13be18967c60cca01a64ca02de4c7
Author: Dom Heinzeller <dom.heinzeller@icloud.com>
Date:   Thu Oct 1 06:16:09 2020 -0600

    CCPP tendencies bugfixes, global restart reproducibility, halo boundary update in dycore (#208)

You should be able to go back to this hash and get b4b reproducible results in develop. Unless it is something specific to the suite you are using that doesn't wreak havoc for the FV3_GSD_v0 suite.

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 12, 2020 via email

@junwang-noaa
Collaborator Author

junwang-noaa commented Nov 12, 2020 via email

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 12, 2020 via email

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 12, 2020 via email

@junwang-noaa
Collaborator Author

junwang-noaa commented Nov 12, 2020 via email

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 12, 2020 via email

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 12, 2020 via email

@SMoorthi-emc
Contributor

SMoorthi-emc commented Nov 14, 2020 via email

@junwang-noaa
Collaborator Author

Moorthi, thank you! That is good news! I don't have specific suggestions, but it looks to me that we may need to avoid using the PROD compiler options on certain files. @climbfuji, do you have any suggestions? If I remember correctly, you did similar work when making CCPP reproduce IPD results before.

@climbfuji
Collaborator

Moorthi, thank you! That is good news! I don't have specific suggestions, but it looks to me that we may need to avoid using the PROD compiler options on certain files. @climbfuji, do you have any suggestions? If I remember correctly, you did similar work when making CCPP reproduce IPD results before.

Thanks for all the detective work. I agree, we have to identify which file or routine is causing the difference, and then which of the three PROD optimizations (-xCORE-AVX2, -no-prec-div, -no-prec-sqrt) is responsible. I would do this as follows, using the CCPP debugging routines in GFS_debug.F90.

  • put calls to GFS_diagtoscreen and GFS_interstitialtoscreen into the suite definition file, right after the interstitial_rad_reset
  • modify the two _run routines for the two schemes to only produce output for the kdt value that corresponds to the first timestep after the warmstart
  • do the full run (be sure to have --label in the srun call in the job submission script, so that the MPI rank gets prepended - this allows you to split the stdout/stderr file later by task)
  • do the coldstart/warmstart run
  • split stdout and stderr by MPI rank (a possible one-liner for this is sketched after this list), then use a graphical diff tool (e.g. meld) to compare the directories and all the split files
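
A minimal sketch of the per-rank split, assuming the job was run with srun --label so every line of the log starts with the rank followed by ": " (file names here are illustrative):

    # split a --label-prefixed log into one file per MPI rank
    awk -F': ' 'NF > 1 { print > ("stdout.rank_" $1) }' job.out

Running the same command on the logs of the control run and the coldstart/warmstart run gives two sets of per-rank files that can then be compared with meld.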

If there are differences in the output from the two diag routines, then we need to look at the time_vary group (or the dycore if GFDL-MP is used, because of the saturation adjustment). Maybe turn off the saturation adjustment for the next set of runs to see if the differences go away. If they do, it's the fv_sat_adj calls; if they don't, it's in the time_vary group.

If there are no differences at all in the output from the two diag routines, then it's in the radiation, physics, or stochastics group. In this case, add the same diagtoscreen routines immediately after the GFS_stateout_reset call in the SDF; this tells you whether it is the radiation group or not. If it is not the radiation group, add the calls at the beginning of the GFS_stochastics group, which tells you whether it is in the physics or the stochastics group.

Once we know the group, use an iterative approach (bisect the group / block in the group) until the scheme is identified.
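
To make the SDF placement concrete, here is a minimal sketch of how the two debugging schemes could be inserted into a suite definition file. The suite name and the exact name of the rad-reset interstitial are placeholders based on typical FV3 CCPP suites and should be checked against the SDF actually in use:

    <!-- excerpt of a suite definition file; only the relevant schemes are shown -->
    <suite name="FV3_GFS_v16" version="1">
      <group name="radiation">
        <subcycle loop="1">
          <scheme>GFS_suite_interstitial_rad_reset</scheme>
          <!-- debugging output inserted right after the rad reset -->
          <scheme>GFS_diagtoscreen</scheme>
          <scheme>GFS_interstitialtoscreen</scheme>
          <!-- ... remaining radiation schemes ... -->
        </subcycle>
      </group>
    </suite>

The same two scheme entries can later be moved after the GFS_stateout_reset call or to the start of the GFS_stochastics group to narrow the search down, as described above.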

Moorthi, I can do all this for you if you like. What I would need is

  • a complete, self-contained run directory with a job submission script where the only thing I have to do is link the executable and possibly a modulefile
  • a complete source code directory with instructions on how you compile the code
  • on a machine that is not WCOSS, unfortunately

Hope this helps.

@junwang-noaa
Collaborator Author

I believe we still need to test restart reproducibility on the standalone global FV3 after PR #304. I haven't seen results from a 24->48 hr restart test yet.

@junwang-noaa reopened this Dec 2, 2020
@climbfuji
Collaborator

@SMoorthi-emc @junwang-noaa has this issue been fixed with today's merge of #304?

@junwang-noaa
Collaborator Author

I am working on the regional inline post and haven't had time to run the tests yet.

@SMoorthi-emc
Contributor

I don't know. In my own tests, I have to run in REPRO mode with CCPP.

@climbfuji
Collaborator

Closed via #325.

pjpegion pushed a commit to NOAA-PSL/ufs-weather-model.p7b that referenced this issue Jul 20, 2021
* fix lam post uninitialized fields
* remove spval in openmp
* add more uninitialized post fields
* update suite_FV3_GFS_v15_thompson_mynn_lam3km.xml to use mynnsfc_wrapper instead of sfc_diff