PBL, Convection and Microphysics Update for HR2 #1731

Merged
merged 18 commits into from
May 9, 2023

Conversation

Qingfu-Liu
Collaborator

@Qingfu-Liu Qingfu-Liu commented Apr 29, 2023

Description

This PR is the same as PR#1723. It was created because ccpp-physics has been updated for HR2 (PR#65). The ccpp-physics update in PR#65 includes updates to the PBL scheme, the shallow convection scheme, the deep convection scheme, and the microphysics scheme. The physics changes improve hurricane forecasts and forecast CAPE values.

Input data additions/changes

  • No changes are expected to input data.
  • There will be new input data.
  • Input data will be updated.

Anticipated changes to regression tests:

  • No changes are expected to any regression test.
  • Changes are expected to the following tests:

Results change for the control, cpld, and hafs cases; see the list below in the conversation.

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Combined with PR's (If Applicable):

Commit Queue Checklist:

Linked PR's and Issues:

ufs-community/ccpp-physics#65
NOAA-EMC/fv3atm#653

Testing Day Checklist:

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.

Testing Log (for CM's):

  • RDHPCS
    • Intel
      • Hera
      • Orion
      • Jet
      • Gaea
      • Cheyenne
    • GNU
      • Hera
      • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

@Qingfu-Liu
Collaborator Author

PR#1731 was created to replace the closed PR#1723. It merges the ufs-community/ufs-weather-model develop branch into the update_HR2 branch.

@Qingfu-Liu
Collaborator Author

I am having a problem using ecflow to run the rt.sh tests. Does anyone know how to fix it?

  • mkdir /scratch1/NCEPDEV/global/Qingfu.Liu/git/ufs-weather-model/tests/lock
    mkdir: cannot create directory '/scratch1/NCEPDEV/global/Qingfu.Liu/git/ufs-weather-model/tests/lock': File exists
  • echo 'Only one instance of rt.sh can be running at a time'
    Only one instance of rt.sh can be running at a time
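
The log above is consistent with a directory-based lock: `mkdir` either creates the lock directory or fails because it already exists. The following is a minimal sketch of that pattern, not the code in rt.sh; the function names `acquire_lock` and `release_lock` are hypothetical. If a previous run ended without removing its lock, the directory would remain and produce the "File exists" message, so one plausible remedy (an assumption, not confirmed in this thread) is to remove the stale directory after checking that no other rt.sh is running.

```shell
#!/bin/sh
# Sketch of a mkdir-based lock. Function names are hypothetical and are
# not taken from rt.sh.

# acquire_lock DIR
# mkdir is atomic: it either creates DIR (lock taken, status 0) or fails
# because DIR already exists (lock held or left behind, status 1).
acquire_lock() {
  if mkdir "$1" 2>/dev/null; then
    return 0
  fi
  echo "Only one instance of rt.sh can be running at a time" >&2
  return 1
}

# release_lock DIR
# rmdir removes only an empty directory, so it cannot delete other data.
release_lock() {
  rmdir "$1" 2>/dev/null
}

# Example use inside a script:
#   acquire_lock "$PWD/lock" || exit 1
#   ... run tests ...
#   release_lock "$PWD/lock"
```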

@jkbk2004
Collaborator

jkbk2004 commented May 2, 2023

@Qingfu-Liu As an example, a general procedure to sync your branch is:

  1. Clone your fork and checkout branch that needs syncing:
    git clone https://github.com/JoeSmith-NOAA/ufs-weather-model.git ./fork
    cd fork
    git checkout feature/branch
  2. Add upstream info to your clone so it knows where to merge from. The term “upstream” refers to the authoritative repository from which the fork was created.
    git remote add upstream https://github.com/ufs-community/ufs-weather-model.git
  3. Fetch upstream information into clone:
    git fetch upstream
  4. Later on you can update your fork remote information by doing the following command:
    git remote update
  5. Merge upstream feature/branch (or develop) into your branch: git merge upstream/feature/branch (or develop)
  6. Resolve any conflicts and perform any needed “add”s or “commit”s for conflict resolution.
  7. Push the merged copy back up to your fork (origin):
    git push origin feature/branch

For submodule components, you can sync in a similar way.

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 3, 2023 via email

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 3, 2023 via email

@jkbk2004
Collaborator

jkbk2004 commented May 3, 2023

@DeniseWorthen
Collaborator

DeniseWorthen commented May 3, 2023

What is going on w/ Cheyenne.gnu?

baseline dir = /glade/scratch/epicufsrt/GMTB/ufs-weather-model/RT/NEMSfv3gfs/develop-20230430/GNU/cpld_control_p8
working dir  = /glade/scratch/jongkim/rt-1731-gnu/jongkim/FV3_RT/rt_43398/cpld_control_p8
Checking test 051 cpld_control_p8 results ....
 Comparing sfcf021.tile1.nc ............ALT CHECK......ERROR

@DeniseWorthen
Collaborator

I also don't understand how all these coupled tests are giving "alt check" OK. That means generally that the data matches, but the metadata doesn't. If this is actually changing physics, then how is that happening? Are the tests not capturing the changes in the physics?

@junwang-noaa
Collaborator

@DeniseWorthen RT first uses "cmp" to compare the two files. If the files differ and they are netCDF files, it then uses "nccmp" (or compare_ncfile.py) to compare all the data fields in the two netCDF files.
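
The two-stage check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in rt.sh: the function name `compare_output` and the `NC_COMPARE` override are hypothetical, the fallback is assumed to be `nccmp -d` (data-only comparison), and any non-zero status from the fallback is simply reported as a failure. The output strings mirror the ones in the logs quoted in this thread.

```shell
#!/bin/sh
# Sketch of a byte-wise check with a netCDF fallback.
# compare_output and NC_COMPARE are hypothetical names for illustration.

# compare_output BASELINE_FILE NEW_FILE
compare_output() {
  # Stage 1: byte-for-byte comparison; -s suppresses output.
  if cmp -s "$1" "$2"; then
    echo "OK"
    return 0
  fi
  # Stage 2: for netCDF files only, compare the data fields, so that a
  # difference confined to metadata does not fail the test.
  case "$1" in
    *.nc)
      # NC_COMPARE is left unquoted on purpose so "nccmp -d" splits
      # into a command and its option.
      if ${NC_COMPARE:-nccmp -d} "$1" "$2" >/dev/null 2>&1; then
        echo "ALT CHECK......OK"
        return 0
      fi
      echo "ALT CHECK......NOT OK"
      return 1
      ;;
  esac
  echo "NOT OK"
  return 1
}
```

With this flow, "ALT CHECK......OK" means the files differ byte-wise but the fallback comparison found the data fields equal.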

@DeniseWorthen
Collaborator

DeniseWorthen commented May 3, 2023

@junwang-noaa I thought the changes in this PR would change the actual forecast results. So I would expect the comparisons to just fail because the data was different.

baseline dir = /glade/scratch/epicufsrt/GMTB/ufs-weather-model/RT/NEMSfv3gfs/develop-20230426/INTEL/cpld_control_p8
working dir  = /glade/scratch/jongkim/rt-1731-intel/jongkim/FV3_RT/rt_10944/cpld_control_p8
Checking test 003 cpld_control_p8 results ....
 Comparing sfcf021.tile1.nc ............ALT CHECK......OK
 Comparing sfcf021.tile2.nc ............ALT CHECK......OK
 Comparing sfcf021.tile3.nc ............ALT CHECK......OK
 Comparing sfcf021.tile4.nc ............ALT CHECK......OK

For Cheyenne.gnu, I'm not sure what is happening with the sfcf021.tile1.nc file. It is giving "error", not "not ok".

@junwang-noaa
Collaborator

I see. @Qingfu-Liu @jkbk2004 Can you check the results? Is this run coming from Qingfu's branch? We need a new baseline.

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 3, 2023 via email

@BrianCurtis-NOAA
Collaborator

@DeniseWorthen @junwang-noaa I believe the "ERROR" outputs appear because the script and/or command itself fails, not because the comparison produced a result. So we should manually run the same steps to get better information on which part of the command/script is failing.

@DeniseWorthen
Collaborator

@BrianCurtis-NOAA Exactly. The logs always need to be checked for anomalous results.

On cheyenne.intel, I can compare the baseline vs the sfcf021.tile3.nc file from the INTEL case using cprnc and they appear completely different. Why is the comparison appearing as "alt check OK"?

@DeniseWorthen
Collaborator

The issue seems to be the BL date:

baseline dir = /glade/scratch/epicufsrt/GMTB/ufs-weather-model/RT/NEMSfv3gfs/develop-20230426/INTEL/cpld_control_p8
working dir  = /glade/scratch/jongkim/rt-1731-intel/jongkim/FV3_RT/rt_10944/cpld_control_p8

The current BL date is 0430, right?

@BrianCurtis-NOAA
Collaborator

BrianCurtis-NOAA commented May 3, 2023

@Qingfu-Liu
Update:

  • Talk to hera helpdesk to get access to ecflow. Then:
  • please run intel RT on Hera with ./rt.sh -e > rt.out 2>&1 &
  • please run gnu RT on Hera with export RT_COMPILER=gnu && ./rt.sh -e > rt.out 2>&1 &

@jkbk2004
Collaborator

jkbk2004 commented May 3, 2023

@Qingfu-Liu If ecflow is the issue, let me run hera.gnu.

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 3, 2023 via email

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 3, 2023 via email

@jkbk2004
Collaborator

jkbk2004 commented May 3, 2023

Sounds like we still have an exit-code issue with nccmp. The current handling of d=$? is 1: ERROR, 2 (other than 0): NOT OK, 0: OK. But when results change, it exits with 1. @DusanJovic-NOAA FYI

@jkbk2004
Collaborator

jkbk2004 commented May 3, 2023

Exit code 0 is returned for identical files, 1 for different files, and 2 for a fatal error.
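
A mapping consistent with the exit codes quoted above would treat 1 as a changed result and anything else non-zero as a failure of the command itself. The sketch below only illustrates that mapping; the function name `classify_compare_status` is hypothetical, and this is not the actual fix applied to the regression-test scripts.

```shell
#!/bin/sh
# Sketch: translate a comparison tool's exit status into an RT label,
# assuming 0 = identical, 1 = different, 2 (or anything else) = fatal error.
# classify_compare_status is a hypothetical name for illustration.

classify_compare_status() {
  case "$1" in
    0) echo "OK" ;;       # files match
    1) echo "NOT OK" ;;   # comparison ran and found differences
    *) echo "ERROR" ;;    # the comparison command itself failed
  esac
}

# Example use, capturing the status right after the comparison command:
#   some_compare_command baseline.nc new.nc; d=$?
#   classify_compare_status "$d"
```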

@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 8, 2023 via email

@BrianCurtis-NOAA
Collaborator

regional_atmaq_faster failed on both Acorn and WCOSS2. I have disabled it on both and filed issue #1742.

@BrianCurtis-NOAA
Collaborator

Running comparison RT's now on Acorn/Cactus.

@BrianCurtis-NOAA
Collaborator

BrianCurtis-NOAA commented May 9, 2023

Acorn: I left the rsync running overnight so I could run comparisons this AM, but it ran into space issues, which are now resolved. I have resumed the rsync of baselines and I'll start comparisons shortly after it completes.

@jkbk2004
Collaborator

jkbk2004 commented May 9, 2023

All tests are done. We can start the merging process.

@jkbk2004
Collaborator

jkbk2004 commented May 9, 2023

@Qingfu-Liu The fv3 PR was merged. The correct hash is NOAA-EMC/fv3atm@160b422. Can you update the hash and revert the change in .gitmodules?

@jkbk2004 jkbk2004 requested review from DusanJovic-NOAA and BrianCurtis-NOAA and removed request for BrianCurtis-NOAA May 9, 2023 18:33
@jkbk2004 jkbk2004 self-requested a review May 9, 2023 19:24
@Qingfu-Liu
Collaborator Author

Qingfu-Liu commented May 9, 2023 via email

Labels
  • Baseline Updates: Current baselines will be updated.
  • jenkins-ci: Jenkins CI: ORT build/test on docker container
  • Ready for Commit Queue: The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
8 participants