Replace gfs_cyc with an interval #2928
More manual testing to come, but ready enough to get eyes on it.
To facilitate longer and more flexible GFS cadences, the `gfs_cyc` variable is replaced with a specified interval. Up front, this is reflected in a change in the arguments for setup_exp to:

```
--interval <n_hours>
```

Where `n_hours` is the interval (in hours) between gfs forecasts. `n_hours` must be a multiple of 6. If 0, no gfs will be run (only gdas; only valid for cycled mode). The default value is 6 (every cycle).

In cycled mode, there is an additional argument to control which cycle will be the first gfs cycle:

```
--sdate_gfs <YYYYMMDDHH>
```

The default if not provided is `--idate` + 6h (first full cycle).

As part of this change, some validation of the dates has been added. `--edate` has also been made optional and defaults to `--idate` if not provided.

During `config.base` template-filling, `INTERVAL_GFS` (renamed from `STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as `--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as `gfs_cyc` was being used downstream by verif-global. That has been removed, and instead workflow will be responsible for only running metp on the correct cycles. This also removes "do nothing" metp tasks that exit immediately, because only the last GFS cycle in a day would actually process verification.

Now, metp has its own cycledef and always runs at 18z, regardless of whether gfs is running at 18z or not. This is simpler than trying to determine the last gfs cycle of a day when it could change from day to day. To facilitate this change, support for the undocumented rocoto dependency tag `taskvalid` is added, as the metp task needs to know whether the cycle has a gfsarch task or not. metp will trigger on gfsarch completing (as before), or gdasarch completing if there is no gfsarch.

metp tasks are no longer generated for forecast-only, as the pgbanl files (copies of the 1p00 pgbanl files) are not generated for f-o anyway. If metp is needed for f-o, additional work will be needed.
Additionally, a couple EE2 issues with the metp job are resolved (even though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Depends on NOAA-EMC/EMC_verif-global#137

Resolves NOAA-EMC#260
Refs NOAA-EMC#1299
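The argument rules in the description above (multiple of 6, 0 meaning gdas only, optional `--edate`, default first gfs cycle) can be sketched roughly as follows. This is only an illustration: `resolve_gfs_schedule`, its parameters, and the mode labels are hypothetical names, the forecast-only default for the first gfs cycle is an assumption, and the exact checks performed by setup_exp are not shown in this PR text.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def resolve_gfs_schedule(
    idate: datetime,
    interval: int = 6,
    sdate_gfs: Optional[datetime] = None,
    edate: Optional[datetime] = None,
    mode: str = "cycled",  # hypothetical label; "forecast-only" is the other one
) -> Tuple[datetime, datetime, timedelta]:
    """Hypothetical helper mirroring the rules stated in the description."""
    # n_hours must be a multiple of 6; 0 means "no gfs, gdas only".
    if interval < 0 or interval % 6 != 0:
        raise ValueError("interval must be a non-negative multiple of 6 hours")
    if interval == 0 and mode != "cycled":
        raise ValueError("interval 0 (gdas only) is only valid in cycled mode")

    # --edate is optional and defaults to --idate.
    if edate is None:
        edate = idate
    if edate < idate:
        raise ValueError("edate must not be earlier than idate")

    # In cycled mode the first gfs cycle defaults to idate + 6h (first full cycle).
    # Assumption: in forecast-only mode the first gfs cycle is idate itself.
    if sdate_gfs is None:
        sdate_gfs = idate + timedelta(hours=6) if mode == "cycled" else idate
    if sdate_gfs < idate:
        raise ValueError("sdate_gfs must not be earlier than idate")

    return edate, sdate_gfs, timedelta(hours=interval)
```

With only `idate` given, this returns `edate == idate`, a first gfs cycle six hours later, and a 6-hour step, matching the defaults listed above.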
To avoid running metp on days where there is no gfs, the cycledefs are adjusted somewhat. First, if the interval is >= 24h, the metp cycledef will be identical to gfs. If the interval is < 24h, it remains 18z every day (except the last). Second, a `last_gfs` cycledef is added so metp will run for the last gfs cycle even if there is no gdas cycle at 18z that day. This required computing the real gfs end date to use as the last cycle.
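As a rough sketch of the cycledef adjustment described above: the real gfs end date is the last cycle that lands on the gfs interval at or before the end date, and the metp cycles follow from it. The function names are hypothetical, and the handling of "18z every day (except the last)" is one reading of the comment, not the workflow's actual code.

```python
from datetime import datetime, timedelta
from typing import List


def last_gfs_cycle(sdate_gfs: datetime, edate: datetime, interval: timedelta) -> datetime:
    """Last gfs cycle at or before edate, stepping from sdate_gfs by interval."""
    if interval <= timedelta(0) or edate < sdate_gfs:
        raise ValueError("need a positive interval and edate >= sdate_gfs")
    steps = (edate - sdate_gfs) // interval  # timedelta // timedelta gives an int
    return sdate_gfs + steps * interval


def metp_cycles(sdate_gfs: datetime, edate: datetime, interval: timedelta) -> List[datetime]:
    """Cycles on which metp would run under the rules sketched above."""
    last_gfs = last_gfs_cycle(sdate_gfs, edate, interval)

    # Interval of 24h or more: metp uses the same cycles as gfs.
    if interval >= timedelta(hours=24):
        steps = (last_gfs - sdate_gfs) // interval
        return [sdate_gfs + k * interval for k in range(steps + 1)]

    # Interval under 24h: 18z every day before the last gfs cycle ...
    first_18z = sdate_gfs.replace(hour=18, minute=0, second=0, microsecond=0)
    if first_18z < sdate_gfs:
        first_18z += timedelta(days=1)
    cycles = []
    current = first_18z
    while current < last_gfs:
        cycles.append(current)
        current += timedelta(days=1)

    # ... plus the last gfs cycle itself, whatever hour it falls on.
    cycles.append(last_gfs)
    return cycles
```

For example, a 12-hourly gfs starting at 2024-01-01 06z with an end date of 2024-01-03 06z yields metp at 18z on the first two days and at 06z on the last.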
Ready for review. Also encourage third-party testing in excess of what CI will cover due to the nature of the change. I ran some different multi-day experiments myself, but outside validation would be appreciated.
Automated global-workflow Testing Results:
All manual CI tests passed on WCOSS:
Thanks for all the work on this @WalterKolczynski-NOAA!
Looks good, thanks @WalterKolczynski-NOAA !
date2 = sdate_gfs + interval_gfs
if date2 <= edate_gfs:
    date2_gfs_str = date2_gfs.strftime("%Y%m%d%H%M")
Just noticed that `date2_gfs` should be `date2`. I'll fix this in #2943.
I'm surprised PEP8 didn't complain about this.
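For reference, a minimal sketch of how the reviewed lines would read with the correction applied. It assumes `sdate_gfs` and `edate_gfs` are `datetime` objects and `interval_gfs` is a `timedelta`, which the excerpt does not confirm; the sample values are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical sample values; in the workflow these come from the experiment setup.
sdate_gfs = datetime(2024, 1, 1, 6)
edate_gfs = datetime(2024, 1, 3, 0)
interval_gfs = timedelta(hours=12)

# Second gfs cycle, one interval after the first.
date2 = sdate_gfs + interval_gfs
date2_gfs_str = None
if date2 <= edate_gfs:
    # Format date2 itself; the reviewed line referenced date2_gfs instead.
    date2_gfs_str = date2.strftime("%Y%m%d%H%M")
```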
To facilitate longer and more flexible GFS cadences, the `gfs_cyc` variable is replaced with a specified interval. Up front, this is reflected in a change in the arguments for setup_exp to:

```
--interval <n_hours>
```

Where `n_hours` is the interval (in hours) between gfs forecasts. `n_hours` must be a multiple of 6. If 0, no gfs will be run (only gdas; only valid for cycled mode). The default value is 6 (every cycle). (This is a change from current behavior of 24.)

In cycled mode, there is an additional argument to control which cycle will be the first gfs cycle:

```
--sdate_gfs <YYYYMMDDHH>
```

The default if not provided is `--idate` + 6h (first full cycle). This is the same as current behavior when `gfs_cyc` is 6, but may vary from current behavior for other cadences.

As part of this change, some validation of the dates has been added. `--edate` has also been made optional and defaults to `--idate` if not provided.

During `config.base` template-filling, `INTERVAL_GFS` (renamed from `STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as `--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as `gfs_cyc` was being used downstream by verif-global. That has been removed, and instead workflow will be responsible for only running metp on the correct cycles. This also removes "do nothing" metp tasks that exit immediately, because only the last GFS cycle in a day would actually process verification.

Now, metp has its own cycledef and will (a) always run at 18z, regardless of whether gfs is running at 18z or not, if the interval is less than 24h; (b) use the same cycledef as gfs if the interval is 24h or greater. This is simpler than trying to determine the last gfs cycle of a day when it could change from day to day. To facilitate this change, support for the undocumented rocoto dependency tag `taskvalid` is added, as the metp task needs to know whether the cycle has a gfsarch task or not. metp will trigger on gfsarch completing (as before), or look backwards for the last gfsarch to exist.

Additionally, a couple EE2 issues with the metp job are resolved (even though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Also corrects some dependency issues with the extractvars job for replay and the replay CI test.

Depends on NOAA-EMC/EMC_verif-global#137

Resolves NOAA-EMC#260
Refs NOAA-EMC#1299

---------

Co-authored-by: David Huber <david.huber@noaa.gov>
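The "look backwards for the last gfsarch" behaviour can be pictured with a small date calculation: given a metp cycle, find the most recent gfs cycle at or before it. This is only an illustration in plain Python; `latest_gfs_cycle` is a hypothetical name, the sketch ignores the experiment end date, and the actual dependency is expressed in the rocoto XML rather than computed this way.

```python
from datetime import datetime, timedelta
from typing import Optional


def latest_gfs_cycle(
    metp_cycle: datetime, sdate_gfs: datetime, interval: timedelta
) -> Optional[datetime]:
    """Most recent gfs cycle at or before metp_cycle, or None if gfs has not started."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    if metp_cycle < sdate_gfs:
        return None
    # Whole number of intervals elapsed since the first gfs cycle.
    steps = (metp_cycle - sdate_gfs) // interval
    return sdate_gfs + steps * interval
```

With a 12-hourly gfs starting at 00z, an 18z metp cycle would look back to the 12z gfs cycle of the same day; with a 6-hourly gfs it would find the 18z cycle itself.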
# Description

This PR brings recent changes from the develop branch to the GEFS reforecast branch.

This PR updates the GEFS reforecast branch to develop hash ac3cde5 (10/11/2024). This version of global-workflow uses the ufs-weather-model hash [6a4e09e](https://github.com/ufs-community/ufs-weather-model/tree/6a4e09e94773ffa39ce7ab6a54a885efada91f21) (9/9/2024).

Furthermore, this PR ensures the following adjustments for the reforecast:
- [x] Speed up rocoto by grouping post job
- [x] Optimize PE configuration
- [x] Remove duplicate OCNSPPT and EPBL settings
- [x] Set restart_interval to fhmax
- [x] Turn off SHUM in config.efcs
- [x] Set FHMIN_WAV to 3 in config.base
- [x] Turn off ATM history file output
- [x] Change HMS=${cyc}0000 to HMS=030000 in Wavepostpnt script (#2788)
- [x] Include YYYYMMDDHH (PDY) in job name
- [x] Change CA seed based on case and cyc for control member and perturbed members
- [x] Fix post ensemble info
- [x] Add tob to ocean products (#2995)
- [x] Move PEVPR from b group to a group for atmos products (#2995)
- [x] Add option to download initial condition from HPSS
- [x] Add ability to download and stage replay analysis from AWS, which is needed for the repair_replay task
- [x] Add capability to run forecasts in 7-day intervals (#2928)
- [x] Update defaults.yaml so that many of the reforecast-specific settings can be used by default

# Type of change
- [ ] Bug fix (fixes something broken)
- [ ] New feature (adds functionality)
- [x] Maintenance (code refactor, clean-up, new CI test, etc.)

# Change characteristics
- Is this a breaking change (a change in existing functionality)? NO
- Does this change require a documentation update? NO
- Does this change require an update to any of the following submodules? NO
  - [ ] EMC verif-global
  - [ ] GDAS
  - [ ] GFS-utils
  - [ ] GSI
  - [ ] GSI-monitor
  - [ ] GSI-utils
  - [ ] UFS-utils
  - [ ] UFS-weather-model
  - [ ] wxflow

# How has this been tested?
This branch is being tested on WCOSS2. When testing has succeeded, this PR will be marked as ready for review.

# Checklist
- [ ] Any dependent changes have been merged and published
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have documented my code, including function, input, and output descriptions
- [ ] My changes generate no new warnings
- [ ] New and existing tests pass with my changes
- [ ] This change is covered by an existing CI test or a new one has been added
- [ ] I have made corresponding changes to the system documentation if necessary

---------

Co-authored-by: Wei Huang <wei.huang@noaa.gov>
Co-authored-by: Kate Friedman <kate.friedman@noaa.gov>
Co-authored-by: Cory Martin <cory.r.martin@noaa.gov>
Co-authored-by: Andrew.Tangborn <Andrew.Tangborn@noaa.gov>
Co-authored-by: Walter Kolczynski - NOAA <Walter.Kolczynski@noaa.gov>
Co-authored-by: AndrewEichmann-NOAA <58948505+AndrewEichmann-NOAA@users.noreply.github.com>
Co-authored-by: DavidBurrows-NCO <82525974+DavidBurrows-NCO@users.noreply.github.com>
Co-authored-by: AnningCheng-NOAA <48297505+AnningCheng-NOAA@users.noreply.github.com>
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Co-authored-by: Rahul Mahajan <aerorahul@users.noreply.github.com>
Co-authored-by: AntonMFernando-NOAA <167725623+AntonMFernando-NOAA@users.noreply.github.com>
Co-authored-by: BoCui-NOAA <53531984+BoCui-NOAA@users.noreply.github.com>
Co-authored-by: DavidNew-NOAA <134300700+DavidNew-NOAA@users.noreply.github.com>
Co-authored-by: Jeffrey Whitaker <jeffrey.s.whitaker@noaa.gov>
Co-authored-by: mingshichen-noaa <48537176+mingshichen-noaa@users.noreply.github.com>
Co-authored-by: Jiarui Dong <Jiarui.Dong@noaa.gov>
Co-authored-by: David Huber <david.huber@noaa.gov>
Co-authored-by: Guillaume Vernieres <guillaume.vernieres@gmail.com>
Co-authored-by: RussTreadon-NOAA <26926959+RussTreadon-NOAA@users.noreply.github.com>
Co-authored-by: Innocent Souopgui <162634017+InnocentSouopgui-NOAA@users.noreply.github.com>
Co-authored-by: Neil Barton <103681022+NeilBarton-NOAA@users.noreply.github.com>
* develop:
  - Remove WAFS files and references from `develop` (NOAA-EMC#3263)
  - fix intel stack version number on c5 (NOAA-EMC#3258)
  - Update gsi_monitor and ufs_utils hashes to recent hashes for C5/C6 build and run (NOAA-EMC#3252)
  - Enable DA cycling on gaea C5/C6 (NOAA-EMC#3255)
  - Copy post-processed sea ice increment for diagnostics (NOAA-EMC#3235)
  - Only run METplus in the 3Dvar tests (NOAA-EMC#3245)
  - Clone, build, and run C48_ATM and C48_S2SW on Gaea C5 and C6 (NOAA-EMC#3106)
  - Add echgres as a dependency only for RUN=enkfgdas, not enkfgfs (NOAA-EMC#3246)
  - Add domain level to wave gridded COM path (NOAA-EMC#3137)
  - CI JJOB Tests using CMake (NOAA-EMC#3214)
  - Make assorted updates to waves (NOAA-EMC#3190)
  - Move WCOSS2 LD_LIBRARY_PATH patches to load_ufsda_modules.sh (NOAA-EMC#3236)
  - Adding a gefs_arch task to GEFS workflow (NOAA-EMC#3211)
  - Add additional GEFS variables needed for AI/ML applications (NOAA-EMC#3221)
  - Add bmat task dependency to marine LETKF task (NOAA-EMC#3224)
  - Resolve bug with LMOD_TMOD_FIND_FIRST setting affecting build on WCOSS2 (NOAA-EMC#3229)
  - Reinstate product groups (NOAA-EMC#3208)
  - Additional fixes for downstream jobs (NOAA-EMC#3187)
  - Turn IAU off during staging job for cold start experiments (NOAA-EMC#3215)
  - Update the gdas.cd hash and enable GDASApp to run on WCOSS2 (NOAA-EMC#3220)
  - Update upload-artifact to v4 (NOAA-EMC#3216)
  - Prevent duplicate case generation in generate_workflows.sh (NOAA-EMC#3217)
  - Update g-w to cycle with C1152 ATM (NOAA-EMC#3206)
  - Separate use of initial increment/perturbation file from REPLAY/+03 ICs (NOAA-EMC#3119)
  - Update gsi_enkf hash and gsi_ver (NOAA-EMC#3207)
  - Remove cpus-per-task from APRUN_OCNANALECEN on WCOSS2 (NOAA-EMC#3212)
  - Remove 5WAVH from AWIPS GRIB2 parm files (NOAA-EMC#3146)
  - Remove multi-grid wave support (NOAA-EMC#3188)
  - Add echgres as a dependency for earc (NOAA-EMC#3202)
  - Ensure OCNRES and ICERES have 3 digits in the archive script (NOAA-EMC#3199)
  - Set runtime shell requirements within Jenkins Pipeline (NOAA-EMC#3171)
  - Add efcs and epos to ufs_hybatm xml (NOAA-EMC#3192) (NOAA-EMC#3193)
  - Fix GEFS and SFS compile flags in build_all.sh (NOAA-EMC#3197)
  - Remove early-cycle EnKF forecast (NOAA-EMC#3185)
  - Fix mod_icec bug in atmos_prod (NOAA-EMC#3167)
  - Create compute build option (NOAA-EMC#3186)
  - Support global-workflow using Rocky 8 on CSPs (NOAA-EMC#2998)
  - Change orog gravity wave drag scheme for grid sizes less than 10km (NOAA-EMC#3175)
  - Switch snow DA to use 2DVar for deterministic and ensemble mean (NOAA-EMC#3163)
  - Update compression options for GEFS history files (NOAA-EMC#3184)
  - Update compression options for high res history files (NOAA-EMC#3178)
  - Turn DO_TEST_MODE off (NOAA-EMC#3177)
  - Hotfix for gdas_arch div/0 (NOAA-EMC#3169)
  - Allow building of the ufs-weather-model, WW3 pre/post execs for GFS, GEFS, SFS in the same clone of global-workflow (NOAA-EMC#3098)
  - Switch Aerosol DA to use JCB and Jedi class (NOAA-EMC#3125)
  - Update ufs-weather-model to 2024-12-06 commit (NOAA-EMC#3145)
  - Enable traditional threading as an option (NOAA-EMC#3149)
  - Update HPC_ACCOUNT on Hercules to fv3-cpu (NOAA-EMC#3164)
  - Turn C96C48_ufs_hybatmDA and C48mx500_3DVarAOWCDA into a regression test (NOAA-EMC#3120)
  - Update GSI analysis jobs to use COMIN/COMOUT (NOAA-EMC#3092)
  - Update HPC Tier Definitions (NOAA-EMC#3138)
  - Add marine hybrid envar (NOAA-EMC#3041)
  - Archive the experiment directory along with git status/diff output (NOAA-EMC#3105)
  - Use stochastic restart patterns on rerun (NOAA-EMC#3077)
  - Point Jenkinsfile back to CI/ (NOAA-EMC#3139)
  - Fix wave restart for cold start and add ic version file (NOAA-EMC#3112)
  - Allow users to override the default account at setup time (NOAA-EMC#3127)
  - Refactor gridded wave post (NOAA-EMC#3014)
  - Update docs related to NOAA CSPs (NOAA-EMC#3043)
  - Allow APP to differ between RUNs (NOAA-EMC#2943)
  - Run one executable for soca2cice (instead of two) (NOAA-EMC#3118)
  - Speed up GSI analysis jobs in CI testing (NOAA-EMC#3115)
  - Make aerosol output frequency variable (NOAA-EMC#2982)
  - Add new stations to GFS BUFR sounding products (NOAA-EMC#3107)
  - JCB-based obs+bias staging, Jedi class updates, and marine B-matrix refactoring (NOAA-EMC#2992)
  - Enable tapering of atm ens perts at the model top (NOAA-EMC#3097)
  - Update JGDAS ENKF POST job (NOAA-EMC#3090)
  - SFS Runs at C96mx100 (NOAA-EMC#2960)
  - Move machine-based options from config.base to host files (NOAA-EMC#3053)
  - Remove RUNDIRS before running CI cases to cover re-run events (NOAA-EMC#3076)
  - CI GitHub pipeline (hotfix) update for fetching repo name (NOAA-EMC#3084)
  - Update JGDAS ENKF ECEN job (NOAA-EMC#3050)
  - Update snow obs processing job (NOAA-EMC#3055)
  - Update to action workflow pipeline in default repo for development (NOAA-EMC#3062)
  - Update to action workflow pipeline in default repo for development (NOAA-EMC#3061)
  - Update workflow pipeline (NOAA-EMC#3060)
  - PW CI pipeline update5 ready for review so it can be merged and tested (NOAA-EMC#3059)
  - Revert "GitHub CI Pipeline update for debugging forked PR support" (NOAA-EMC#3057)
  - GitHub CI Pipeline update for debugging forked PR support (NOAA-EMC#3056)
  - Add more ocean variables for post-processing in GEFS (NOAA-EMC#2995)
  - Auto provisioning of PW clusters from GitHub CI added (NOAA-EMC#3051)
  - Fix the name of the TC tracker filenames in archive.py (NOAA-EMC#3030)
  - Make wxflow links static instead of from link_workflow (NOAA-EMC#3008)
  - Update global jdas enkf diag job with COMIN/COMOUT for COM prefix (NOAA-EMC#2959)
  - Add run and finalize methods to marine LETKF task (NOAA-EMC#2944)
  - Fix wave restarts and GEFS FHOUT/FHMAX (NOAA-EMC#3009)
  - Disabling hyper-threading (NOAA-EMC#2965)
  - GitHub Actions Pipeline Updates for Self-Hosted Runners on PW (NOAA-EMC#3018)
  - CI jekninsfile update hotfix (NOAA-EMC#3038)
  - Update gdas.cd (NOAA-EMC#2978)
  - Add ability to add tag to pslots with generate_workflows (NOAA-EMC#3036)
  - CI update to shell environment with HOMEgfs to HOME_GFS for systems that need the path (NOAA-EMC#3013)
  - Quick updated to Jenkins (health check) launch script (NOAA-EMC#3033)
  - Document the generate_workflows.sh script (NOAA-EMC#3028)
  - Replace gfs_cyc with an interval (NOAA-EMC#2928)
  - Hotfix: Fix generate_workflows.sh optional build flags (NOAA-EMC#3024)
  - Add a tool to run multiple YAML cases locally (NOAA-EMC#3004)
  - Hotfix: Correctly set overwrite option when specified (NOAA-EMC#3021)
Description
To facilitate longer and more flexible GFS cadences, the `gfs_cyc` variable is replaced with a specified interval. Up front, this is reflected in a change in the arguments for setup_exp to: `--interval <n_hours>`

Where `n_hours` is the interval (in hours) between gfs forecasts. `n_hours` must be a multiple of 6. If 0, no gfs will be run (only gdas; only valid for cycled mode). The default value is 6 (every cycle). (This is a change from current behavior of 24.)

In cycled mode, there is an additional argument to control which cycle will be the first gfs cycle: `--sdate_gfs <YYYYMMDDHH>`

The default if not provided is `--idate` + 6h (first full cycle). This is the same as current behavior when `gfs_cyc` is 6, but may vary from current behavior for other cadences.

As part of this change, some validation of the dates has been added. `--edate` has also been made optional and defaults to `--idate` if not provided.

During `config.base` template-filling, `INTERVAL_GFS` (renamed from `STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as `--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as `gfs_cyc` was being used downstream by verif-global. That has been removed, and instead workflow will be responsible for only running metp on the correct cycles. This also removes "do nothing" metp tasks that exit immediately, because only the last GFS cycle in a day would actually process verification.

Now, metp has its own cycledef and will (a) always run at 18z, regardless of whether gfs is running at 18z or not, if the interval is less than 24h; (b) use the same cycledef as gfs if the interval is 24h or greater. This is simpler than trying to determine the last gfs cycle of a day when it could change from day to day. To facilitate this change, support for the undocumented rocoto dependency tag `taskvalid` is added, as the metp task needs to know whether the cycle has a gfsarch task or not. metp will trigger on gfsarch completing (as before), or look backwards for the last gfsarch to exist.

Additionally, a couple EE2 issues with the metp job are resolved (even though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Depends on NOAA-EMC/EMC_verif-global#137

Resolves #260
Refs #1299