Enable forecast-only experiments on Hercules (NOAA-EMC#2128)
This adds forecast-only support for Hercules to the global workflow. Partially satisfies NOAA-EMC#1588.
Co-authored-by: Walter Kolczynski - NOAA <Walter.Kolczynski@noaa.gov>
DavidHuber-NOAA authored Dec 7, 2023
1 parent a29f751 commit e2664c0
Showing 27 changed files with 273 additions and 37 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -18,6 +18,7 @@ The `global-workflow` currently supports the following tier-1 machines:

* NOAA RDHPCS - Hera
* MSU HPC - Orion
* MSU HPC - Hercules
* NOAA's operational HPC - WCOSS2

Additionally, the following tier-2 machine is supported:
2 changes: 1 addition & 1 deletion docs/note_fixfield.txt
@@ -3,7 +3,7 @@ For EMC, the fix fields for running the model are not included in git repository
They are saved locally on all platforms

Hera: /scratch1/NCEPDEV/global/glopara/fix
Orion: /work/noaa/global/glopara/fix
Orion/Hercules: /work/noaa/global/glopara/fix
Jet: /mnt/lfs4/HFIP/hfv3gfs/glopara/git/fv3gfs/fix
S4: /data/prod/glopara/fix

4 changes: 2 additions & 2 deletions docs/source/clone.rst
@@ -32,7 +32,7 @@ For cycled (w/ data assimilation) use the `-g` option during checkout:

For coupled cycling (include new UFSDA) use the `-gu` options during checkout:

[Currently only available on Hera and Orion]
[Currently only available on Hera, Orion, and Hercules]

::

@@ -110,7 +110,7 @@ Or with the ``-g`` switch to include data assimilation (GSI) for cycling:
./checkout.sh -g

Or also with the ``-u`` switch to include coupled DA (via UFSDA):
[Currently only available on Hera and Orion]
[Currently only available on Hera, Orion, and Hercules]

::

2 changes: 1 addition & 1 deletion docs/source/components.rst
@@ -57,7 +57,7 @@ Data
Observation data, also known as dump data, is prepared in production and then archived in a global dump archive (GDA) for use by users when running cycled experiments. The GDA (identified as ``$DMPDIR`` in the workflow) is available on supported platforms and the workflow system knows where to find the data.

* Hera: /scratch1/NCEPDEV/global/glopara/dump
* Orion: /work/noaa/rstprod/dump
* Orion/Hercules: /work/noaa/rstprod/dump
* Jet: /mnt/lfs4/HFIP/hfv3gfs/glopara/dump
* WCOSS2: /lfs/h2/emc/global/noscrub/emc.global/dump
* S4: /data/prod/glopara/dump
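
The per-machine GDA locations listed above can be captured in a small lookup. This is only a minimal sketch and not the workflow's actual implementation: the function name `gda_path_for` is hypothetical, and the paths are copied from the list above.

```shell
# Hypothetical helper: map a machine name to its GDA ($DMPDIR) location.
# Paths come from the list above; the function name is illustrative only.
gda_path_for() {
    case "$1" in
        HERA)            echo "/scratch1/NCEPDEV/global/glopara/dump" ;;
        ORION|HERCULES)  echo "/work/noaa/rstprod/dump" ;;   # shared location
        JET)             echo "/mnt/lfs4/HFIP/hfv3gfs/glopara/dump" ;;
        WCOSS2)          echo "/lfs/h2/emc/global/noscrub/emc.global/dump" ;;
        S4)              echo "/data/prod/glopara/dump" ;;
        *)               echo "unknown machine: $1" >&2; return 1 ;;
    esac
}

DMPDIR="$(gda_path_for HERCULES)"
echo "DMPDIR=${DMPDIR}"
```

Grouping `ORION|HERCULES` in one `case` arm reflects that the documentation lists a single shared path for both machines.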
19 changes: 11 additions & 8 deletions docs/source/hpc.rst
@@ -19,6 +19,7 @@ HPC helpdesks
* WCOSS2: hpc.wcoss2-help@noaa.gov
* Hera: rdhpcs.hera.help@noaa.gov
* Orion: rdhpcs.orion.help@noaa.gov
* Hercules: rdhpcs.hercules.help@noaa.gov
* HPSS: rdhpcs.hpss.help@noaa.gov
* Gaea: oar.gfdl.help@noaa.gov
* S4: david.huber@noaa.gov
@@ -72,19 +73,21 @@ Version
It is advised to use Git v2+ when available. At the time of writing this documentation the default Git clients on the different machines were as noted in the table below. It is recommended that you check the default modules before loading recommended ones:

+---------+----------+---------------------------------------+
| Machine | Default | Recommended |
| Machine | Default | Recommended |
+---------+----------+---------------------------------------+
| Hera | v2.18.0 | default |
| Hera | v2.18.0 | default |
+---------+----------+---------------------------------------+
| Orion | v1.8.3.1 | **module load git/2.28.0** |
| Hercules | v2.31.1 | default |
+---------+----------+---------------------------------------+
| Jet | v2.18.0 | default |
| Orion | v1.8.3.1 | **module load git/2.28.0** |
+---------+----------+---------------------------------------+
| WCOSS2 | v2.26.2 | default or **module load git/2.29.0** |
| Jet | v2.18.0 | default |
+---------+----------+---------------------------------------+
| S4 | v1.8.3.1 | **module load git/2.30.0** |
| WCOSS2 | v2.26.2 | default or **module load git/2.29.0** |
+---------+----------+---------------------------------------+
| AWS PW | v1.8.3.1 | default
| S4 | v1.8.3.1 | **module load git/2.30.0** |
+---------+----------+---------------------------------------+
| AWS PW | v1.8.3.1 | default
+---------+----------+---------------------------------------+
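
Since the table above only records the defaults at the time of writing, it may help to check the default client directly before deciding whether to load a module. The following is a minimal sketch; the function name `git_is_v2_plus` is hypothetical, and it assumes `git --version` prints a string of the form `git version X.Y.Z`.

```shell
# Hypothetical helper: succeed when a "git version X.Y.Z" string reports
# a major version of 2 or higher.
git_is_v2_plus() {
    version="${1#git version }"   # strip the leading "git version " prefix
    major="${version%%.*}"        # keep everything before the first dot
    [ "${major}" -ge 2 ]
}

if command -v git >/dev/null 2>&1; then
    if git_is_v2_plus "$(git --version)"; then
        echo "Default git is v2+; no module load needed"
    else
        echo "Default git is older than v2; load a newer git module (see table)"
    fi
fi
```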

^^^^^^^^^^^^^
@@ -103,7 +106,7 @@ For the manage_externals utility functioning::
Fix: git config --global ssh.variant ssh

========================================
Stacksize on R&Ds (Hera, Orion, Jet, S4)
Stacksize on R&Ds (Hera, Orion, Hercules, Jet, S4)
========================================

Some GFS components, like the UPP, need an unlimited stacksize. Add the following setting into your appropriate .*rc file to support these components:
12 changes: 6 additions & 6 deletions docs/source/init.rst
@@ -49,7 +49,7 @@ Cold-start atmosphere-only cycled C96 deterministic C48 enkf (80 members) ICs ar
::

Hera: /scratch1/NCEPDEV/global/glopara/data/ICSDIR/C96C48
Orion: /work/noaa/global/glopara/data/ICSDIR/C96C48
Orion/Hercules: /work/noaa/global/glopara/data/ICSDIR/C96C48
WCOSS2: /lfs/h2/emc/global/noscrub/emc.global/data/ICSDIR/C96C48

Start date = 2021122018
@@ -108,7 +108,7 @@ Warm-start cycled w/ coupled (S2S) model C48 atmosphere C48 enkf (80 members) 5
::

Hera: /scratch1/NCEPDEV/global/glopara/data/ICSDIR/C48C48mx500
Orion: /work/noaa/global/glopara/data/ICSDIR/C48C48mx500
Orion/Hercules: /work/noaa/global/glopara/data/ICSDIR/C48C48mx500
WCOSS2: /lfs/h2/emc/global/noscrub/emc.global/data/ICSDIR/C48C48mx500
Jet: /lfs4/HFIP/hfv3gfs/glopara/data/ICSDIR/C48C48mx500

@@ -224,7 +224,7 @@ Forecast-only P8 prototype initial conditions are made available to users on sup

WCOSS2: /lfs/h2/emc/global/noscrub/emc.global/IC/COUPLED
HERA: /scratch1/NCEPDEV/climate/role.ufscpara/IC
ORION: /work/noaa/global/glopara/data/ICSDIR/prototype_ICs
ORION/HERCULES: /work/noaa/global/glopara/data/ICSDIR/prototype_ICs
JET: /mnt/lfs4/HFIP/hfv3gfs/glopara/data/ICSDIR/prototype_ICs
S4: /data/prod/glopara/coupled_ICs

@@ -253,7 +253,7 @@ Not yet supported. See :ref:`Manual Generation<manual-generation>` section below
---------------------
Forecast-only coupled
---------------------
Coupled initial conditions are currently only generated offline and copied prior to the forecast run. Prototype initial conditions will automatically be used when setting up an experiment as an S2SW app, there is no need to do anything additional. Copies of initial conditions from the prototype runs are currently maintained on Hera, Orion, Jet, and WCOSS2. The locations used are determined by ``parm/config/config.coupled_ic``. If you need prototype ICs on another machine, please contact Walter (Walter.Kolczynski@noaa.gov).
Coupled initial conditions are currently only generated offline and copied prior to the forecast run. Prototype initial conditions will automatically be used when setting up an experiment as an S2SW app; no additional action is needed. Copies of initial conditions from the prototype runs are currently maintained on Hera, Orion/Hercules, Jet, and WCOSS2. The locations used are determined by ``parm/config/config.coupled_ic``. If you need prototype ICs on another machine, please contact Walter (Walter.Kolczynski@noaa.gov).

.. _forecastonly-atmonly:

@@ -354,7 +354,7 @@ Then switch to a different tag or use the default branch (develop).
where ``$MACHINE`` is ``wcoss2``, ``hera``, or ``jet``.

.. note::
UFS-UTILS builds on Orion but due to the lack of HPSS access on Orion the ``gdas_init`` utility is not supported there.
UFS-UTILS builds on Orion/Hercules but due to the lack of HPSS access on Orion/Hercules the ``gdas_init`` utility is not supported there.

3. Configure your conversion:

@@ -380,7 +380,7 @@ Most users will want to adjust the following ``config`` settings for the current
where ``$MACHINE`` is currently ``wcoss2``, ``hera`` or ``jet``. Additional options will be available as support for other machines expands.

.. note::
UFS-UTILS builds on Orion but due to lack of HPSS access there is no ``gdas_init`` driver for Orion nor support to pull initial conditions from HPSS for the ``gdas_init`` utility.
UFS-UTILS builds on Orion/Hercules but due to lack of HPSS access there is no ``gdas_init`` driver for Orion/Hercules nor support to pull initial conditions from HPSS for the ``gdas_init`` utility.

Several small jobs will be submitted:

2 changes: 1 addition & 1 deletion docs/source/start.rst
@@ -23,7 +23,7 @@ Set up your experiment cron
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. note::
Orion currently only supports cron on Orion-login-1. Cron support for other login nodes is coming in the future.
Orion and Hercules currently only support cron on Orion-login-1 and Hercules-login-1, respectively. Cron support for other login nodes is coming in the future.

::

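
Because cron is only supported on the first login node of Orion and Hercules, a guard before installing a crontab entry can avoid a silent failure. This is a rough sketch under stated assumptions: the function name `on_cron_login_node` is hypothetical, and the `<machine>-login-1` hostname pattern is inferred from the note in `docs/source/start.rst`, so the exact hostnames on those systems may differ.

```shell
# Hypothetical check: succeed only for the first login node of Orion or
# Hercules. The hostname patterns are an assumption based on the docs note;
# an optional domain suffix (".something") is tolerated.
on_cron_login_node() {
    case "$1" in
        [Oo]rion-login-1|[Oo]rion-login-1.*)       return 0 ;;
        [Hh]ercules-login-1|[Hh]ercules-login-1.*) return 0 ;;
        *)                                         return 1 ;;
    esac
}

host="$(hostname 2>/dev/null || echo unknown)"
if on_cron_login_node "${host}"; then
    echo "${host}: cron is supported here"
else
    echo "${host}: not a cron-enabled Orion/Hercules login node"
fi
```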
69 changes: 69 additions & 0 deletions env/HERCULES.env
@@ -0,0 +1,69 @@
#! /usr/bin/env bash

if [[ $# -ne 1 ]]; then

echo "Must specify an input argument to set runtime environment variables!"
echo "argument can be any one of the following:"
echo "fcst post"
echo "Note: Hercules is only set up to run in forecast-only mode"
exit 1

fi

step=$1

export npe_node_max=40
export launcher="srun -l --export=ALL"
export mpmd_opt="--multi-prog --output=mpmd.%j.%t.out"

# Configure MPI environment
export MPI_BUFS_PER_PROC=2048
export MPI_BUFS_PER_HOST=2048
export MPI_GROUP_MAX=256
export MPI_MEMMAP_OFF=1
export MP_STDOUTMODE="ORDERED"
export KMP_AFFINITY=scatter
export OMP_STACKSIZE=2048000
export NTHSTACK=1024000000
#export LD_BIND_NOW=1

ulimit -s unlimited
ulimit -a

if [[ "${step}" = "waveinit" ]] || [[ "${step}" = "waveprep" ]] || [[ "${step}" = "wavepostsbs" ]] || \
[[ "${step}" = "wavepostbndpnt" ]] || [[ "${step}" = "wavepostpnt" ]] || [[ "${step}" == "wavepostbndpntbll" ]]; then

export CFP_MP="YES"
if [[ "${step}" = "waveprep" ]]; then export MP_PULSE=0 ; fi
export wavempexec=${launcher}
export wave_mpmd=${mpmd_opt}

elif [[ "${step}" = "fcst" ]]; then

export OMP_STACKSIZE=512M
if [[ "${CDUMP}" =~ "gfs" ]]; then
nprocs="npe_${step}_gfs"
ppn="npe_node_${step}_gfs"
# Fall back to the generic setting when no GFS-specific one is defined
[[ -z "${!ppn+0}" ]] && ppn="npe_node_${step}"
else
nprocs="npe_${step}"
ppn="npe_node_${step}"
fi
(( nnodes = (${!nprocs}+${!ppn}-1)/${!ppn} ))
(( ntasks = nnodes*${!ppn} ))
# With ESMF threading, the model wants to use the full node
export APRUN_UFS="${launcher} -n ${ntasks}"
unset nprocs ppn nnodes ntasks

elif [[ "${step}" = "upp" ]]; then

nth_max=$((npe_node_max / npe_node_upp))

export NTHREADS_UPP=${nth_upp:-1}
[[ ${NTHREADS_UPP} -gt ${nth_max} ]] && export NTHREADS_UPP=${nth_max}
export APRUN_UPP="${launcher} -n ${npe_upp} --cpus-per-task=${NTHREADS_UPP}"

elif [[ "${step}" = "atmos_products" ]]; then

export USE_CFP="YES" # Use MPMD for downstream product generation

fi
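
The `fcst` branch above rounds the requested task count up to a whole number of nodes and then asks the launcher for every slot on those nodes. The arithmetic can be sketched in isolation as follows. The function name `full_node_tasks` is hypothetical, it takes plain numbers instead of the indirect `${!nprocs}` / `${!ppn}` lookups used in the script, and the example values are illustrative rather than taken from any workflow config.

```shell
# Hypothetical helper mirroring the fcst branch: round a task count up to
# whole nodes, then return the total number of task slots on those nodes.
# $1 = total tasks requested, $2 = tasks per node
full_node_tasks() {
    nnodes=$(( ($1 + $2 - 1) / $2 ))   # ceiling division
    echo $(( nnodes * $2 ))
}

# Illustrative values only: 100 tasks at 40 per node needs 3 nodes.
ntasks="$(full_node_tasks 100 40)"
echo "srun -l --export=ALL -n ${ntasks}"
```

Requesting the rounded-up count matches the script's comment that, with ESMF threading, the model wants to use the full node.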
17 changes: 12 additions & 5 deletions modulefiles/module-setup.csh.inc
@@ -19,11 +19,18 @@ else if ( { test -d /scratch1 } ) then
source /apps/lmod/lmod/init/$__ms_shell
endif
module purge
else if ( { test -d /work } ) then
# We are on MSU Orion
if ( ! { module help >& /dev/null } ) then
source /apps/lmod/init/$__ms_shell
endif
else if ( { test -d /work } ) then
# We are on MSU Orion or Hercules
if ( { test -d /apps/other } ) then
# Hercules
set init_path = "/apps/other/lmod/lmod/init/$__ms_shell"
else
# Orion
set init_path = "/apps/lmod/lmod/init/$__ms_shell"
endif
if ( ! { module help >& /dev/null } ) then
source "${init_path}"
endif
module purge
else if ( { test -d /data/prod } ) then
# We are on SSEC S4
11 changes: 9 additions & 2 deletions modulefiles/module-setup.sh.inc
@@ -35,9 +35,16 @@ elif [[ -d /scratch1 ]] ; then
fi
module purge
elif [[ -d /work ]] ; then
# We are on MSU Orion
# We are on MSU Orion or Hercules
if [[ -d /apps/other ]] ; then
# Hercules
init_path="/apps/other/lmod/lmod/init/$__ms_shell"
else
# Orion
init_path="/apps/lmod/lmod/init/$__ms_shell"
fi
if ( ! eval module help > /dev/null 2>&1 ) ; then
source /apps/lmod/lmod/init/$__ms_shell
source "${init_path}"
fi
module purge
elif [[ -d /glade ]] ; then
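
The Orion/Hercules branch above distinguishes the two machines by whether `/apps/other` exists, since both mount `/work`. The selection logic can be tried outside either machine with the sketch below. The function name `lmod_init_dir` and its optional root-prefix argument are assumptions added purely so the check can run in a sandbox; the real script tests the absolute paths directly.

```shell
# Hypothetical sketch of the Orion/Hercules selection. The optional "root"
# argument is an illustrative addition so the logic can be tested anywhere.
lmod_init_dir() {
    root="${1:-}"
    if [ -d "${root}/apps/other" ]; then
        echo "${root}/apps/other/lmod/lmod/init"   # Hercules layout
    else
        echo "${root}/apps/lmod/lmod/init"         # Orion layout
    fi
}

# Demonstrate both outcomes in a throwaway directory.
sandbox="$(mktemp -d)"
echo "Orion-like:    $(lmod_init_dir "${sandbox}")"
mkdir -p "${sandbox}/apps/other"
echo "Hercules-like: $(lmod_init_dir "${sandbox}")"
rm -rf "${sandbox}"
```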
46 changes: 46 additions & 0 deletions modulefiles/module_base.hercules.lua
@@ -0,0 +1,46 @@
help([[
Load environment to run GFS on Hercules
]])

spack_stack_ver=(os.getenv("spack_stack_ver") or "None")
spack_env=(os.getenv("spack_env") or "None")
prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-" .. spack_stack_ver .. "/envs/" .. spack_env .. "/install/modulefiles/Core")

load(pathJoin("stack-intel", os.getenv("stack_intel_ver")))
load(pathJoin("stack-intel-oneapi-mpi", os.getenv("stack_impi_ver")))
load(pathJoin("python", os.getenv("python_ver")))

-- TODO load NCL once the SAs remove the 'depends_on' statements within it
-- NCL is a static installation and does not depend on any libraries
-- but as is will load, among others, the system netcdf-c/4.9.0 module
--load(pathJoin("ncl", os.getenv("ncl_ver")))
load(pathJoin("jasper", os.getenv("jasper_ver")))
load(pathJoin("libpng", os.getenv("libpng_ver")))
load(pathJoin("cdo", os.getenv("cdo_ver")))

load(pathJoin("hdf5", os.getenv("hdf5_ver")))
load(pathJoin("netcdf-c", os.getenv("netcdf_c_ver")))
load(pathJoin("netcdf-fortran", os.getenv("netcdf_fortran_ver")))

load(pathJoin("nco", os.getenv("nco_ver")))
load(pathJoin("prod_util", os.getenv("prod_util_ver")))
load(pathJoin("grib-util", os.getenv("grib_util_ver")))
load(pathJoin("g2tmpl", os.getenv("g2tmpl_ver")))
load(pathJoin("gsi-ncdiag", os.getenv("gsi_ncdiag_ver")))
load(pathJoin("crtm", os.getenv("crtm_ver")))
load(pathJoin("bufr", os.getenv("bufr_ver")))
load(pathJoin("wgrib2", os.getenv("wgrib2_ver")))
load(pathJoin("py-netcdf4", os.getenv("py_netcdf4_ver")))
load(pathJoin("py-pyyaml", os.getenv("py_pyyaml_ver")))
load(pathJoin("py-jinja2", os.getenv("py_jinja2_ver")))

setenv("WGRIB2","wgrib2")
setenv("UTILROOT",(os.getenv("prod_util_ROOT") or "None"))

prepend_path("MODULEPATH", pathJoin("/work/noaa/global/glopara/git/prepobs/feature-GFSv17_com_reorg_log_update/modulefiles"))
load(pathJoin("prepobs", os.getenv("prepobs_run_ver")))

prepend_path("MODULEPATH", pathJoin("/work/noaa/global/glopara/git/Fit2Obs/v" .. (os.getenv("fit2obs_ver") or "None"), "modulefiles"))
load(pathJoin("fit2obs", os.getenv("fit2obs_ver")))

whatis("Description: GFS run environment")
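
The modulefile above builds its spack-stack `MODULEPATH` entry from two environment variables and substitutes the string `None` when either is unset. A shell rendering of that string assembly is sketched below; the function name `spack_core_path` is hypothetical, and the example values are the ones hard-coded in `modulefiles/module_gwci.hercules.lua` in this commit.

```shell
# Sketch of the path assembly done in module_base.hercules.lua, using the
# same "None" fallback when a variable is unset. Function name is illustrative.
spack_core_path() {
    echo "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-${spack_stack_ver:-None}/envs/${spack_env:-None}/install/modulefiles/Core"
}

# Example values matching the hard-coded path in module_gwci.hercules.lua.
spack_stack_ver="1.5.1"
spack_env="gsi-addon"
spack_core_path
```

If either variable is missing, the resulting path contains `None` and presumably will not exist, so the subsequent `load` calls would be expected to fail; exporting the version variables before loading the module avoids that.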
15 changes: 15 additions & 0 deletions modulefiles/module_gwci.hercules.lua
@@ -0,0 +1,15 @@
help([[
Load environment to run GFS workflow ci scripts on Hercules
]])

prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/envs/gsi-addon/install/modulefiles/Core")

-- Versions are literals here, so pass them directly rather than through os.getenv
load(pathJoin("stack-intel", "2021.9.0"))
load(pathJoin("stack-intel-oneapi-mpi", "2021.9.0"))

load(pathJoin("netcdf-c", "4.9.2"))
load(pathJoin("netcdf-fortran", "4.6.0"))
load(pathJoin("nccmp","1.9.0.1"))
load(pathJoin("wgrib2", "3.1.1"))

whatis("Description: GFS run ci top-level scripts environment")
19 changes: 19 additions & 0 deletions modulefiles/module_gwsetup.hercules.lua
@@ -0,0 +1,19 @@
help([[
Load environment to run GFS workflow setup scripts on Hercules
]])

load(pathJoin("contrib","0.1"))
load(pathJoin("rocoto","1.3.5"))

prepend_path("MODULEPATH", "/work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/envs/gsi-addon/install/modulefiles/Core")

local stack_intel_ver=os.getenv("stack_intel_ver") or "2021.9.0"
local python_ver=os.getenv("python_ver") or "3.10.8"

load(pathJoin("stack-intel", stack_intel_ver))
load(pathJoin("python", python_ver))
load("py-jinja2")
load("py-pyyaml")
load("py-numpy")

whatis("Description: GFS run setup environment")
2 changes: 2 additions & 0 deletions parm/config/gefs/config.resources
@@ -43,6 +43,8 @@ elif [[ ${machine} = "S4" ]]; then
fi
elif [[ ${machine} = "ORION" ]]; then
export npe_node_max=40
elif [[ ${machine} = "HERCULES" ]]; then
export npe_node_max=40
fi

if [[ ${step} = "prep" ]]; then
2 changes: 1 addition & 1 deletion parm/config/gefs/config.ufs
@@ -72,7 +72,7 @@ case "${machine}" in
"WCOSS2")
npe_node_max=128
;;
"HERA" | "ORION")
"HERA" | "ORION" | "HERCULES" )
npe_node_max=40
;;
"JET")
2 changes: 1 addition & 1 deletion parm/config/gfs/config.aero
@@ -11,7 +11,7 @@ case ${machine} in
"HERA")
AERO_INPUTS_DIR="/scratch1/NCEPDEV/global/glopara/data/gocart_emissions"
;;
"ORION")
"ORION" | "HERCULES")
AERO_INPUTS_DIR="/work2/noaa/global/wkolczyn/noscrub/global-workflow/gocart_emissions"
;;
"S4")
2 changes: 2 additions & 0 deletions parm/config/gfs/config.resources
@@ -55,6 +55,8 @@ elif [[ "${machine}" = "AWSPW" ]]; then
export npe_node_max=40
elif [[ ${machine} = "ORION" ]]; then
export npe_node_max=40
elif [[ ${machine} = "HERCULES" ]]; then
export npe_node_max=40
fi

if [[ ${step} = "prep" ]]; then
2 changes: 1 addition & 1 deletion parm/config/gfs/config.ufs
@@ -72,7 +72,7 @@ case "${machine}" in
"WCOSS2")
npe_node_max=128
;;
"HERA" | "ORION")
"HERA" | "ORION" | "HERCULES")
npe_node_max=40
;;
"JET")
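
The `config.ufs` and `config.resources` hunks above both give Hercules the same 40 cores per node as Hera and Orion. As a compact illustration of that mapping, the sketch below covers only the values visible in this diff; the function name `cores_per_node` is hypothetical, and machines whose values are not shown here (such as JET) are deliberately left out.

```shell
# Hypothetical helper listing only the per-node core counts visible in
# this commit's hunks; other machines are intentionally not covered.
cores_per_node() {
    case "$1" in
        WCOSS2)               echo 128 ;;
        HERA|ORION|HERCULES)  echo 40 ;;
        *)                    echo "core count not shown for: $1" >&2; return 1 ;;
    esac
}

npe_node_max="$(cores_per_node HERCULES)"
echo "npe_node_max=${npe_node_max}"
```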