Use full MPI library for mpi-serial tests #2497

Open
ekluzek opened this issue Apr 26, 2024 · 1 comment
Labels
  bfb: bit-for-bit
  blocked: dependency: Wait to work on this until dependency is resolved
  code health: improving internal code structure to make it easier to maintain (sustainability)
  enhancement: new capability or improved behavior of existing capability
  good first issue: simple; good for first-time contributors
  testing: additions or changes to tests

Comments

ekluzek (Collaborator) commented Apr 26, 2024

As discussed in CSEG, we want to remove the use of mpi-serial in our tests and simulations. This is largely because modern MPI libraries allow you to link against the full MPI library but still run serially WITHOUT using mpirun (mpiexec, mpibind, or any of the other flavors).
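
For illustration, a minimal sketch of that behavior (not from this issue; it assumes an MPI implementation such as Open MPI or MPICH that supports singleton initialization, with mpicc and mpirun on PATH):

    # Write a trivial MPI program
    cat > hello_mpi.c << 'EOF'
    #include <mpi.h>
    #include <stdio.h>
    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int ntasks;
        MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
        printf("Running with %d MPI task(s)\n", ntasks);
        MPI_Finalize();
        return 0;
    }
    EOF
    mpicc hello_mpi.c -o hello_mpi
    ./hello_mpi               # singleton launch: runs serially, no mpirun needed, reports 1 task
    mpirun -np 4 ./hello_mpi  # the very same binary still runs in parallel under mpirun

This is the property the issue relies on: one build against the full MPI library covers both the serial and parallel cases.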

This depends on getting the mpirun update in cime here:

ESMCI/cime#4619

Or doing this explicitly for Derecho and Izumi in ccs_config.

@ekluzek added the enhancement, code health, tag: simple, bfb, testing, blocked: dependency, and next labels Apr 26, 2024
@ekluzek added this to the cesm2_3_beta18 milestone Apr 26, 2024
ekluzek (Collaborator, Author) commented Apr 29, 2024

As pointed out by @wwieder, outside of the test lists we also have these settings:

NEON/FATES/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
NEON/FATES/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
NEON/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
NEON/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
PLUMBER2/defaults/shell_commands:# Explicitly set the MPI library to mpi-serial so won't have the build/run complexity of a full MPI library
PLUMBER2/defaults/shell_commands:./xmlchange MPILIB=mpi-serial
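
If this goes forward, each of those defaults files would presumably drop the mpi-serial override or switch it to a full MPI library. A hypothetical sketch of one such shell_commands change ("mpich" is a placeholder; the right library name is machine-specific):

    # Current: force mpi-serial to avoid the build/run complexity of a full MPI library
    ./xmlchange MPILIB=mpi-serial
    # Sketch of the change: use a full MPI library instead (or delete the
    # override entirely so the case inherits the machine default)
    ./xmlchange MPILIB=mpich
    ./xmlquery MPILIB    # verify the setting took effect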

And in the Python code:

ctsm/site_and_regional/single_point_case.py:            self.write_to_file("./xmlchange MPILIB=mpi-serial", nl_file)
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:      <command name="load">mpi-serial/2.3.0</command>
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="TRUE" compiler="nvhpc" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules DEBUG="FALSE" compiler="nvhpc" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="gnu" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="intel" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="pgi" mpilib="mpi-serial">
ctsm/test/testinputs/mksurfdata_esmf_bld/env_mach_specific.xml:    <modules compiler="nvhpc" mpilib="mpi-serial">

And in the documentation under doc/source:

lilac/specific-atm-models/wrf-tools.rst:     ../../../configure --macros-format Makefile --mpilib mpi-serial
users_guide/running-single-points/running-pts_mode-configurations.rst:Note, that when running with ``PTS_MODE`` the number of processors is automatically set to one. When running a single grid point you can only use a single processor. You might also want to set the ``env_build.xml`` variable: ``MPILIB=mpi-serial`` to ``TRUE`` so that you can also run interactively without having to use MPI to start up your job.
users_guide/running-single-points/running-single-point-configurations.rst:   Just like ``PTS_MODE`` (Sect. :numref:`pts_mode`), by default these setups sometimes run with ``MPILIB=mpi-serial`` (in the ``env_build.xml`` file) turned on, which allows you to run the model interactively. On some machines this mode is NOT supported and you may need to change it to FALSE before you are able to build.
users_guide/trouble-shooting/trouble-shooting.rst:Simplifying to one processor removes all multi-processing problems and makes the case as simple as possible. If you can enable ``MPILIB=mpi-serial`` you will also be able to run interactively rather than having to submit to a job queue, which sometimes makes it easier to run and debug. If you can use ``MPILIB=mpi-serial`` you can also use threading, but still run interactively in order to use more processors to make it faster if needed.
users_guide/trouble-shooting/trouble-shooting.rst:   # set MPILIB to mpi-serial so that you can run interactively
users_guide/trouble-shooting/trouble-shooting.rst:   > ./xmlchange MPILIB=mpi-serial
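
The interactive-debugging workflow those docs describe should remain possible with a full MPI library once serial runs no longer require mpirun. A rough sketch, assuming singleton launch works on the machine and that case.submit's --no-batch option is available ("openmpi" is a placeholder library name):

    ./xmlchange MPILIB=openmpi   # placeholder; use the machine's full MPI library
    ./case.setup
    ./case.build
    ./case.submit --no-batch     # run interactively instead of going through the batch queue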

@ekluzek removed the next (this should get some attention in the next week or two; normally each Thursday SE meeting) label Jul 8, 2024
@samsrabin added the simple and bfb (bit-for-bit) labels and removed the simple bfb label Aug 8, 2024
@samsrabin added the good first issue (simple; good for first-time contributors) label and removed the simple label Oct 3, 2024