Consolidating OpenACC device-host memory transfers #1315

Open
wants to merge 2 commits into
base: develop
from

abishekg7
Collaborator

This PR consolidates much of the OpenACC host-device data transfer during dynamics execution into two subroutines, mpas_atm_pre_dynamics_h2d and mpas_atm_post_dynamics_d2h, which are called before and after the call to the atm_srk3 subroutine. Because atm_compute_solve_diagnostics is also called once before the start of the model run, a second pair of subroutines, mpas_atm_pre_computesolvediag_h2d and mpas_atm_post_computesolvediag_d2h, handles data movement around that first call. Any fields copied onto the device in these subroutines are removed from the explicit data movement statements in the dynamical core.
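
The consolidation pattern described above can be sketched roughly as follows. This is an illustration in C with OpenACC directives, not the project's code: the function names, the fields `u` and `theta`, and the trivial kernel are hypothetical stand-ins. Built without OpenACC support, the pragmas are ignored and the code runs on the host with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pre-dynamics host-to-device routine:
   every field the kernels need is copied to the device in one place. */
void pre_dynamics_h2d(double *u, double *theta, size_t n)
{
    assert(u != NULL && theta != NULL && n > 0);
    #pragma acc enter data copyin(u[0:n], theta[0:n])
}

/* Hypothetical dynamics kernel. It only asserts that the data is
   already present, so it carries no transfer clauses of its own. */
void dynamics_step(double *u, const double *theta, size_t n, double dt)
{
    #pragma acc parallel loop present(u[0:n], theta[0:n])
    for (size_t i = 0; i < n; ++i) {
        u[i] += dt * theta[i];
    }
}

/* Hypothetical stand-in for a post-dynamics device-to-host routine:
   the modified field is copied back, the read-only one is released. */
void post_dynamics_d2h(double *u, double *theta, size_t n)
{
    #pragma acc exit data copyout(u[0:n]) delete(theta[0:n])
}

/* One transfer in, many kernel launches, one transfer out. */
void run_dynamics(double *u, double *theta, size_t n, int nsteps, double dt)
{
    pre_dynamics_h2d(u, theta, n);
    for (int step = 0; step < nsteps; ++step) {
        dynamics_step(u, theta, n, dt);
    }
    post_dynamics_d2h(u, theta, n);
}
```

The design point is that the kernels use `present` rather than `copy`/`copyin`, so adding or removing a field from the transfer set is a change in one routine rather than in every kernel.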

The mesh/time-invariant fields are still copied onto the device in mpas_atm_dynamics_init and removed from the device in mpas_atm_dynamics_finalize, with the exception of select fields moved in mpas_atm_pre_computesolvediag_h2d and mpas_atm_post_computesolvediag_d2h. This is a special case, because atm_compute_solve_diagnostics is called for the first time before the call to mpas_atm_dynamics_init.

This PR also includes explicit host-device data transfers in the mpas_atm_iau, mpas_atmphys_interface and mpas_atmphys_todynamics modules to ensure that the physics and IAU regions, which run on the CPU, use the latest values from the dynamical core running on GPUs, and vice versa. In addition, it includes explicit data transfers around halo exchanges in the atm_srk3 subroutine.
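
A minimal sketch of keeping a CPU-only region and device kernels in sync with `acc update` directives is shown here, again in C with hypothetical names (`physics_on_cpu`, `apply_tendency_on_device`, `theta`, `tend`) that do not correspond to MPAS routines. The same update-host / update-device pairing would presumably apply around a host-side halo exchange. Without OpenACC support the pragmas are ignored and everything runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU-only physics: derives a tendency from the state. */
void physics_on_cpu(const double *theta, double *tend, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        tend[i] = -0.5 * theta[i];
    }
}

/* Hypothetical device kernel that consumes the CPU-computed tendency. */
void apply_tendency_on_device(double *theta, const double *tend,
                              size_t n, double dt)
{
    #pragma acc parallel loop present(theta[0:n], tend[0:n])
    for (size_t i = 0; i < n; ++i) {
        theta[i] += dt * tend[i];
    }
}

/* Assumes theta and tend are already present on the device. */
void physics_coupling_step(double *theta, double *tend, size_t n, double dt)
{
    /* The device holds the newest state: refresh the host copy first. */
    #pragma acc update host(theta[0:n])
    physics_on_cpu(theta, tend, n);

    /* The host holds the newest tendency: refresh the device copy. */
    #pragma acc update device(tend[0:n])
    apply_tendency_on_device(theta, tend, n, dt);
}

/* Small driver that owns the device data lifetime for the sketch. */
void coupled_run(double *theta, double *tend, size_t n, double dt)
{
    assert(theta != NULL && tend != NULL && n > 0);
    #pragma acc enter data copyin(theta[0:n], tend[0:n])
    physics_coupling_step(theta, tend, n, dt);
    #pragma acc exit data copyout(theta[0:n], tend[0:n])
}
```

Unlike `enter data` and `exit data`, the `update` directives do not allocate or free device memory; they only refresh one copy from the other, which is why they suit a region that hands control back and forth between CPU and GPU.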

These data-transfer subroutines and the acc update statements are an interim solution until a bookkeeping method is in place.

This PR also introduces a couple of new timers to keep track of the cost of data transfers.
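
One way to account for transfer cost is to wrap each transfer in its own accumulating timer, sketched below in C. The `xfer_timer` type and all function names are hypothetical and are not the timer interface used in the PR; `clock()` measures processor time rather than wall-clock time, so this is only a rough stand-in. Without OpenACC support the pragmas are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical accumulating timer for data-transfer regions. */
typedef struct {
    const char *name;   /* label for reporting */
    double total_s;     /* accumulated seconds */
    clock_t t0;         /* start of the current interval */
    int running;        /* 1 while an interval is open */
    int calls;          /* number of completed intervals */
} xfer_timer;

void timer_start(xfer_timer *t)
{
    assert(!t->running);
    t->t0 = clock();
    t->running = 1;
}

void timer_stop(xfer_timer *t)
{
    assert(t->running);
    t->total_s += (double)(clock() - t->t0) / CLOCKS_PER_SEC;
    t->running = 0;
    t->calls += 1;
}

/* Host-to-device copy charged to its own timer. The data directive is
   synchronous by default, so the interval covers the whole transfer. */
void timed_h2d(xfer_timer *t, double *field, size_t n)
{
    timer_start(t);
    #pragma acc enter data copyin(field[0:n])
    timer_stop(t);
}

/* Device-to-host copy charged to a separate timer. */
void timed_d2h(xfer_timer *t, double *field, size_t n)
{
    timer_start(t);
    #pragma acc exit data copyout(field[0:n])
    timer_stop(t);
}
```

Keeping the two directions on separate timers makes it possible to see whether the pre-dynamics or the post-dynamics transfer dominates.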

@abishekg7 abishekg7 force-pushed the atmosphere/acc_mem_move_per_timestep branch from ac98504 to 4845ce2 Compare May 16, 2025 18:05