Feature CUDA compatibility #228

Closed
wants to merge 75 commits into from
75 commits
c305133
First steps towards adding CI for Fortran frontend + CUDA kernels
RobertPincus Jun 5, 2023
2134007
Use existing branch in own repo to get CUDA kernels
RobertPincus Jun 5, 2023
f76a29d
YML syntax
RobertPincus Jun 5, 2023
ccef891
YML syntax
RobertPincus Jun 5, 2023
c61e9b2
matrix
RobertPincus Jun 5, 2023
43f2cb5
No mapping
RobertPincus Jun 5, 2023
9ba1a0e
Where the code at?
RobertPincus Jun 5, 2023
fc517f5
Over there?
RobertPincus Jun 5, 2023
2c603bd
Maybe it's here...
RobertPincus Jun 5, 2023
5311508
No, I mean here
RobertPincus Jun 5, 2023
36777eb
Syntax
RobertPincus Jun 5, 2023
6e7d22b
Over there?
RobertPincus Jun 5, 2023
10247cc
Different checkout actions keyword
RobertPincus Jun 5, 2023
f738782
... and now the location
RobertPincus Jun 5, 2023
5a5841f
Modules
RobertPincus Jun 5, 2023
237b11f
Module, flags
RobertPincus Jun 5, 2023
a289673
Flags?
RobertPincus Jun 5, 2023
24c42fd
Revisit Fortran flags used in CUDA compilation
RobertPincus Jun 5, 2023
e6dfc4b
Build combo and separate libraries by default
RobertPincus Jun 6, 2023
2066f70
Remove netCDF C dependencies, allow Makefile vars to be overridden by…
RobertPincus Jun 6, 2023
401d49b
NCHOME is gone; -L$(NFHOME) in Makefiles remains because of custom in…
RobertPincus Jun 6, 2023
067d831
Get all arrays on device before calling kernels
RobertPincus Jun 7, 2023
9d8041a
Arrays in ty_optical_props on device
RobertPincus Jun 7, 2023
9d7875b
Make that all the arrays
RobertPincus Jun 7, 2023
cecd8fd
Delete device memory on finalization in ty_optical_props
RobertPincus Jun 7, 2023
b4e88e9
Some inquiry functions use pointers not array copies
RobertPincus Jun 7, 2023
a2b201e
Missed one
RobertPincus Jun 7, 2023
ea5b3f9
Change build order, use CUDA kernels where possible
RobertPincus Jun 7, 2023
36593d0
Russa frussa YML syntax
RobertPincus Jun 7, 2023
5926693
Using CUDA kernels needs OpenACC enabled for Fortran side
RobertPincus Jun 7, 2023
6c06e5d
Structured data statements
RobertPincus Jun 7, 2023
5c05ce1
Add extern/ subdirectories to kernels; these contain only interfaces …
RobertPincus Jun 8, 2023
88538f9
Interfaces added for rte-kernels. Directory name means they won't get…
RobertPincus Jun 8, 2023
8dfaf8e
Debugging print statements
RobertPincus Jun 9, 2023
9b8b1d9
Syntax
RobertPincus Jun 9, 2023
1786b15
Omit some debugging, more structured data statements
RobertPincus Jun 9, 2023
3134a95
No return from within OMP structured data
RobertPincus Jun 9, 2023
63a14a8
Last failed, add some prints back
RobertPincus Jun 9, 2023
7eb63aa
One...
RobertPincus Jun 9, 2023
276872a
Two (plus library order - for CCE?)
RobertPincus Jun 9, 2023
faf7698
Revert build order
RobertPincus Jun 9, 2023
39ec684
Three
RobertPincus Jun 9, 2023
b1f48c2
Three died, what about four? And refine library order again
RobertPincus Jun 9, 2023
b3e7f10
Library order everywhere, found the important print statement
RobertPincus Jun 9, 2023
721670b
Libraries for CUDA kernels
RobertPincus Jun 9, 2023
6115ade
Copy weights to GPU, use CUDA RTE kernels
RobertPincus Jun 15, 2023
14b0e66
Send the right arrays to the device...
RobertPincus Jun 15, 2023
61b6611
Can't copy files you don't make.
RobertPincus Jun 15, 2023
9f17cde
Naming of kernel libraries
RobertPincus Jun 16, 2023
6582126
Directory doesn't stick in Make?
RobertPincus Jun 16, 2023
33acad4
Updates from develop branch
RobertPincus Jun 21, 2023
6bf03ae
Parameter array on device, don't branch from structured data statement
RobertPincus Jun 22, 2023
3cbfb16
Use new standalone repository for CUDA kernels
RobertPincus Jun 23, 2023
6335f19
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Jul 4, 2023
8d16215
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Jul 13, 2023
e94118c
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Sep 19, 2023
a67f30a
Reverting Github checkout action to v3?
RobertPincus Sep 19, 2023
55bde89
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Sep 30, 2023
e14a7ab
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Feb 27, 2024
583800d
Reversing error from merge
RobertPincus Feb 27, 2024
6e77b98
Make front-end/kernel libraries by default
RobertPincus Feb 28, 2024
6c03fbb
Build default libraries in self-hosted CI
RobertPincus Feb 28, 2024
cb36c36
Refine Makefile
RobertPincus Feb 28, 2024
494f319
Remove old macros
RobertPincus Feb 29, 2024
f5a1fad
device memory?
RobertPincus Feb 29, 2024
d930258
Fewer specifications for Openacc?
RobertPincus Feb 29, 2024
f3898f2
Missed changing lw 2-stream solver interface
RobertPincus Feb 29, 2024
b69bb6e
Syntax
RobertPincus Feb 29, 2024
70ebb31
Revert one change, extend another to third array
RobertPincus Feb 29, 2024
8da9b32
Use finalize instead of repeating code
RobertPincus Mar 1, 2024
0b00b9b
Rename interface directories
RobertPincus Apr 2, 2024
dcdd966
Update Makefiles
RobertPincus Apr 2, 2024
4f1727b
Migrating C headers
RobertPincus Apr 2, 2024
9943839
Merge branch 'develop' into feature-cuda-compatability
RobertPincus Apr 2, 2024
0834520
Library order in Makefiles
RobertPincus Apr 2, 2024
29 changes: 27 additions & 2 deletions .github/workflows/self-hosted-ci.yml
@@ -39,6 +39,13 @@ jobs:
compiler-modules: "PrgEnv-cray craype-accel-nvidia60 cdt-cuda/22.05 cudatoolkit/11.2.0_3.39-2.1__gf93aa1c"
# OpenMP flags from Nichols Romero (Argonne)
fcflags: "-hnoacc -homp -O0"
- config-name: cuda-kernels
# Fall back to OpenACC
rte-kernels: extern
compiler-modules: "PrgEnv-nvidia nvidia craype-accel-nvidia60 cdt-cuda/21.09 cudatoolkit/11.2.0_3.39-2.1__gf93aa1c !cray-libsci_acc"
fcflags: "-g -O3 -acc -gopt -Mallocatable=03 -Mpreprocess -Minfo"
experimental: true

env:
# Core variables:
FC: ftn
@@ -53,7 +60,8 @@ jobs:
#
# Checks-out repository under $GITHUB_WORKSPACE
#
- uses: actions/checkout@v3
- name: Check out Fortran code
uses: actions/checkout@v3
#
# Check out data
#
@@ -63,6 +71,15 @@ jobs:
repository: earth-system-radiation/rrtmgp-data
path: rrtmgp-data
#
# Check out CUDA kernels if needed
#
- name: Check out CUDA kernels
if: matrix.config-name == 'cuda-kernels'
uses: actions/checkout@v3
with:
repository: earth-system-radiation/rte-rrtmgp-cuda-kernels
path: cuda-kernels
#
# Finalize build environment
#
- name: Finalize build environment
@@ -90,14 +107,22 @@ jobs:
# SLURM jobs, user home directories and HDF5 file locking are
# incompatible on Daint:
echo 'HDF5_USE_FILE_LOCKING=FALSE' >> "${GITHUB_ENV}"
#
# Build libraries, examples and tests
#
- name: Build libraries
run: |
$FC --version
make -j8 libs
#
# Build library of CUDA kernels; copy to build directory and overwrite defaults
#
- name: Build CUDA kernels
if: matrix.config-name == 'cuda-kernels'
run: |
make -C cuda-kernels/build
cp cuda-kernels/build/librtecudakernels.a build/librtekernels.a
cp cuda-kernels/build/librrtmgpcudakernels.a build/librrtmgpkernels.a
#
# Run examples and tests (expect success)
#
- name: Build and run examples and tests (expect success)
2 changes: 1 addition & 1 deletion examples/all-sky/Makefile
@@ -7,7 +7,7 @@ RRTMGP_BUILD = $(RRTMGP_ROOT)/build
# RRTMGP library, module files
#
LDFLAGS += -L$(RRTMGP_BUILD)
LIBS += -lrrtmgp -lrte
LIBS += -lrrtmgpf -lrtef -lrrtmgpkernels -lrtekernels
FCINCLUDE += -I$(RRTMGP_BUILD)

# netcdf Fortran module files has to be in the search path or added via environment variable FCINCLUDE e.g.
2 changes: 1 addition & 1 deletion examples/rfmip-clear-sky/Makefile
@@ -7,7 +7,7 @@ RRTMGP_BUILD = $(RRTMGP_ROOT)/build
# RRTMGP library, module files
#
LDFLAGS += -L$(RRTMGP_BUILD)
LIBS += -lrrtmgp -lrte
LIBS += -lrrtmgpf -lrtef -lrrtmgpkernels -lrtekernels
FCINCLUDE += -I$(RRTMGP_BUILD)

# netcdf Fortran module files has to be in the search path or added via environment variable FCINCLUDE e.g.
27 changes: 18 additions & 9 deletions rrtmgp-frontend/mo_gas_optics_rrtmgp.F90
@@ -260,8 +260,10 @@ function gas_optics_int(this, &
!
! Gas optics
!
!$acc enter data create(jtemp, jpress, tropo, fmajor, jeta)
!$omp target enter data map(alloc:jtemp, jpress, tropo, fmajor, jeta)
!$acc enter data copyin(play, plev, tlay)
!$omp target enter data map(to:play, plev, tlay)
!$acc enter data create( jtemp, jpress, tropo, fmajor, jeta)
!$omp target enter data map(alloc:jtemp, jpress, tropo, fmajor, jeta)
error_msg = compute_gas_taus(this, &
ncol, nlay, ngpt, nband, &
play, plev, tlay, gas_desc, &
@@ -274,7 +276,7 @@ function gas_optics_int(this, &
! External source -- check arrays sizes and values
! input data sizes and values
!
!$acc enter data copyin(tsfc, tlev) ! Should be fine even if tlev is not supplied
!$acc enter data copyin(tsfc, tlev) ! Should be fine even if tlev is not supplied
!$omp target enter data map(to:tsfc, tlev)

if(check_extents) then
@@ -328,6 +330,8 @@ function gas_optics_int(this, &
!$omp target exit data map(release:tsfc)
!$acc exit data delete(jtemp, jpress, tropo, fmajor, jeta)
!$omp target exit data map(release:jtemp, jpress, tropo, fmajor, jeta)
!$acc exit data delete(play, plev, tlay)
!$omp target exit data map(release:play, plev, tlay)
end function gas_optics_int
!------------------------------------------------------------------------------------------
!
@@ -373,23 +377,27 @@ function gas_optics_ext(this, &
!
! Gas optics
!
!$acc enter data create(jtemp, jpress, tropo, fmajor, jeta)
!$omp target enter data map(alloc:jtemp, jpress, tropo, fmajor, jeta)
!$acc data copyin(play, plev, tlay)
!$omp target data map(to:play, plev, tlay)
!$acc data create( jtemp, jpress, tropo, fmajor, jeta)
!$omp target data map(alloc:jtemp, jpress, tropo, fmajor, jeta)
error_msg = compute_gas_taus(this, &
ncol, nlay, ngpt, nband, &
play, plev, tlay, gas_desc, &
optical_props, &
jtemp, jpress, jeta, tropo, fmajor, &
col_dry)
!$acc exit data delete(jtemp, jpress, tropo, fmajor, jeta)
!$omp target exit data map(release:jtemp, jpress, tropo, fmajor, jeta)
!$acc end data
!$omp end target data
!$acc end data
!$omp end target data
if(error_msg /= '') return

! ----------------------------------------------------------
!
! External source function is constant
!
!$acc enter data create(toa_src)
!$acc enter data create( toa_src)
!$omp target enter data map(alloc:toa_src)
if(check_extents) then
if(.not. extents_are(toa_src, ncol, ngpt)) &
@@ -404,7 +412,7 @@ function gas_optics_ext(this, &
toa_src(icol,igpt) = this%solar_source(igpt)
end do
end do
!$acc exit data copyout(toa_src)
!$acc exit data copyout( toa_src)
!$omp target exit data map(from:toa_src)
end function gas_optics_ext
!------------------------------------------------------------------------------------------
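The hunks above replace the unstructured `enter data` / `exit data` pairs in `gas_optics_ext` with structured `data` regions and test `error_msg` only after the closing `end data` directives. A minimal sketch of that pattern follows, assuming OpenACC only (the OpenMP `target data` equivalents are omitted). The program name, the arrays `a` and `b`, and the error text are hypothetical and not taken from the PR. Compiled without OpenACC support, the directives are plain comments and the program runs on the host.

```fortran
! Sketch only: hypothetical names, not code from the PR.
program demo_structured_data
  implicit none
  integer, parameter :: n = 8
  real               :: a(n), b(n)
  character(len=64)  :: error_msg
  integer            :: i

  a = [(real(i), i = 1, n)]
  error_msg = ''

  ! Structured region: a is copied to the device on entry,
  ! b is copied back to the host at "end data".
  !$acc data copyin(a) copyout(b)
  !$acc parallel loop
  do i = 1, n
    b(i) = 2.0 * a(i)
  end do
  ! Record the problem but do not return or stop inside the region;
  ! branching out of a structured data region is not permitted.
  if (any(a < 0.0)) error_msg = 'negative input'
  !$acc end data

  ! Act on the error only after the region has closed.
  if (error_msg /= '') error stop 'demo_structured_data failed'
  if (any(b /= 2.0 * a)) error stop 'unexpected result'
  print *, 'ok'
end program demo_structured_data
```

Holding the error in a variable until the region closes is the same reason the diff keeps `if(error_msg /= '') return` below the `end data` lines.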
@@ -710,6 +718,7 @@ function compute_gas_taus(this, &
ncol, nlay, ngpt, optical_props%p)
end select
end if
! Interpolation coefficients copyout
!$acc end data
!$omp end target data
if (present(col_dry)) then
11 changes: 11 additions & 0 deletions rrtmgp-kernels/api/mo_gas_optics_rrtmgp_kernels.F90
@@ -1,3 +1,14 @@
! This code is part of
! RRTM for GCM Applications - Parallel (RRTMGP)
!
! Contact: Eli Mlawer and Robert Pincus
! email: rrtmgp@aer.com
!
! Copyright 2015-, Atmospheric and Environmental Research and
! Regents of the University of Colorado. All right reserved.
!
! Use and duplication is permitted under the terms of the
! BSD 3-clause license, see http://opensource.org/licenses/BSD-3-Clause
module mo_gas_optics_rrtmgp_kernels
use mo_rte_kind, only : wp, wl
use mo_rte_util_array,only : zero_array
48 changes: 26 additions & 22 deletions rte-frontend/mo_optical_props.F90
@@ -275,8 +275,7 @@ function init_base(this, band_lims_wvn, band_lims_gpt, name) result(err_message)
!
! Assignment
!
if(allocated(this%band2gpt )) deallocate(this%band2gpt)
if(allocated(this%band_lims_wvn)) deallocate(this%band_lims_wvn)
call this%finalize_base()
allocate(this%band2gpt (2,size(band_lims_wvn,2)), &
this%band_lims_wvn(2,size(band_lims_wvn,2)))
this%band2gpt = band_lims_gpt_lcl
@@ -287,11 +286,13 @@ function init_base(this, band_lims_wvn, band_lims_gpt, name) result(err_message)
! Make a map between g-points and bands
! Efficient only when g-point indexes start at 1 and are contiguous.
!
if(allocated(this%gpt2band)) deallocate(this%gpt2band)
allocate(this%gpt2band(maxval(band_lims_gpt_lcl)))
do iband=1,size(band_lims_gpt_lcl,dim=2)
this%gpt2band(band_lims_gpt_lcl(1,iband):band_lims_gpt_lcl(2,iband)) = iband
end do
!$acc enter data copyin(this%band2gpt, this%gpt2band, this%band_lims_wvn)
!$omp target enter data map(to:this%band2gpt, this%gpt2band, this%band_lims_wvn)

end function init_base
!-------------------------------------------------------------------------------------------------
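The changes to `init_base` and `finalize_base` tie the device copies of the spectral maps to the lifetime of the host arrays: data go to the device once the arrays are allocated and filled, and are deleted from the device before the host arrays are deallocated. A minimal sketch of that pairing follows, assuming OpenACC only. The module `mo_demo_device`, the type `ty_demo`, and its procedures are hypothetical. The `allocated` guard around the `exit data` directive is an extra precaution in this sketch and is not in the PR.

```fortran
! Sketch only: hypothetical module and type, not code from the PR.
module mo_demo_device
  implicit none
  private
  public :: ty_demo

  type :: ty_demo
    integer, allocatable :: gpt2band(:)
  contains
    procedure :: init
    procedure :: finalize
  end type ty_demo
contains
  subroutine init(this, ngpt)
    class(ty_demo), intent(inout) :: this
    integer,        intent(in)    :: ngpt
    integer :: i

    ! Release any earlier host and device storage first,
    ! instead of repeating the deallocation logic here.
    call this%finalize()
    allocate(this%gpt2band(ngpt))
    this%gpt2band = [(i, i = 1, ngpt)]
    ! Copy to the device only after the host array holds its final values.
    !$acc enter data copyin(this%gpt2band)
  end subroutine init

  subroutine finalize(this)
    class(ty_demo), intent(inout) :: this

    if (allocated(this%gpt2band)) then
      ! Delete the device copy while the host array still exists.
      !$acc exit data delete(this%gpt2band)
      deallocate(this%gpt2band)
    end if
  end subroutine finalize
end module mo_demo_device

program demo_device_lifetime
  use mo_demo_device, only : ty_demo
  implicit none
  type(ty_demo) :: obj

  call obj%init(4)
  if (.not. allocated(obj%gpt2band)) error stop 'init did not allocate'
  call obj%init(6)                      ! re-initialization goes through finalize
  if (size(obj%gpt2band) /= 6) error stop 'unexpected size after re-init'
  call obj%finalize()
  if (allocated(obj%gpt2band)) error stop 'finalize did not deallocate'
  print *, 'ok'
end program demo_device_lifetime
```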
function init_base_from_copy(this, spectral_desc) result(err_message)
@@ -326,6 +327,8 @@ end function is_initialized_base
subroutine finalize_base(this)
class(ty_optical_props), intent(inout) :: this

!$acc exit data delete( this%band2gpt, this%gpt2band, this%band_lims_wvn)
!$omp target exit data map(release:this%band2gpt, this%gpt2band, this%band_lims_wvn)
if(allocated(this%band2gpt)) deallocate(this%band2gpt)
if(allocated(this%gpt2band)) deallocate(this%gpt2band)
if(allocated(this%band_lims_wvn)) &
@@ -1094,12 +1097,15 @@ end function get_ngpt
!> The first and last g-point of all bands at once
!> dimension (2, nbands)
!>
pure function get_band_lims_gpoint(this)
class(ty_optical_props), intent(in) :: this
integer, dimension(size(this%band2gpt,dim=1), size(this%band2gpt,dim=2)) &
:: get_band_lims_gpoint
function get_band_lims_gpoint(this)
class(ty_optical_props), target, intent(in) :: this
integer, dimension(:,:), pointer :: get_band_lims_gpoint

get_band_lims_gpoint = this%band2gpt
if(this%is_initialized()) then
get_band_lims_gpoint => this%band2gpt
else
get_band_lims_gpoint => NULL()
end if
end function get_band_lims_gpoint
!>--------------------------------------------------------------------------------------------------------------------
!>
@@ -1121,15 +1127,14 @@ end function convert_band2gpt
!> Lower and upper wavenumber of all bands
!> (upper and lower wavenumber by band) = band_lims_wvn(2,band)
!>
pure function get_band_lims_wavenumber(this)
class(ty_optical_props), intent(in) :: this
real(wp), dimension(size(this%band_lims_wvn,1), size(this%band_lims_wvn,2)) &
:: get_band_lims_wavenumber
function get_band_lims_wavenumber(this)
class(ty_optical_props), target, intent(in) :: this
real(wp), dimension(:,:), pointer :: get_band_lims_wavenumber

if(this%is_initialized()) then
get_band_lims_wavenumber(:,:) = this%band_lims_wvn(:,:)
get_band_lims_wavenumber => this%band_lims_wvn
else
get_band_lims_wavenumber(:,:) = 0._wp
get_band_lims_wavenumber => NULL()
end if
end function get_band_lims_wavenumber
!>--------------------------------------------------------------------------------------------------------------------
@@ -1151,15 +1156,14 @@ end function get_band_lims_wavelength
!> Bands for all the g-points at once
!> dimension (ngpt)
!>
pure function get_gpoint_bands(this)
class(ty_optical_props), intent(in) :: this
integer, dimension(size(this%gpt2band,dim=1)) &
:: get_gpoint_bands
function get_gpoint_bands(this)
class(ty_optical_props), target, intent(in) :: this
integer, dimension(:), pointer :: get_gpoint_bands

if(this%is_initialized()) then
get_gpoint_bands(:) = this%gpt2band(:)
get_gpoint_bands => this%gpt2band
else
get_gpoint_bands(:) = 0
get_gpoint_bands => NULL()
end if
end function get_gpoint_bands
!>--------------------------------------------------------------------------------------------------------------------
@@ -1196,7 +1200,7 @@ end function expand
!>
!> Are the bands of two objects the same? (same number, same wavelength limits)
!>
pure function bands_are_equal(this, that)
function bands_are_equal(this, that)
class(ty_optical_props), intent(in) :: this, that
logical :: bands_are_equal

@@ -1212,7 +1216,7 @@ end function bands_are_equal
!> Is the g-point structure of two objects the same?
!> (same bands, same number of g-points, same mapping between bands and g-points)
!>
pure function gpoints_are_equal(this, that)
function gpoints_are_equal(this, that)
class(ty_optical_props), intent(in) :: this, that
logical :: gpoints_are_equal

3 changes: 2 additions & 1 deletion rte-frontend/mo_rte_lw.F90
@@ -142,6 +142,7 @@ function rte_lw(optical_props, top_at_1, &
0.2009319137_wp, 0.2292411064_wp, 0.0698269799_wp, 0._wp, &
0.1355069134_wp, 0.2034645680_wp, 0.1298475476_wp, 0.0311809710_wp], &
[max_gauss_pts, max_gauss_pts])
!$acc declare create(gauss_Ds, gauss_wts)
! ------------------------------------------------------------------------------------
ncol = optical_props%get_ncol()
nlay = optical_props%get_nlay()
@@ -310,7 +311,7 @@ function rte_lw(optical_props, top_at_1, &
! Compute the radiative transfer...
!
!$acc data create( sfc_emis_gpt, flux_up_loc, flux_dn_loc, gpt_flux_up, gpt_flux_dn)
!$omp target data map(alloc:sfc_emis_gpt, flux_up_loc, flux_dn_loc, gpt_flux_up, gpt_flux_dn)
!$omp target data map(alloc:sfc_emis_gpt, flux_up_loc, flux_dn_loc, gpt_flux_up, gpt_flux_dn) map(to:gauss_wts, gauss_Ds)
call expand_and_transpose(optical_props, sfc_emis, sfc_emis_gpt)
if(check_values) error_msg = optical_props%validate()
if(len_trim(error_msg) == 0) then ! Can't do an early return within OpenACC/MP data regions
2 changes: 1 addition & 1 deletion tests/Makefile
@@ -7,7 +7,7 @@ RRTMGP_BUILD = $(RRTMGP_ROOT)/build
# RRTMGP library, module files
#
LDFLAGS += -L$(RRTMGP_BUILD)
LIBS += -lrrtmgp -lrte
LIBS += -lrrtmgpf -lrtef -lrrtmgpkernels -lrtekernels
FCINCLUDE += -I$(RRTMGP_BUILD)

# netcdf Fortran module files has to be in the search path or added via environment variable FCINCLUDE e.g.
1 change: 1 addition & 0 deletions tests/check_variants.F90
@@ -154,6 +154,7 @@ program rte_clear_sky_regression
!
call stop_on_err(gas_conc_array(1)%get_subset(1, ncol, gas_concs))
call gas_conc_array(1)%reset()
print *, "Reset gas concs" ! Without this line there's an OpenACC error
deallocate(gas_conc_array)
! ----------------------------------------------------------------------------
! load data into classes
3 changes: 1 addition & 2 deletions tests/intel-codecov.sh
@@ -12,7 +12,6 @@ export PROF_DIR=$PWD
# Environment variables for netCDF Fortran and C installations
#
export NFHOME=${HOME}/Applications/${FC}
export NCHOME=/opt/local

#
# An Anaconda environent with modules needed for other python scripts
@@ -32,7 +31,7 @@ make -C ${RRTMGP_BUILD} -j 4 || exit 1
#
cd ${RRTMGP_ROOT}/examples/rfmip-clear-sky || exit 1
export FCFLAGS+=" -I${RRTMGP_BUILD} -I${NFHOME}/include"
export LDFLAGS+=" -L${RRTMGP_BUILD} -L${NFHOME}/lib -L${NCHOME}/lib -lrte -lrrtmgp -lnetcdff -lnetcdf"
export LDFLAGS+=" -L${RRTMGP_BUILD} -L${NFHOME}/lib -lrte -lrrtmgp -lnetcdff"
make clean || exit 1
make -j 4 || exit 1
python ./stage_files.py