O1.2.1 Make ClimaAtmos GPU compatible #1980

Closed
29 of 38 tasks
sriharshakandala opened this issue Aug 15, 2023 · 5 comments
Labels
SDI Software Design Issue


@sriharshakandala

sriharshakandala commented Aug 15, 2023

The Climate Modeling Alliance

Software Design Issue 📜

Purpose

Make ClimaAtmos.jl package GPU compatible.

Cost/Benefits/Risks

People and Personnel

Components

Inputs

Results and Deliverables

SDI Revision Log

CC

@simonbyrne @cmbengue @tapios

RRTMGP.jl

  1. sriharshakandala

SurfaceFluxes.jl

  1. GPU SDI 🏅
    akshaysridhar sriharshakandala
  2. akshaysridhar
  3. akshaysridhar

CloudMicroPhysics.jl

  1. GPU SDI
    sriharshakandala trontrytel
  2. GPU tests
    nefrathenrici
  3. GPU tests
  4. GPU

Diagnostic EDMF

No tasks being tracked yet.

Insolation

sriharshakandala added the SDI Software Design Issue label Aug 15, 2023
@szy21

szy21 commented Aug 17, 2023

Looks good to me, thanks Sriharsha! Maybe we should move the task "Supporting single precision calculations for RRTMGP" to #1995, if it would be a performance enhancement?

As a side note, Insolation used to only work with Float64. It seems that problem has been solved, but we still use Float64 in ClimaAtmos here. We need to change it and check if it still works.

@sriharshakandala

#1995

I am flexible on that. The reason for placing it here is that Float32 has been set as the default precision in ClimaAtmos.

@szy21

szy21 commented Aug 17, 2023

Yes, when we use Float32 in ClimaAtmos it is converted to Float64 when calculating radiation. I forgot where this is done, perhaps inside RRTMGP? Anyway, if this task doesn't take long, I think it's ok to have it here.
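One way to find where the precision changes is to compare input and output types. The following is a minimal sketch in plain Julia; `flux_promoting` and `flux_preserving` are hypothetical stand-ins, not functions from RRTMGP.jl, Insolation.jl, or ClimaAtmos.jl. It only illustrates how a single Float64 literal promotes a Float32 calculation.

```julia
# Hypothetical stand-ins; these are not RRTMGP.jl or Insolation.jl functions.

# A Float64 literal silently promotes a Float32 input to Float64.
flux_promoting(T) = 5.670374419e-8 * T^4

# Converting the constant to the input's float type keeps the working precision.
flux_preserving(T::FT) where {FT <: AbstractFloat} = FT(5.670374419e-8) * T^4

T32 = 288.0f0
println(typeof(flux_promoting(T32)))
println(typeof(flux_preserving(T32)))
```

Checking `typeof` (or `eltype` for arrays) at the boundary of each package call is a cheap way to locate an unintended conversion.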

@charleskawczynski

charleskawczynski commented Oct 3, 2023

The next thing that is breaking in the moist GPU case (after #2180 is merged) is this line:

    @. ᶠρK_E[colidx] =
        ᶠinterp(Y.c.ρ[colidx]) * eddy_diffusivity_coefficient(
            C_E,
            norm(interior_uₕ[colidx]),
            interior_coordinates.z[colidx] - z_surface,
            ᶠp[colidx],
        )

I imagine that the issue is one of 3 things:

  • norm may need to be hoisted
  • surface-sphere broadcasting is not working on the GPU (need to double check the status in ClimaCore)
  • broadcasting of first interior (center space) and surface (face space) is not supported.

The first is easiest to fix but seems unlikely. I think we fixed the second one. The third is a bit problematic, tbh. Actually, I think in this particular case, we can just use ᶜΔz / 2 at the surface.
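For reference, "hoisting" norm here means evaluating it into its own array before the fused broadcast, so the kernel only sees plain numbers. A minimal sketch in plain Julia arrays (no ClimaCore fields, no GPU); `eddy_coeff` is a hypothetical stand-in for `eddy_diffusivity_coefficient`, and all values are made up for illustration.

```julia
using LinearAlgebra: norm

# Hypothetical stand-in for eddy_diffusivity_coefficient; the real ClimaAtmos
# function has a different body. The pressure argument is unused in this toy.
eddy_coeff(C_E, speed, Δz, p) = C_E * speed * Δz

FT = Float32
C_E = FT(0.0044)
u = [FT[3, 4], FT[6, 8]]      # one horizontal velocity vector per column
ρ = FT[1.2, 1.1]
Δz = FT[30, 30]
p = FT[1.0e5, 9.9e4]

# Fused form: norm is called elementwise inside the broadcast.
fused = @. ρ * eddy_coeff(C_E, norm(u), Δz, p)

# Hoisted form: norm is evaluated first, then the broadcast uses the result.
speed = norm.(u)
hoisted = @. ρ * eddy_coeff(C_E, speed, Δz, p)

println(fused == hoisted)
```

The two forms compute the same values; hoisting only changes what ends up inside the fused kernel, at the cost of one temporary array.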

@charleskawczynski

I think #2190 fixes the issue with

    @. ᶠρK_E[colidx] =
        ᶠinterp(Y.c.ρ[colidx]) * eddy_diffusivity_coefficient(
            C_E,
            norm(interior_uₕ[colidx]),
            interior_coordinates.z[colidx] - z_surface,
            ᶠp[colidx],
        )

above. The next breaking change, seen in build https://buildkite.com/clima/climaatmos-ci/builds/13566#018afcbf-a78e-49da-b017-18d0c705c4fa is:

    ᶜdivᵥ_ρe_tot = Operators.DivergenceF2C(
        top = Operators.SetValue(C3(FT(0))),
        bottom = Operators.SetValue(sfc_conditions.ρ_flux_h_tot[colidx]),
    )
    @. Yₜ.c.ρe_tot[colidx] -=
        ᶜdivᵥ_ρe_tot(-(ᶠρK_E[colidx] * ᶠgradᵥ(ᶜh_tot[colidx])))

and the error is:

ERROR: LoadError: GPU compilation of MethodInstance for copyto_stencil_kernel!(::Field{} (trunc disp), ::Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rsub), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, DivergenceF2C{NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}}}, Tuple{Broadcasted{CUDAColumnStencilStyle, FacePlaceholderSpace, typeof(rsub), Tuple{Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rmul), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, GradientC2F{NamedTuple{(), Tuple{}}}, Tuple{Field{} (trunc disp)}, PlaceholderSpace}}}}}}, PlaceholderSpace}}}, ::ExtrudedFiniteDifferenceSpace{CellCenter, SpectralElementSpace2D{Nothing, Quadratures.GLL{4}, SphericalGlobalGeometry{Float32}, IJFH{LocalGeometry{(1, 2), LatLongPoint{Float32}, Float32, SMatrix{2, 2, Float32, 4}}, 4, CuDeviceArray{Float32, 4, 1}}, IJFH{Float32, 4, CuDeviceArray{Float32, 4, 1}}, IFH{SurfaceGeometry{Float32, UVVector{Float32}}, 4, CuDeviceArray{Float32, 3, 1}}, NamedTuple{(), Tuple{}}}, Topologies.DeviceIntervalTopology{NamedTuple{(:bottom, :top), Tuple{Int64, Int64}}}, Flat, SphericalGlobalGeometry{Float32}, VIJFH{LocalGeometry{(1, 2, 3), LatLongZPoint{Float32}, Float32, SMatrix{3, 3, Float32, 9}}, 4, CuDeviceArray{Float32, 5, 1}}, VIJFH{LocalGeometry{(1, 2, 3), LatLongZPoint{Float32}, Float32, SMatrix{3, 3, Float32, 9}}, 4, CuDeviceArray{Float32, 5, 1}}}, ::NTuple{4, Int64}, ::Int64, ::Int64, ::Int64) failed
KernelError: passing and using non-bitstype argument
 
Argument 3 to your kernel function is of type Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rsub), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, DivergenceF2C{NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}}}, Tuple{Broadcasted{CUDAColumnStencilStyle, FacePlaceholderSpace, typeof(rsub), Tuple{Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rmul), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, GradientC2F{NamedTuple{(), Tuple{}}}, Tuple{Field{} (trunc disp)}, PlaceholderSpace}}}}}}, PlaceholderSpace}}}, which is not isbits:
  .args is of type Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, DivergenceF2C{NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}}}, Tuple{Broadcasted{CUDAColumnStencilStyle, FacePlaceholderSpace, typeof(rsub), Tuple{Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rmul), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, GradientC2F{NamedTuple{(), Tuple{}}}, Tuple{Field{} (trunc disp)}, PlaceholderSpace}}}}}}, PlaceholderSpace}} which is not isbits.
    .2 is of type StencilBroadcasted{CUDAColumnStencilStyle, DivergenceF2C{NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}}}, Tuple{Broadcasted{CUDAColumnStencilStyle, FacePlaceholderSpace, typeof(rsub), Tuple{Broadcasted{CUDAColumnStencilStyle, PlaceholderSpace, typeof(rmul), Tuple{Field{} (trunc disp), StencilBroadcasted{CUDAColumnStencilStyle, GradientC2F{NamedTuple{(), Tuple{}}}, Tuple{Field{} (trunc disp)}, PlaceholderSpace}}}}}}, PlaceholderSpace} which is not isbits.
      .op is of type DivergenceF2C{NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}}} which is not isbits.
        .bcs is of type NamedTuple{(:top, :bottom), Tuple{SetValue{Covariant3Vector{Float32}}, SetValue{Field{} (trunc disp)}}} which is not isbits.
          .bottom is of type SetValue{Field{} (trunc disp)} which is not isbits.
            .val is of type Field{} (trunc disp) which is not isbits.
              .values is of type IJFH{Covariant3Vector{Float32}, 4, SubArray{Float32, 4, CuArray{Float32, 4, Mem.DeviceBuffer}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false}} which is not isbits.
                .array is of type SubArray{Float32, 4, CuArray{Float32, 4, Mem.DeviceBuffer}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}, Base.Slice{Base.OneTo{Int64}}}, false} which is not isbits.
                  .parent is of type CuArray{Float32, 4, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
              .space is of type SpectralElementSpace2D{Topologies.Topology2D{ClimaComms.SingletonCommsContext{ClimaComms.CUDADevice}, Meshes.EquiangularCubedSphere{Domains.SphereDomain{Float32}, Meshes.NormalizedBilinearMap}, Vector{CartesianIndex{3}}, Array{Int64, 3}, CuArray{Tuple{Int64, Int64, Int64, Int64, Bool}, 1, Mem.DeviceBuffer}, Vector{Tuple{Int64, Int64, Int64, Int64, Bool}}, CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer}, CuArray{Int64, 1, Mem.DeviceBuffer}, CuArray{Tuple{Bool, Int64, Int64}, 1, Mem.DeviceBuffer}, CuArray{Int64, 1, Mem.DeviceBuffer}, NamedTuple{(), Tuple{}}, CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer}}, Quadratures.GLL{4}, SphericalGlobalGeometry{Float32}, IJFH{LocalGeometry{(1, 2, 3), LatLongZPoint{Float32}, Float32, SMatrix{3, 3, Float32, 9}}, 4, SubArray{Float32, 4, CuArray{Float32, 5, Mem.DeviceBuffer}, Tuple{Int64, Vararg{Base.Slice{Base.OneTo{Int64}}, 4}}, true}}, IJFH{Float32, 4, CuArray{Float32, 4, Mem.DeviceBuffer}}, IFH{SurfaceGeometry{Float32, UVVector{Float32}}, 4, CuArray{Float32, 3, Mem.DeviceBuffer}}, NamedTuple{(), Tuple{}}} which is not isbits.
                .topology is of type Topologies.Topology2D{ClimaComms.SingletonCommsContext{ClimaComms.CUDADevice}, Meshes.EquiangularCubedSphere{Domains.SphereDomain{Float32}, Meshes.NormalizedBilinearMap}, Vector{CartesianIndex{3}}, Array{Int64, 3}, CuArray{Tuple{Int64, Int64, Int64, Int64, Bool}, 1, Mem.DeviceBuffer}, Vector{Tuple{Int64, Int64, Int64, Int64, Bool}}, CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer}, CuArray{Int64, 1, Mem.DeviceBuffer}, CuArray{Tuple{Bool, Int64, Int64}, 1, Mem.DeviceBuffer}, CuArray{Int64, 1, Mem.DeviceBuffer}, NamedTuple{(), Tuple{}}, CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer}} which is not isbits.
                  .elemorder is of type Vector{CartesianIndex{3}} which is not isbits.
                  .orderindex is of type Array{Int64, 3} which is not isbits.
                  .elempid is of type Vector{Int64} which is not isbits.
                  .neighbor_pids is of type Vector{Int64} which is not isbits.
                  .send_elem_lidx is of type Vector{Int64} which is not isbits.
                  .send_elem_lengths is of type Vector{Int64} which is not isbits.
                  .recv_elem_gidx is of type Vector{Int64} which is not isbits.
                  .recv_elem_lengths is of type Vector{Int64} which is not isbits.
                  .interior_faces is of type CuArray{Tuple{Int64, Int64, Int64, Int64, Bool}, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .ghost_faces is of type Vector{Tuple{Int64, Int64, Int64, Int64, Bool}} which is not isbits.
                  .local_vertices is of type CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .local_vertex_offset is of type CuArray{Int64, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .ghost_vertices is of type CuArray{Tuple{Bool, Int64, Int64}, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .ghost_vertex_offset is of type CuArray{Int64, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .local_neighbor_elem is of type Vector{Int64} which is not isbits.
                  .local_neighbor_elem_offset is of type Vector{Int64} which is not isbits.
                  .ghost_neighbor_elem is of type Vector{Int64} which is not isbits.
                  .ghost_neighbor_elem_offset is of type Vector{Int64} which is not isbits.
                  .internal_elems is of type Vector{Int64} which is not isbits.
                  .perimeter_elems is of type Vector{Int64} which is not isbits.
                  .ghost_vertex_gcidx is of type Vector{Int64} which is not isbits.
                  .ghost_face_gcidx is of type Vector{Int64} which is not isbits.
                  .comm_vertex_lengths is of type Vector{Int64} which is not isbits.
                  .comm_face_lengths is of type Vector{Int64} which is not isbits.
                  .ghost_vertex_neighbor_loc is of type Vector{Int64} which is not isbits.
                  .ghost_vertex_comm_idx_offset is of type Vector{Int64} which is not isbits.
                  .repr_ghost_vertex is of type CuArray{Tuple{Int64, Int64}, 1, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                  .ghost_face_neighbor_loc is of type Vector{Int64} which is not isbits.
                .local_geometry is of type IJFH{LocalGeometry{(1, 2, 3), LatLongZPoint{Float32}, Float32, SMatrix{3, 3, Float32, 9}}, 4, SubArray{Float32, 4, CuArray{Float32, 5, Mem.DeviceBuffer}, Tuple{Int64, Vararg{Base.Slice{Base.OneTo{Int64}}, 4}}, true}} which is not isbits.
                  .array is of type SubArray{Float32, 4, CuArray{Float32, 5, Mem.DeviceBuffer}, Tuple{Int64, Vararg{Base.Slice{Base.OneTo{Int64}}, 4}}, true} which is not isbits.
                    .parent is of type CuArray{Float32, 5, Mem.DeviceBuffer} which is not isbits.
                .ghost_geometry is of type IJFH{LocalGeometry{(1, 2, 3), LatLongZPoint{Float32}, Float32, SMatrix{3, 3, Float32, 9}}, 4, SubArray{Float32, 4, CuArray{Float32, 5, Mem.DeviceBuffer}, Tuple{Int64, Vararg{Base.Slice{Base.OneTo{Int64}}, 4}}, true}} which is not isbits.
                  .array is of type SubArray{Float32, 4, CuArray{Float32, 5, Mem.DeviceBuffer}, Tuple{Int64, Vararg{Base.Slice{Base.OneTo{Int64}}, 4}}, true} which is not isbits.
                    .parent is of type CuArray{Float32, 5, Mem.DeviceBuffer} which is not isbits.
                .local_dss_weights is of type IJFH{Float32, 4, CuArray{Float32, 4, Mem.DeviceBuffer}} which is not isbits.
                  .array is of type CuArray{Float32, 4, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                .ghost_dss_weights is of type IJFH{Float32, 4, CuArray{Float32, 4, Mem.DeviceBuffer}} which is not isbits.
                  .array is of type CuArray{Float32, 4, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
                .internal_surface_geometry is of type IFH{SurfaceGeometry{Float32, UVVector{Float32}}, 4, CuArray{Float32, 3, Mem.DeviceBuffer}} which is not isbits.
                  .array is of type CuArray{Float32, 3, Mem.DeviceBuffer} which is not isbits.
                    .storage is of type Union{Nothing, ArrayStorage{Mem.DeviceBuffer}} which is not isbits.
 
Stacktrace:
  [1] check_invocation(job::GPUCompiler.CompilerJob)
...
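
The trace above bottoms out at the `.bottom` boundary condition, whose `.val` is a field backed by a host-side `CuArray`. The isbits rule itself can be reproduced without a GPU. A minimal sketch in plain Julia; `SetValueLike` is a hypothetical stand-in for a wrapper such as `Operators.SetValue`, and a `Vector` stands in for the `CuArray`.

```julia
# Hypothetical stand-in for a boundary-condition wrapper such as
# Operators.SetValue; it only stores a value.
struct SetValueLike{T}
    val::T
end

# A scalar payload is plain bits, so the wrapper is isbits.
scalar_bc = SetValueLike(0.0f0)

# A heap-allocated array payload (a Vector here, standing in for a host-side
# CuArray) makes the wrapper, and anything that contains it, non-isbits.
array_bc = SetValueLike(Float32[0, 0, 0])

# Mirrors the (top, bottom) boundary-condition tuple in the error message.
bcs = (; top = scalar_bc, bottom = array_bc)

println(isbits(scalar_bc))
println(isbits(array_bc))
println(isbits(bcs))
```

CUDA.jl converts kernel arguments to their device-side form through Adapt.jl at launch. The space argument in the trace already shows `CuDeviceArray`, while the field inside the boundary condition still shows `CuArray`, which suggests that the value stored in the operator was not reached by that conversion.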

bors bot added a commit that referenced this issue Oct 5, 2023
2190: Try to fix moist gpu example r=charleskawczynski a=charleskawczynski

This PR fixes the next breaking piece, blocking us from #1980. Closes #2191.

Co-authored-by: Charles Kawczynski <kawczynski.charles@gmail.com>
szy21 changed the title from "Make ClimaAtmos GPU compatible" to "O1.2.1 Make ClimaAtmos GPU compatible" on Oct 13, 2023