GPU DYAMOND run breaks down in RRTMGP Interface callbacks #2314
Comments
cc: @simonbyrne, @charleskawczynski, @tapios |
Looks like the error is that the interpolation objects used by Insolation.jl are not isbits. I had a quick look at Insolation.jl, and it seems that some work might be required to make it GPU compatible. This might be a project of its own. |
Actually, it might just need to be refactored a bit. The function we're calling is |
Yep, this is kind of like the interpolation functionality. We can probably write an easy workaround. |
Is there a reason we need an inner constructor here? https://github.com/CliMA/Insolation.jl/blob/f2ba81ee1e25348951f759e022145ecc8aa20c85/src/Insolation.jl#L27 |
The calculation of zenith angle involves two separate pieces.
If we take orbital parameters to be fixed (which is what we should do), no interpolations are required. The calculation in 2) uses auxiliary angles that appear as arguments in trigonometric functions:
These are all simple trig function evaluations. It's probably easiest to re-evaluate hour angle and declination angle every time they are needed (rather than doing global calculations and transferring data). In short, just call insolation with fixed orbital parameters, and the rest is a pointwise function evaluation that should be straightforward on GPUs. |
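A minimal sketch of the pointwise evaluation described above, written in Python for illustration only. It is not the Insolation.jl API: the function names are hypothetical, the equation of time is ignored in the hour angle, and the declination is taken from an assumed true solar longitude and a fixed obliquity. The only formula relied on is the standard relation cos(zenith) = sin(lat) sin(decl) + cos(lat) cos(decl) cos(hour angle).

```python
import math

SECONDS_PER_DAY = 86400.0


def hour_angle(utc_seconds, lon_rad):
    """Hour angle in radians, zero at local solar noon.

    Assumption: the equation of time is ignored, so solar noon at
    longitude 0 falls exactly at 12:00 UTC.
    """
    return 2.0 * math.pi * utc_seconds / SECONDS_PER_DAY + lon_rad - math.pi


def declination(true_longitude_rad, obliquity_rad):
    """Solar declination from the true solar longitude and a fixed obliquity."""
    return math.asin(math.sin(obliquity_rad) * math.sin(true_longitude_rad))


def cos_zenith_angle(lat_rad, declination_rad, hour_angle_rad):
    """Cosine of the solar zenith angle at a single point (no global state)."""
    return (
        math.sin(lat_rad) * math.sin(declination_rad)
        + math.cos(lat_rad) * math.cos(declination_rad) * math.cos(hour_angle_rad)
    )


# Example: equator at longitude 0, 12:00 UTC, sun over the equator.
eta = hour_angle(43200.0, 0.0)
delta = declination(0.0, math.radians(23.44))
mu = cos_zenith_angle(0.0, delta, eta)
```

Each function depends only on scalar inputs at one point, so the same arithmetic can be evaluated independently per column inside a GPU kernel, with the orbital parameters passed in as plain numbers rather than interpolation objects.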
I tested |
It runs to completion on the GPU pipeline here: https://buildkite.com/clima/climaatmos-target-gpu-simulations/builds/138#018bad4c-6dfa-42fa-89fc-48292ca85d8b |
@akshaysridhar Could you check if the TOA shortwave flux in the above pipeline looks correct (it only has hdf5 files)? Thanks! |
Build ID: GPU-pipeline 138 |
Hm, this is running at less than 1 sypd: @sriharshakandala, @simonbyrne, have you seen this error before? My Nsight Systems installation is up to date, but the error does seem version related. Do I need to downgrade for some reason? |
Yes. We noticed this before. This had to do with the latest |
Looks good, thanks @akshaysridhar! |
Easiest fix is to |
This is apparently not fixed in the latest update. See build here: https://buildkite.com/clima/climaatmos-ci/builds/16188#018d46aa-43db-4dcb-a2d6-9a20c87d6cf9/163-244 |
This was fixed in a recent RRTMGP update, and the test is now strictly enforced.
Currently, the GPU aquaplanet DYAMOND configuration simulation is breaking down in RRTMGP callbacks.
The full error message on A100s can be found here: https://buildkite.com/clima/climaatmos-target-gpu-simulations/builds/125#018b7207-7045-43ab-b6d7-d51001a69e74/138-223
The error seems to be happening at a call to the instantaneous_zenith_angle function here: https://github.com/CliMA/ClimaAtmos.jl/blob/main/src/callbacks/callbacks.jl#L118