-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Float32
to Float64
conversions in RRTMGP Interface
#2635
Comments
Unfortunately, a bunch of simulations are not stable at Float32: https://buildkite.com/clima/climaatmos-ci/builds/16548. |
The problem seems to be with saturation adjustment! Is it possible to tune the tolerances/maxiter for single precision? It may not make sense to use higher precision just for the radiation component. cc: @akshaysridhar , @szy21 |
The real problem is that the simulations are not stable, not saturation adjustment (the input to SA is wrong). If you are sure the single precision change is correct, you can try reducing dt for the unstable jobs. |
Our tests (e.g. https://buildkite.com/clima/rrtmgp-ci/builds/562#018d7f62-627e-4bb4-b9f9-107f08405aa5 contains the latest build) indicate that single precision runs generate results that are pretty close to the results from double precision runs, both in relative and absolute terms! |
Ok, try reducing dt a bit then. I would be surprised if some small change in radiation results in instability though, so let's double check if you need to reduce dt a lot to make it stable. |
I noticed that too. It's a bit curious. But, as you noted, thermodynamics is already running / working in Float32, so I'm wondering if we're somehow exercising RRTMGP in ways that are not exercised in RRTMGP.jl's test suite. |
Actually, (It would be good to catch this without plotting) |
We recently noticed in the coupler that after the CA v0.20 our coarse simulations are less stable as well, especially those with allsky radiation (we thought it was a function in TD that changed, but then tested that wasn't it). |
That is probably because we change from 0 or 1 cloud fraction to partial cloud fraction. |
Why would this lead to less stable sims though? 🤔 |
Not really, just this will result in significant change in the results:) You can test it by setting |
Actually this reminds me that I should change the |
|
Nothing else should have physical changes, but the recent refactor of rrtmgp results in numerical changes (all atmos ci and longrun passed though). (Also this issue/PR should not affect you, as we are still using Float64 for radiation in the latest atmos release, in case that's confusing) |
We haven't tested coarse resolution amip run for more than 1 day though, as that's not our focus, so it could be the atmos one is not stable as well. |
It seems there are some edge cases in SW fluxes that are wrong with Float32. When I run
So something is clearly wrong. Maybe it has something to do with some threshold in RRTMGP? @sriharshakandala any idea? |
Maybe @tapios knows if any threshold in RRTMGP could lead to errors in SW fluxes? The problem seems to happen when cos_zenith_angle is very small. In ClimaAtmos we limit the zenith angle by |
Ok, I don't think it has anything to do with clear-sky or all-sky. When I run single column clear-sky locally with time-varying insolation with Float32 it breaks too. It seems to me it has something to do with errors in SW fluxes when zenith angle is large. Float32 leads to very wrong SW fluxes. (When zenith angle is larger than 90 degree, cos_zenith_angle is It's not only the zenith angle though. When we start with large zenith angle, the simulation doesn't break immediately, so it has something to do with the atmospheric state as well. |
Don't we have some catch that when Can you please point me to the relevant pieces of code? |
We should add a conditional here to make sure all fluxes are zero when I am concerned that with our current This may not be the only problem, but it could be one problem. Also, the limiting of the zenith angle should not live in callbacks in the atmosphere model, but in RRTMG. Basically, we should avoid the entire loop over spectral bands when |
@tapios Thank you! I did a quick hack in ClimaAtmos to set all SW fluxes to zero when zenith angle is large and that seems to solve the problem: see 9679498 (The hard-coded numbers are to match our current -eps threshold). I haven't figured out which lines in RRTMGP cause the issue ( |
Also this doesn't show up in RRTMGP test cases. I think we have test cases with large zenith angle, but maybe the problem is caused by some combination of optical depth and zenith angle. Is there a way to test it in RRTMGP so we won't have similar issues in the future? Maybe we can add these edge cases we found in climaatmos to rrtmgp test cases? |
In |
It might be this line. I wonder besides setting SW fluxes to zero, should we add a minimum value for the denominator |
I did some more tests on this. With Float32 there are issues even with |
I'm fine with having |
It is dependent on SW optical depth, as for the same zenith angle sometimes it breaks and sometimes it doesn't. I will see if I can try modifying the SW optical properties. |
We found that it's because this line can be negative with both Float32 and Float64, but it's small enough and didn't cause issues with Float64. Limiting it to zero works fine in standalone RRTMGP cases. We will test ClimaAtmos with this fix in RRTMGP. |
Thanks @szy21 . I just added this change in PR CliMA/RRTMGP.jl#448 |
Remove
Float32
toFloat64
conversions and back inRRTMGP Interface
This lets us utilize faster
RRTMGP
single precision calculations forFloat32
runs.Preliminary
RRTMGP.jl
timing tests indicate about 27% reduction in time vs theFloat64
runs.There could be additional savings from eliminating conversions
The text was updated successfully, but these errors were encountered: