-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Removes binary operation from Smagorinsky diffusivity calculation #2904
Conversation
There's no need for a new kernel because the diffusivities are We need to extend @inline κᶠᶜᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑxᶠᵃᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶠᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑyᵃᶠᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶜᶠ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑzᵃᵃᶠ(i, j, k, grid, K.νₑ) / closure.Pr[id] |
I may be missing something, but when I make those changes I still get the errors in #2869. The changes in this PR are the only ones I've tried that seem to solve the issue. I've implemented and pushed your solution to the tc/smag-binary-op2 branch in case you wanna check if I understood your suggestion. Basically these are the changes: Oceananigans.jl/src/TurbulenceClosures/turbulence_closure_implementations/smagorinsky_lilly.jl Lines 127 to 129 in 56a76ae
I'm getting a |
On that branch you are still forming the |
Change Oceananigans.jl/src/TurbulenceClosures/turbulence_closure_implementations/smagorinsky_lilly.jl Lines 212 to 224 in 56a76ae
to return (; νₑ) |
Sure, I'll open a new PR soon then! But just to be clear, I totally get that the problem is that I'm still passing the binary operations. I kept them there because (if I understan correctly) if don't pass |
The change I suggested will still put the eddy viscosity into diffusivity_fields, and removes the eddy diffusivities (calculating them on the fly from Pr and eddy viscosity) |
You can also redefine the user-facing function diffusivity to return the original BinaryOperation (which hopefully will not enter into kernels because of the extension of those three “getter” methods) |
Yes! Sorry I wasn't clear enough. I got that. I meant populate it with the diffuvisities specifically, so we're on the same page. |
Sorry for going slow on this, but if I understand correctly what you're proposing is:
As opposed to what I'm doing here which is just to change the calculation of diffusivities in If I understand correctly both methods do the same number of operations (one calculation of So I guess the advantage of what you're proposing is that it saves memory (since diffusivities are calculated on the fly), at the cost of a bit more code complexity (i.e., one more specialization). Conversely, the direction this PR is going atm uses more memory (for the diffusivities) but in my opinion the code is a bit clearer, since there's one fewer specialization (i.e. I'll defer to you either way, but I vote that we take the approach that this PR is currently doing since, as we discussed before in a few PRs, the code in If we follow with this PR, the increase in memory should be around 15% for one tracer and less for more tracers, so relatively small, plus the memory limitation is about to become less severe since hopefully #2795 will be merged soon? @glwagner like I said I'll defer to you either way. So if you prefer your original suggestion I'll close this PR and open another with your suggested modifications :) |
I think extending |
I also don't like |
The memory savings is a major advantage of this closure over Note also that there is overhead to launching a kernel which cannot be ignored --- we can't estimate computational cost just by adding the number of operations. Typically (though not always) our goal is to reduce the number of kernel launches as much as possible. |
Good points. I also wasn't aware that the kernel launch time was an important issue. I'll close this PR and open another one extending |
This PR changes the calculation of the diffusivities in the
SmagorinskyLilly
so that binary operations are avoided. At the moment it does so by creating acalc_nonlinear_κᶜᶜᶜ()
kernel, similar to what is done for the AMD closure.Closes #2869