Removes binary operation from Smagorinsky diffusivity calculation #2904

tomchor · 2023-02-06T12:41:24Z

This PR changes the calculation of the diffusivities in the SmagorinskyLilly so that binary operations are avoided. At the moment it does so by creating a calc_nonlinear_κᶜᶜᶜ() kernel, similar to what is done for the AMD closure.

Closes #2869

glwagner · 2023-02-06T15:10:53Z

There's no need for a new kernel because the diffusivities are νₑ / Pr so they can be calculated on the fly.

We need to extend κᶠᶜᶜ and κᶜᶠᶜ and κᶜᶜᶠ

@inline κᶠᶜᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑxᶠᵃᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶠᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑyᵃᶠᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶜᶠ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑzᵃᵃᶠ(i, j, k, grid, K.νₑ) / closure.Pr[id]

tomchor · 2023-02-06T17:25:30Z

There's no need for a new kernel because the diffusivities are νₑ / Pr so they can be calculated on the fly.

We need to extend κᶠᶜᶜ and κᶜᶠᶜ and κᶜᶜᶠ

@inline κᶠᶜᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑxᶠᵃᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶠᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑyᵃᶠᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id]
@inline κᶜᶜᶠ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) = ℑzᵃᵃᶠ(i, j, k, grid, K.νₑ) / closure.Pr[id]

I may be missing something, but when I make those changes I still get the errors in #2869. The changes in this PR are the only ones I've tried that seem to solve the issue.

I've implemented and pushed your solution to the tc/smag-binary-op2 branch in case you wanna check if I understood your suggestion. Basically these are the changes:

Oceananigans.jl/src/TurbulenceClosures/turbulence_closure_implementations/smagorinsky_lilly.jl

Lines 127 to 129 in 56a76ae

    
           @inline κᶠᶜᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) where id = ℑxᶠᵃᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id] 
        
           @inline κᶜᶠᶜ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) where id = ℑyᵃᶠᵃ(i, j, k, grid, K.νₑ) / closure.Pr[id] 
        
           @inline κᶜᶜᶠ(i, j, k, grid, closure::SmagorinskyLilly, K, ::Val{id}, args...) where id = ℑzᵃᵃᶠ(i, j, k, grid, K.νₑ) / closure.Pr[id]

I'm getting a device kernel image is invalid (code 200, ERROR_INVALID_IMAGE) error with only those changes.

glwagner · 2023-02-06T17:49:28Z

On that branch you are still forming the BinaryOperation which is the problem. You also need to remove the BinaryOperation. If you open a PR we can work on it?

glwagner · 2023-02-06T17:50:29Z

Change

Oceananigans.jl/src/TurbulenceClosures/turbulence_closure_implementations/smagorinsky_lilly.jl

Lines 212 to 224 in 56a76ae

    
           # Use AbstractOperations to write eddy diffusivities in terms of 
        
           # eddy viscosity 
        
           κₑ_ops = [] 
        
           for i = 1:length(tracer_names) 
        
               Pr = closure.Pr[i] 
        
               κₑ_op = νₑ / Pr 
        
               push!(κₑ_ops, κₑ_op) 
        
           end 
        
           κₑ = NamedTuple{tracer_names}(Tuple(κₑ_ops)) 
        
           return (; νₑ, κₑ)

to

return (; νₑ)

tomchor · 2023-02-06T18:06:54Z

On that branch you are still forming the BinaryOperation which is the problem. You also need to remove the BinaryOperation. If you open a PR we can work on it?

Sure, I'll open a new PR soon then!

But just to be clear, I totally get that the problem is that I'm still passing the binary operations. I kept them there because (if I understan correctly) if don't pass $\kappa_e$ in DiffusivityFields() then model.diffusivity_fields won't get populated, right? Or is your point that we don't have to populate model.diffusivity_fields for Smagorinsky?

glwagner · 2023-02-06T18:59:46Z

The change I suggested will still put the eddy viscosity into diffusivity_fields, and removes the eddy diffusivities (calculating them on the fly from Pr and eddy viscosity)

glwagner · 2023-02-06T19:07:24Z

You can also redefine the user-facing function diffusivity to return the original BinaryOperation (which hopefully will not enter into kernels because of the extension of those three “getter” methods)

tomchor · 2023-02-06T20:39:48Z

The change I suggested will still put the eddy viscosity into diffusivity_fields, and removes the eddy diffusivities (calculating them on the fly from Pr and eddy viscosity)

Yes! Sorry I wasn't clear enough. I got that. I meant populate it with the diffuvisities specifically, so we're on the same page.

tomchor · 2023-02-06T21:09:57Z

Sorry for going slow on this, but if I understand correctly what you're proposing is:

Remove the calculation of diffusivities from DiffusivityFields() (and thus remove tracer diffusivities from model.diffusivity_fields)
Specialize κᶠᶜᶜ() functions for SmagoriknskyLilly so that diffusivities are calculated on the fly

As opposed to what I'm doing here which is just to change the calculation of diffusivities in DiffusivityFields() from using a BinaryOperation to a kernel (keeping them in model.diffusivity_fields).

If I understand correctly both methods do the same number of operations (one calculation of $\nu_e$, a division by Pr and one interpolation for each face of each grid cell), no?

So I guess the advantage of what you're proposing is that it saves memory (since diffusivities are calculated on the fly), at the cost of a bit more code complexity (i.e., one more specialization).

Conversely, the direction this PR is going atm uses more memory (for the diffusivities) but in my opinion the code is a bit clearer, since there's one fewer specialization (i.e. κᶠᶜᶜ() remains the same) and the code in smagorinsky_lilly.jl looks more like the code in anisotropic_minimum_dissipation.jl, which makes things more standardized.

I'll defer to you either way, but I vote that we take the approach that this PR is currently doing since, as we discussed before in a few PRs, the code in TurbulenceClosures is already a bit on the complex side and not super easy to understand. So I think the standardization of having smagorinsky_lilly.jl have the same structure as anisotropic_minimum_dissipation.jl I'd argue is a benefit.

If we follow with this PR, the increase in memory should be around 15% for one tracer and less for more tracers, so relatively small, plus the memory limitation is about to become less severe since hopefully #2795 will be merged soon?

@glwagner like I said I'll defer to you either way. So if you prefer your original suggestion I'll close this PR and open another with your suggested modifications :)

glwagner · 2023-02-06T21:25:08Z

I think extending κᶠᶜᶜ is simpler --- less code and less memory usage.

glwagner · 2023-02-06T21:26:40Z

I also don't like calc_nonlinear_κᶜᶜᶜ. I hope we can come up with a better interface for defining LES closures in the future.

glwagner · 2023-02-06T21:34:15Z

The memory savings is a major advantage of this closure over AnisotropicMinimumDissipation for problems with very large numbers of tracers (eg biogeochemistry problems with 10+ tracers).

Note also that there is overhead to launching a kernel which cannot be ignored --- we can't estimate computational cost just by adding the number of operations. Typically (though not always) our goal is to reduce the number of kernel launches as much as possible.

tomchor · 2023-02-06T22:04:16Z

The memory savings is a major advantage of this closure over AnisotropicMinimumDissipation for problems with very large numbers of tracers (eg biogeochemistry problems with 10+ tracers).

Note also that there is overhead to launching a kernel which cannot be ignored --- we can't estimate computational cost just by adding the number of operations. Typically (though not always) our goal is to reduce the number of kernel launches as much as possible.

Good points. I also wasn't aware that the kernel launch time was an important issue. I'll close this PR and open another one extending κᶠᶜᶜ().

calculate smag diffusivities in a kernel

53a421a

tomchor closed this Feb 6, 2023

tomchor deleted the tc/smag-binary-op branch February 6, 2023 22:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Removes binary operation from Smagorinsky diffusivity calculation #2904

Removes binary operation from Smagorinsky diffusivity calculation #2904

tomchor commented Feb 6, 2023

glwagner commented Feb 6, 2023 •

edited

Loading

tomchor commented Feb 6, 2023 •

edited

Loading

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

tomchor commented Feb 6, 2023 •

edited

Loading

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

tomchor commented Feb 6, 2023

tomchor commented Feb 6, 2023

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023 •

edited

Loading

tomchor commented Feb 6, 2023

Removes binary operation from Smagorinsky diffusivity calculation #2904

Removes binary operation from Smagorinsky diffusivity calculation #2904

Conversation

tomchor commented Feb 6, 2023

glwagner commented Feb 6, 2023 • edited Loading

tomchor commented Feb 6, 2023 • edited Loading

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

tomchor commented Feb 6, 2023 • edited Loading

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

tomchor commented Feb 6, 2023

tomchor commented Feb 6, 2023

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023

glwagner commented Feb 6, 2023 • edited Loading

tomchor commented Feb 6, 2023

glwagner commented Feb 6, 2023 •

edited

Loading

tomchor commented Feb 6, 2023 •

edited

Loading

tomchor commented Feb 6, 2023 •

edited

Loading

glwagner commented Feb 6, 2023 •

edited

Loading