Skip to content
This repository has been archived by the owner on Jul 19, 2023. It is now read-only.

Unexpectedly many allocations for in place operations #348

Open
atiyo opened this issue Mar 24, 2021 · 4 comments
Open

Unexpectedly many allocations for in place operations #348

atiyo opened this issue Mar 24, 2021 · 4 comments

Comments

@atiyo
Copy link

atiyo commented Mar 24, 2021

Firstly, thank you. I'm really enjoying the SciML ecosystem.

I'm noticing an unexpectedly high number of allocations when applying operators in place. Here's a minimal example:

d2x = CenteredDifference{1}(2, 2, 1.0, 5)
bcx = Neumann0BC(1.0, 1)
op = d2x * bcx
u = zeros(5,5)
u[3,3] = 1.0
@btime du = op * u # 99.501 μs (1155 allocations: 95.17 KiB)
du = zeros(5,5)
@btime mul!(du, op, u) # 97.665 μs (1154 allocations: 94.89 KiB)

I was expecting a bigger reduction in allocations during the in-place application.

@ChrisRackauckas
Copy link
Member

yes, that looks bad.

@hurricane007
Copy link

hurricane007 commented Jul 7, 2021

I confirm this problem is still there.
And it's very strange that it takes such long time for such small, 5 * 5, matrix. And surprisingly, when you do this operation on only one column, the difference is so dramatic. I used a 100*100 matrix.

julia> @btime du = op * u[:,1];
  733.333 ns (5 allocations: 1.95 KiB)

julia> @btime du = op * u;
  23.092 ms (481702 allocations: 31.38 MiB)

there is one allocation can be reduced by:

julia> dux = @view du[:,1]
julia> @btime mul!(dux, op, u[:,1])
  721.519 ns (4 allocations: 1.08 KiB)

Then I define

function f1!(du, u)
    m,n = size(u)
    @assert (m, n) == size(du)
    for i = 1:n
        du[:, i] = op * u[:,i]
    end
end

This is much faster than do it directly.

julia> @benchmark f1!(du, u)
BenchmarkTools.Trial:
  memory estimate:  195.31 KiB
  allocs estimate:  500
  --------------
  minimum time:     63.400 μs (0.00% GC)
  median time:      137.200 μs (0.00% GC)
  mean time:        1.051 ms (87.11% GC)
  maximum time:     680.927 ms (99.98% GC)
  --------------
  samples:          5230
  evals/sample:     1

julia> @benchmark mul!(du, op, u)
BenchmarkTools.Trial:
  memory estimate:  31.30 MiB
  allocs estimate:  481700
  --------------
  minimum time:     22.251 ms (0.00% GC)
  median time:      28.158 ms (0.00% GC)
  mean time:        33.626 ms (17.35% GC)
  maximum time:     70.771 ms (0.00% GC)
  --------------
  samples:          149
  evals/sample:     1

But it seems not possible to define mul!(du[:,i], op, u[:,i]) inside the function. So i have to define something like :

function f2!(du, u)
    m,n = size(u)
    @assert (m, n) == size(du)
    dui = zeros(m)
    for i = 1:n
        mul!(dui, op, u[:,i])
        @. du[:, i] = dui
    end
end

and results in some improvement:

julia> @benchmark f2!(du, u)
BenchmarkTools.Trial:
  memory estimate:  107.12 KiB
  allocs estimate:  301
  --------------
  minimum time:     48.500 μs (0.00% GC)
  median time:      80.600 μs (0.00% GC)
  mean time:        440.562 μs (81.42% GC)
  maximum time:     421.137 ms (99.97% GC)
  --------------
  samples:          10000
  evals/sample:     1

@ChrisRackauckas
Copy link
Member

It may be a threading performance issue. We may need to use @turbo or whatnot.

@hurricane007
Copy link

It may be a threading performance issue. We may need to use @turbo or whatnot.

Hi Chris, did you notice the difference in allocations between operating the whole matrix or only one column?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants