
added softmax!. reduced softmax's memory usage. #247

Merged: 7 commits merged into FluxML:master from the inplace_softmax branch on Dec 21, 2020
Conversation

norci (Contributor) commented Dec 16, 2020

  1. What about adding an in-place version of softmax, with a dims parameter? (See the usage sketch below.)
  2. softmax is updated too; this reduces memory usage by 40%.

Note:
The memory usage of softmax! is the same as that of the old softmax.
The performance does not change.
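For illustration, the proposed API could be used roughly like this (a sketch based on the signatures discussed in this PR, not an excerpt from the merged code):

using NNlib

x = rand(Float32, 1000, 1000)
y = similar(x)
softmax!(y, x; dims = 1)   # write the result into a preallocated buffer
softmax!(x; dims = 1)      # or overwrite x in place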

norci changed the title from "added a inplace softmax, with dims support." to "added softmax!. reduced softmax's memory usage." Dec 16, 2020
CarloLucibello (Member)
I think both changes are OK.
Since you are at it, would you mind also taking care of #249?
To make it easier, we could probably replace all those explicit loops with functional/vectorized code,
although we have to make sure it won't impact performance.
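For context, the kind of functional/vectorized replacement being suggested for the gradients might look like the sketch below (illustrative only, using the standard softmax pullback formula; the name ∇softmax_vectorized is made up here and is not the code merged for #249):

using NNlib

# With y = softmax(x; dims = dims), the pullback of Δ is
# y .* (Δ .- sum(Δ .* y; dims = dims)), written with broadcasts instead of explicit loops.
function ∇softmax_vectorized(Δ::AbstractArray, x::AbstractArray; dims = 1)
    y = softmax(x; dims = dims)
    return y .* (Δ .- sum(Δ .* y; dims = dims))
end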

1. added dims param in softmax! and similar functions.
fixes #249

2. refactor code.
norci (Contributor, Author) commented Dec 19, 2020

I'm glad to help.

I have updated all the functions and refactored the test cases by removing duplicated code.

Performance has improved a little, and memory usage has been reduced a lot.

benchmark code

using BenchmarkTools, NNlib
x = rand(1000,1000,16);
Δ = rand(1000,1000,16);
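# take the median of 9 single-evaluation samples for each benchmarked expression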
macro mb(expr)
    :(median(@benchmark $expr samples=9 evals=1))
end
@mb softmax(x)
@mb logsoftmax(x)
@mb ∇softmax(Δ, x)
@mb ∇logsoftmax(Δ, x)

benchmark results

call                 version    time         gc time             memory       allocs
softmax(x)           this PR    166.028 ms   0.000 ns (0.00%)    122.31 MiB    9
softmax(x)           original   174.805 ms   3.930 ms (2.25%)    244.39 MiB   11
logsoftmax(x)        this PR    167.900 ms   4.940 ms (2.94%)    244.51 MiB   13
logsoftmax(x)        original   184.529 ms   3.947 ms (2.14%)    244.51 MiB   13
∇softmax(Δ, x)       this PR    213.308 ms   7.133 ms (3.34%)    244.51 MiB   13
∇softmax(Δ, x)       original   228.071 ms   14.638 ms (6.42%)   488.65 MiB   17
∇logsoftmax(Δ, x)    this PR    185.306 ms   0.000 ns (0.00%)    122.44 MiB   11
∇logsoftmax(Δ, x)    original   217.055 ms   11.068 ms (5.10%)   366.58 MiB   15

src/softmax.jl Outdated
max_ = maximum(xs, dims=dims)
exp_ = exp.(xs .- max_)
exp_ ./ sum(exp_, dims=dims)
softmax(x; dims = 1) = softmax!(similar(x), x; dims)
CarloLucibello (Member)
We still support Julia < 1.5, so you will have to pass the keyword value explicitly.
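That is, the shorthand "; dims" in the line above requires Julia 1.5 or newer; on older versions the keyword has to be written out, roughly (a sketch, not the exact merged line):

softmax(x; dims = 1) = softmax!(similar(x), x; dims = dims)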

CarloLucibello (Member) commented Dec 19, 2020

Also, this pattern will error on integer array inputs, which we still want to support, so we should promote to a float element type:

similar(x, float(eltype(x)))
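With that change the output buffer gets a floating-point element type even for integer inputs, roughly (a sketch of the resulting definition, not necessarily the merged code):

softmax(x; dims = 1) = softmax!(similar(x, float(eltype(x))), x; dims = dims)

softmax(Int[0 1; 2 3])   # now returns a Float64 matrix instead of erroring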

norci (Contributor, Author)

Fixed.

src/softmax.jl Outdated
exp_ ./ sum(exp_, dims=dims)
softmax(x; dims = 1) = softmax!(similar(x), x; dims)
softmax!(x; dims = 1) = softmax!(x, x; dims)
function softmax!(out::T, x::T; dims = 1) where {T<:AbstractArray}
CarloLucibello (Member)

We can relax the constraint that out and x must be of the same type.
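For illustration, the relaxed signature might look like the following minimal sketch, reusing the numerically stable formulation from the diff above (an illustration, not the exact merged implementation):

# out and x only need to be AbstractArrays; they no longer have to share a concrete type.
function softmax!(out::AbstractArray, x::AbstractArray; dims = 1)
    out .= exp.(x .- maximum(x; dims = dims))   # subtract the max for numerical stability
    out ./= sum(out; dims = dims)               # normalize along dims
    return out
end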

norci (Contributor, Author)

I have updated the constraint for softmax! and logsoftmax!.
For ∇softmax, I guess Δ and out should be the same type. Right?

I added only one test case for integer input. Is this OK?

@test softmax(Int[0, 0]) == [0.5, 0.5]

CarloLucibello (Member)
Very nice. After addressing the comments, it would be convenient to add some tests with integer inputs.
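For example, tests along these lines could be added (hypothetical illustrations, not necessarily the tests that were merged):

@test softmax(Int[1 2; 3 4]) ≈ softmax(Float64[1 2; 3 4])
@test eltype(logsoftmax(Int[1, 2, 3])) <: AbstractFloat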

CarloLucibello (Member) commented Dec 19, 2020

I just noticed #248. I'm not sure how to handle it, but I thought it was worth mentioning here.

CarloLucibello (Member)
Better to do #248 in a separate PR, where we also bump the minor version.

CarloLucibello (Member)
You could add a test for #249 and we should be ready

norci (Contributor, Author) commented Dec 20, 2020

> You could add a test for #249 and we should be ready

Done.

CarloLucibello merged commit bd63541 into FluxML:master on Dec 21, 2020
norci deleted the inplace_softmax branch on December 21, 2020, 03:52