Adding support for folding RNNs over 3d arrays #1686

Merged
merged 15 commits into from
Sep 14, 2021
44 changes: 44 additions & 0 deletions src/layers/recurrent.jl
@@ -53,6 +53,7 @@ rnn.state = hidden(rnn.cell)
reset!(m::Recur) = (m.state = m.cell.state0)
reset!(m) = foreach(reset!, functor(m)[1])


# TODO remove in v0.13
function Base.getproperty(m::Recur, sym::Symbol)
if sym === :init
@@ -67,6 +68,45 @@ end

flip(f, xs) = reverse(f.(reverse(xs)))


"""
FoldedRecur

We fold over the second dimension as the time dimension, and return all the new hidden states concatenated on the time dimension.

TODO: Figure out how to use CuDNN for RNN, GRU, LSTM.

"""
struct FoldedRecur{C}
    cell::C
end

Flux.@functor FoldedRecur
trainable(a::FoldedRecur) = (a.cell,)

# Currently FoldedRecur is only usable with 3-d arrays (features × time × batch).
function (m::FoldedRecur)(x::AbstractArray{<:Number, 3})
    # step across the temporal dimension (dim 2), starting from the cell's initial state
    h = m.cell.state0
    # LSTM-style cells carry a tuple state (h, c); the per-step output is always a single array
    h_all = if h isa Tuple
        Vector{typeof(h[1])}(undef, size(x, 2))
    else
        Vector{typeof(h)}(undef, size(x, 2))
    end

    for t in axes(x, 2)
        h, h_out = m.cell(h, x[:, t, :])
        h_all[t] = h_out
    end

    sz = size(x)
    # reshape each (features, batch) step output to (features, 1, batch) and
    # concatenate along the time dimension
    h_ret = cat(reshape.(h_all, :, 1, sz[3])..., dims=2)
    return h_ret

end
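# A minimal usage sketch, assuming a (features, time, batch) input of size (3, 10, 16)
# and an illustrative hidden size of 5; the hidden state of every step is returned,
# stacked along the time dimension:
#
#   m = FoldedRecur(Flux.RNNCell(3, 5))
#   x = rand(Float32, 3, 10, 16)
#   y = m(x)                       # size(y) == (5, 10, 16)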


# Vanilla RNN

struct RNNCell{F,A,V,S}
@@ -102,6 +142,7 @@ The most basic recurrent layer; essentially acts as a `Dense` layer, but with the
output fed back into the input each time step.
"""
RNN(a...; ka...) = Recur(RNNCell(a...; ka...))
FoldedRNN(args...; kwargs...) = FoldedRecur(Flux.RNNCell(args...; kwargs...))
Recur(m::RNNCell) = Recur(m, m.state0)

# TODO remove in v0.13
@@ -162,6 +203,7 @@ See [this article](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
for a good overview of the internals.
"""
LSTM(a...; ka...) = Recur(LSTMCell(a...; ka...))
FoldedLSTM(args...; kwargs...) = FoldedRecur(Flux.LSTMCell(args...; kwargs...))
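# A minimal sketch of the LSTM variant (illustrative sizes): the LSTMCell state is an
# (h, c) tuple, but each step's output is a single matrix, so the folded output has
# the same layout as the RNN case:
#
#   m = FoldedLSTM(3, 5)
#   x = rand(Float32, 3, 10, 16)
#   y = m(x)                       # size(y) == (5, 10, 16)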
Recur(m::LSTMCell) = Recur(m, m.state0)

# TODO remove in v0.13
@@ -227,6 +269,7 @@ See [this article](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
for a good overview of the internals.
"""
GRU(a...; ka...) = Recur(GRUCell(a...; ka...))
FoldedGRU(args...; kwargs...) = FoldedRecur(Flux.GRUCell(args...; kwargs...))
Recur(m::GRUCell) = Recur(m, m.state0)

# TODO remove in v0.13
@@ -281,6 +324,7 @@ See [this article](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
for a good overview of the internals.
"""
GRUv3(a...; ka...) = Recur(GRUv3Cell(a...; ka...))
FoldedGRUv3(args...; kwargs...) = FoldedRecur(Flux.GRUv3Cell(args...; kwargs...))
Recur(m::GRUv3Cell) = Recur(m, m.state0)

