Commit
1686: Adding support for folding RNNs over 3d arrays r=DhairyaLGandhi a=mkschleg

From #1678, adding a `Recur`-like interface for a folded operation with support for 3-dimensional arrays. This is how many users expect RNNs to work if they are familiar with PyTorch and TensorFlow, and there seems to be some desire for this feature, per the discussion in #1671 and from `@jeremiedb`. This should also streamline a push toward supporting the CuDNN versions of RNNs/GRUs/LSTMs, as this is the data layout that API expects.

I did a barebones implementation to add support so we can start iterating on the API.

I have several lingering questions about this interface:

- ~Should we support different modes where we return all or only the last hidden state? Is there a better way to do the concat of the hidden states?~
- What kind of tests should we have? Just follow what we currently do for RNNs/LSTMs/GRUs?
- ~For the CPU version, does it make sense not to specialize on the different RNN types? We might be able to take more advantage of BLAS if we specialized on, say, `Folded{GRU}`.~
- ~Do we want to force the temporal dimension to be the 2nd?~
- ~Do we want this to be stateful (i.e. allow the user to change the starting hidden state rather than always using `state0`)?~

### PR Checklist

- [x] Tests are added
- [ ] Entry in NEWS.md
- [x] Documentation, if applicable
- [ ] API changes require approval from a committer (different from the author, if applicable)

Co-authored-by: Matthew Schlegel <mkschleg@gmail.com>
Co-authored-by: Matthew Schlegel <mkschleg@users.noreply.github.com>
Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>