Implementation of Convolutional Recurrent Neural Networks (CRNNs) for zero-shot retrieval of images from their corresponding captions.
A CRNN consists of 1D convolutional blocks followed by an RNN. The convolutions shorten the caption sequence before it reaches the RNN, so the RNN processes fewer time steps and can learn more efficiently.
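
To make the sequence-length reduction concrete, here is a minimal sketch that computes how many time steps the RNN would see after a stack of 1D convolutions. It uses the standard output-length formula for a 1D convolution. The block configuration (three blocks, kernel size 3, stride 2, no padding) and the caption length of 64 tokens are assumptions chosen for illustration, not values taken from this repository.

```python
def conv1d_output_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Sequence length after one 1D convolution, using the standard formula."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def stacked_output_length(length, layers):
    """Apply the formula layer by layer; each layer is (kernel_size, stride)."""
    for kernel_size, stride in layers:
        length = conv1d_output_length(length, kernel_size, stride)
    return length


# Hypothetical configuration: three conv blocks, kernel size 3, stride 2, no padding.
blocks = [(3, 2), (3, 2), (3, 2)]
caption_length = 64  # assumed number of tokens in a caption

rnn_steps = stacked_output_length(caption_length, blocks)
print(f"{caption_length} input tokens -> {rnn_steps} RNN time steps")
```

Under these assumed settings the RNN unrolls over far fewer steps than the raw caption length, which is the efficiency gain described above. Different kernel sizes, strides, or padding would give a different reduction.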

