This project implements the variational LSTM sequence-to-sequence architecture for a sentence auto-encoding task. In general, I follow the papers "Variational Recurrent Auto-encoders" and "Generating Sentences from a Continuous Space". Most of the variational-layer implementation is adapted from "y0ast/VAE-torch".
Following these two papers, the variational layer is added only between the last hidden state of the encoder and the first hidden state of the decoder. It works in the following steps:
- Compute mean and variance of the posterior q from the last hidden state, with a 2-layer MLP encoder
- Compute KLD loss between the estimated posterior q(z|x) and the enforced prior p(z)
- Collect a noise sample with reparameterization
- Get the first hidden state of the decoder with a 2-layer MLP decoder
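
The four steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Torch7/nngraph code in this repository. It assumes a diagonal Gaussian posterior parameterized by mean and log-variance, a standard normal prior p(z) = N(0, I), and tanh activations. All function names, parameter keys, and layer sizes are hypothetical.

```python
import math
import random


def init_linear(n_in, n_out, rng):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    """Affine map y = W x + b on plain Python lists."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def encode(h_last, params):
    """Step 1: 2-layer MLP from the encoder's last hidden state to (mean, log-variance)."""
    hidden = [math.tanh(v) for v in linear(h_last, params["enc_hidden"])]
    return linear(hidden, params["enc_mean"]), linear(hidden, params["enc_log_var"])


def kld(mean, log_var):
    """Step 2: KL(q(z|x) || N(0, I)) for a diagonal Gaussian q (assumed prior)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mean, log_var))


def reparameterize(mean, log_var, rng):
    """Step 3: z = mean + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]


def decode(z, params):
    """Step 4: 2-layer MLP from z to the decoder's first hidden state (tanh output is an assumption)."""
    hidden = [math.tanh(v) for v in linear(z, params["dec_hidden"])]
    return [math.tanh(v) for v in linear(hidden, params["dec_out"])]


def variational_layer(h_last, params, rng):
    """Run all four steps; returns (first decoder hidden state, KLD loss)."""
    mean, log_var = encode(h_last, params)
    loss = kld(mean, log_var)
    z = reparameterize(mean, log_var, rng)
    return decode(z, params), loss
```

In training, the KLD term would be added to the reconstruction loss of the decoder. The noise is sampled outside the learned parameters, so gradients can flow through the mean and log-variance.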
This code requires Torch7 and nngraph.
- training on GPU: th VLSTM-Autoencoder.lua -gpuid 0
- sampling on GPU: th sample.lua -gpuid 0 -cv cv/checkpoint -data dataset/test