We use sequence models to encode dialogue context and decode responses, and we plan to experiment with various RL agents on the Cornell Movie-Dialogs Corpus. Starting from a Seq2Seq model as the baseline, we adapted a Transformer for faster training and better metric scores. With reinforcement learning, we further aim to steer responses toward positive emotion. Finally, deploying the best model, after tuning, to a website/app would be worthwhile to try.
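As a minimal sketch of the reinforcement-learning idea, the reward below scores a generated response by its emotional tone; an RL agent could then optimize this signal on top of the trained Seq2Seq/Transformer model. The word lists, function name, and scoring scheme are illustrative assumptions, not the project's actual reward.

```python
# Illustrative positive/negative word lists (a real system would use a
# trained sentiment classifier instead of a hand-made lexicon).
POSITIVE = {"great", "happy", "love", "wonderful", "thanks", "glad"}
NEGATIVE = {"hate", "terrible", "sad", "awful", "angry"}

def emotion_reward(response: str) -> float:
    """Toy RL reward in [-1, 1]: fraction of positive minus negative words."""
    tokens = [t.strip(".,!?") for t in response.lower().split()]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

print(emotion_reward("I love this, thanks!"))  # positive score
print(emotion_reward("That was terrible."))    # negative score
```

A policy-gradient agent would sample responses from the dialogue model, score each with a reward like this one, and update the model to make higher-reward responses more likely.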
This project is still a work in progress...