Improvements for Self-Attention, built on THUMT
Please cite the following papers:
Modeling Localness for Self-Attention Networks. Baosong Yang, Zhaopeng Tu, Derek Wong, Fandong Meng, Lidia Chao and Tong Zhang. In EMNLP 2018.
Context-Aware Self-Attention Networks. Baosong Yang, Jian Li, Derek Wong, Lidia Chao, Xing Wang and Zhaopeng Tu. In AAAI 2019.
Convolutional Self-Attention Networks. Baosong Yang, Longyue Wang, Derek Wong, Lidia Chao and Zhaopeng Tu. In NAACL 2019.
Context-Aware Self-Attention Networks for Natural Language Processing. Baosong Yang, Longyue Wang, Derek Wong, Shuming Shi and Zhaopeng Tu. In Neurocomputing.