Description
Paper
Link: https://arxiv.org/pdf/1910.10288.pdf
Year: 2019
Summary
- explores simple location-relative attention mechanisms that do away with content-based query/key comparisons, in order to handle out-of-domain text
- introduces a new location-relative attention mechanism in the additive energy-based family, called Dynamic Convolution Attention (DCA); a minimal sketch follows this list
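A minimal NumPy sketch of one DCA step, assuming the general shape of the mechanism described in the paper: static location filters, query-predicted dynamic filters, and a causal prior filter, all convolved with the previous alignment. The shapes, names, and the simplified energy projection (filter outputs used directly as features, rather than projected through separate weight matrices) are hypothetical:

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def dca_step(prev_align, query, static_filters, W_dyn, v, b, prior_filter):
    """One Dynamic Convolution Attention step (illustrative only).

    prev_align:     (T,)     previous attention weights alpha_{i-1}
    query:          (D,)     attention-RNN state s_i
    static_filters: (K, L)   K learned location filters of length L
    W_dyn:          (K*L, D) maps the query to K dynamic filters
    v, b:           (2K,)    energy projection vector and bias
    prior_filter:   (Lp,)    causal prior that encourages moving forward
    """
    K, L = static_filters.shape
    # dynamic filters are re-predicted from the query at every decoder step
    dyn_filters = (W_dyn @ query).reshape(K, L)
    # both filter sets are convolved with the previous alignment; there is
    # no content-based query/key comparison anywhere in the energy
    feats = [np.convolve(prev_align, f, mode="same") for f in static_filters]
    feats += [np.convolve(prev_align, f, mode="same") for f in dyn_filters]
    feats = np.stack(feats, axis=-1)                     # (T, 2K)
    energies = np.tanh(feats + b) @ v                    # additive energies
    # log of a causal convolution with the prior filter biases the
    # alignment toward small forward shifts
    prior = np.convolve(prev_align, prior_filter, mode="full")[:len(prev_align)]
    energies += np.log(prior + 1e-6)
    return softmax(energies)                             # normalized weights

# toy usage: alignment starts on the first input token
rng = np.random.default_rng(0)
T, D, K, L = 40, 16, 8, 21
prev = np.zeros(T); prev[0] = 1.0
align = dca_step(prev, rng.normal(size=D),
                 0.1 * rng.normal(size=(K, L)),
                 0.1 * rng.normal(size=(K * L, D)),
                 0.1 * rng.normal(size=2 * K),
                 np.zeros(2 * K),
                 np.array([0.1, 0.5, 0.4]))  # toy forward-biased prior
```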
Results
- Dynamic Convolution Attention and V2 GMM attention with initial bias (GMMv2b) are able to generalize to utterances much longer than those seen during training, while preserving naturalness on shorter utterances (a GMMv2b parameterization sketch follows this list)
- both mechanisms improve the speed and consistency of alignment during training
- advantages of DCA over GMM attention:
    - DCA can more easily bound its receptive field, which makes it easier to incorporate hard windowing optimizations in production (a toy windowing sketch also follows this list)
    - its attention weights are normalized, which helps to stabilize the alignment, especially for coarse-grained alignment tasks
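A sketch of the GMMv2b parameterization referenced above: the softmax mixture weights and softplus step/width follow the paper's V2 description, with biases added so the forward step and window width start at sensible nonzero values at initialization (the "b" variant). Shapes and names are hypothetical:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def gmm_v2b_step(prev_mu, query, W, b, T):
    """One step of V2 GMM attention with initial bias (illustrative only).

    prev_mu: (K,)            component means from the previous decoder step
    query:   (D,)            attention-RNN state
    W, b:    (3K, D), (3K,)  projection to raw mixture parameters; the bias
             on the delta/sigma slices is what gives GMMv2b its sensible
             initial step size and window width
    """
    K = len(prev_mu)
    raw = W @ query + b
    w = np.exp(raw[:K] - raw[:K].max())
    w /= w.sum()                                  # V2: softmax mixture weights
    delta = softplus(raw[K:2 * K])                # V2: non-negative forward step
    sigma = softplus(raw[2 * K:])                 # V2: component widths
    mu = prev_mu + delta                          # means only move forward
    j = np.arange(T)[:, None]                     # (T, 1) memory positions
    comp = np.exp(-0.5 * ((j - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # unlike DCA, these weights are not renormalized over positions j,
    # which is the normalization contrast noted in the list above
    return (comp * w).sum(axis=1), mu             # (T,) weights, updated means
```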
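And a toy illustration of the hard windowing optimization that DCA's bounded receptive field enables: restrict the softmax to a window around the previous attention peak, so each decoder step touches O(window) positions instead of the full input length. The function name and window size are hypothetical:

```python
import numpy as np

def windowed_softmax(energies, prev_align, half_width=10):
    """Hard window around the previous attention peak (illustrative only).

    Because DCA's filters give it a bounded receptive field, positions far
    from the previous peak can be masked out before normalizing, and in a
    production decoder their energies need not be computed at all.
    """
    peak = int(np.argmax(prev_align))
    lo = max(0, peak - half_width)
    hi = min(len(energies), peak + half_width + 1)
    masked = np.full_like(energies, -np.inf)
    masked[lo:hi] = energies[lo:hi]
    z = np.exp(masked - energies[lo:hi].max())    # stable softmax on the window
    return z / z.sum()
```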