Conformer misses relative pos encoding #132
See here for an example: https://github.com/espnet/espnet/blob/4138010fb66ad27a43e8bee48a4932829a0847ae/espnet2/asr/encoder/conformer_encoder.py#L132
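For readers skimming the thread, here is a rough sketch of the kind of relative positional encoding the linked ESPnet Conformer encoder selects: a sinusoidal table over relative positions T-1 .. -(T-1) that is passed to the rel-pos multi-head attention alongside the input. This is only an illustration of that idea, not the ESPnet implementation (ESPnet interleaves sin/cos terms and scales the input, among other details), and the helper name is made up.

```python
# Hedged illustration (not ESPnet's actual code) of a relative sinusoidal
# positional encoding: one embedding row per relative position T-1 .. -(T-1).
import math
import torch


def rel_sinusoidal_pos_emb(time: int, dim: int) -> torch.Tensor:
    """Return (2*time-1, dim); row k encodes relative position time-1-k."""
    assert dim % 2 == 0
    pos = torch.arange(time - 1, -time, -1, dtype=torch.float32)   # T-1 .. -(T-1)
    inv_freq = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32)
                         * -(math.log(10000.0) / dim))
    args = pos[:, None] * inv_freq[None, :]                        # (2T-1, dim/2)
    return torch.cat([args.sin(), args.cos()], dim=-1)             # (2T-1, dim)


x = torch.randn(4, 20, 256)                # (batch, time, feature)
pos_emb = rel_sinusoidal_pos_emb(20, 256)  # consumed by the rel-pos self-attention
```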
Note that all our existing Conformer recipes in RETURNN currently use the RETURNN variant. But for direct comparisons to old setups, we might also need to implement the old variant.
Interestingly, the Fairseq RelPositionalEncoding looks very similar to the ESPnet one, but does not seem to cite it.
For reference, see also the original Transformer-XL implementation of the rel pos encoding.
Hi @albertz, I think the Fairseq RelPositionalEncoding is based on ESPnet, as they distinguish different RPE/attention types between Fairseq and ESPnet here: https://github.com/facebookresearch/fairseq/blob/b7b7928065ec90bef8f9a489b0128a4dce560d57/fairseq/modules/conformer_layer.py#L187.
@albertz The fairseq implementation is based on the Espnet one for relative positional encoding. Seems like we are indeed missing some citations. Will make sure to add them soon. Thanks!
There is self attention without any pos encoding.
Related is Transformer rel pos #74.
There probably should be an `nn.RelPosSelfAttention`, similar to `nn.SelfAttention` but with rel pos encoding. It should use the Transformer-XL rel pos encoding. (We can separately test other rel pos encodings later.)
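To make the proposal more concrete, below is a minimal PyTorch sketch of the Transformer-XL style scoring such an `nn.RelPosSelfAttention` could perform: a content term and a position term with learned biases (often called u and v), combined via the usual relative-shift trick. This is not the returnn_common API and not a drop-in implementation; the names (`RelPosSelfAttention`, `rel_shift`, `pos_bias_u`, `pos_bias_v`) and the shape conventions are assumptions for illustration only.

```python
# Sketch of Transformer-XL style relative-position self-attention (assumptions,
# not the returnn_common API): scores = (q+u)·k + rel_shift((q+v)·r), then softmax.
import math
import torch
import torch.nn.functional as F


def rel_shift(x: torch.Tensor) -> torch.Tensor:
    """Map scores over relative positions to scores over key positions.

    x: (batch, heads, time, 2*time-1), where index k on the last axis stands for
    relative position time-1-k. Returns (batch, heads, time, time) where entry
    (i, j) is taken from relative position i - j.
    """
    b, h, t, _ = x.shape
    x = F.pad(x, (1, 0))                          # (B, H, T, 2T): zero at the front
    x = x.reshape(b, h, 2 * t, t)[:, :, 1:]       # misalign rows, drop the first
    return x.reshape(b, h, t, 2 * t - 1)[..., :t]


class RelPosSelfAttention(torch.nn.Module):
    """Self-attention with Transformer-XL style relative positional encoding (sketch)."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.dim_head = num_heads, dim // num_heads
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.pos_proj = torch.nn.Linear(dim, dim, bias=False)  # W_R in Transformer-XL
        self.out = torch.nn.Linear(dim, dim)
        # Learned global biases: u for the content term, v for the position term.
        self.pos_bias_u = torch.nn.Parameter(torch.zeros(num_heads, self.dim_head))
        self.pos_bias_v = torch.nn.Parameter(torch.zeros(num_heads, self.dim_head))

    def forward(self, x: torch.Tensor, pos_emb: torch.Tensor) -> torch.Tensor:
        """x: (B, T, D); pos_emb: (2*T-1, D) for relative positions T-1 .. -(T-1)."""
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (y.reshape(b, t, self.num_heads, self.dim_head).transpose(1, 2)
                   for y in (q, k, v))            # each (B, H, T, Dh)
        r = self.pos_proj(pos_emb).reshape(-1, self.num_heads, self.dim_head)  # (2T-1, H, Dh)
        content = torch.einsum("bhid,bhjd->bhij", q + self.pos_bias_u[:, None], k)
        position = rel_shift(torch.einsum("bhid,phd->bhip", q + self.pos_bias_v[:, None], r))
        attn = ((content + position) / math.sqrt(self.dim_head)).softmax(dim=-1)
        out = torch.einsum("bhij,bhjd->bhid", attn, v).transpose(1, 2).reshape(b, t, -1)
        return self.out(out)


layer = RelPosSelfAttention(dim=256, num_heads=4)
x = torch.randn(2, 10, 256)
pos_emb = torch.randn(2 * 10 - 1, 256)  # e.g. a sinusoidal rel pos emb as sketched earlier
print(layer(x, pos_emb).shape)          # torch.Size([2, 10, 256])
```

The relative-shift trick avoids materializing a (T, T, D) tensor of per-pair relative embeddings: scores are computed against a single (2*T-1)-long table and then re-indexed so that entry (i, j) corresponds to relative position i - j.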