Transformers with convolutional context for ASR #59

Description

Paper

Link: https://arxiv.org/pdf/1904.11660.pdf
Year: 2019

Summary

  • replaces the sinusoidal positional embedding in transformers with convolutionally learned input representations
  • trains with a fixed learning rate of 1.0 and no warmup steps (see the sketch below)
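A minimal sketch of this schedule-free setup, assuming PyTorch and the Adadelta optimizer the paper reports; the model and loss here are placeholders:

```python
import torch
import torch.nn as nn

# Placeholder model; in the paper this would be the conv+transformer ASR model.
model = nn.Linear(80, 32)

# Fixed learning rate of 1.0, no warmup and no LR scheduler:
# the optimizer is stepped directly for the whole run.
optimizer = torch.optim.Adadelta(model.parameters(), lr=1.0)

for step in range(1000):
    x = torch.randn(8, 80)
    loss = model(x).pow(2).mean()   # dummy loss for illustration
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                # lr stays at 1.0 throughout
```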

Methods

  • two parts (see the sketch after this list)
    • convolutional layers learn local relationships within a small context
    • transformer layers learn the global sequential structure of the input
  • as the encoder goes deeper, the convolutions effectively learn an acoustic language model over the bag of discovered acoustic units
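A minimal PyTorch sketch of the two-part idea, not the paper's exact architecture: a small 2-D conv front-end learns local context in place of the sinusoidal positional embedding, and its output feeds a standard transformer encoder. Layer counts and sizes here are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ConvContextTransformerEncoder(nn.Module):
    """Conv layers capture local context; transformer layers capture
    global structure. No positional embedding is added."""
    def __init__(self, feat_dim=80, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        # 2-D convolutions over (time, frequency) learn local
        # relationships within a small context window
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        conv_out_dim = 64 * (feat_dim // 4)  # freq dim halved twice
        self.proj = nn.Linear(conv_out_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)

    def forward(self, x):            # x: (batch, time, feat_dim)
        x = x.unsqueeze(1)           # -> (batch, 1, time, feat)
        x = self.conv(x)             # -> (batch, 64, time/4, feat/4)
        b, c, t, f = x.shape
        x = x.permute(0, 2, 1, 3).reshape(b, t, c * f)
        x = self.proj(x)             # -> (batch, time/4, d_model)
        return self.transformer(x)   # conv output stands in for positions
```

For example, a batch of shape (2, 100, 80) (batch, frames, filterbank features) comes out as (2, 25, 512) after the two stride-2 convolutions:

```python
enc = ConvContextTransformerEncoder()
out = enc(torch.randn(2, 100, 80))
print(out.shape)  # torch.Size([2, 25, 512])
```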

code: github.com/pytorch/fairseq/tree/master/examples/speech_recognition
