Improved Noisy Student Training for Automatic Speech Recognition #33

Open
@jinglescode

Paper

Link: https://arxiv.org/pdf/2005.09629v1.pdf
Year: 2020

Summary

  • adapt and improve noisy student training for automatic speech recognition (noisy student training is an iterative self-training method that leverages augmentation to improve network performance)
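The iterative loop can be sketched as follows. This is a minimal illustration and not the paper's implementation: `train`, `augment`, and the `Model` type are hypothetical placeholders, and the filtering and balancing steps listed under Methods are omitted.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[object, object]      # (input, label); hypothetical toy types
Model = Callable[[object], object]   # maps an input to a predicted label


def noisy_student_training(
    labeled: Sequence[Example],
    unlabeled: Sequence[object],
    train: Callable[[List[Example]], Model],
    augment: Callable[[object], object],
    generations: int = 3,
) -> Model:
    """Generic noisy student loop (assumed structure, for illustration only)."""
    # Generation 0: the first teacher sees only labeled data (augmented).
    teacher = train([(augment(x), y) for x, y in labeled])
    for _ in range(generations):
        # The teacher labels the clean (un-augmented) unlabeled inputs.
        pseudo = [(x, teacher(x)) for x in unlabeled]
        # The student trains on augmented labeled + pseudo-labeled data.
        mixed = [(augment(x), y) for x, y in list(labeled) + pseudo]
        student = train(mixed)
        # The student becomes the teacher of the next generation.
        teacher = student
    return teacher
```

In the ASR setting, `augment` would play the role of SpecAugment and `teacher(x)` would be a decoded transcript.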

Methods

  • employ (adaptive) SpecAugment, an augmentation method for ASR that directly acts on the spectrogram of the input audio, for noisy student training
  • use shallow fusion with a language model on the teacher network to generate better transcripts for the student network to train on
  • propose a normalized filtering score for transcripts generated by teacher networks, defined as a function of the fusion score and the number of tokens
  • use a variant of sub-modular sampling to weight the utterance-transcript pairs generated by the teacher network, balancing the token statistics of the dataset passed on to the student
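The first three points can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the function names, the `lm_weight` value, the single-band masking, and in particular the form of the normalized score, since this summary only states that it depends on the fusion score and the token count. The sketch assumes a linear fit of score against length and divides the residual by `sigma * sqrt(length)`. Sub-modular sampling is not covered.

```python
import math
import random
from typing import List, Sequence, Tuple


def mask_spectrogram(spec: List[List[float]], max_freq_width: int,
                     max_time_width: int, rng: random.Random) -> List[List[float]]:
    """Zero one random frequency band and one random time band.

    A simplified SpecAugment-style mask; spec is indexed [time][frequency].
    """
    n_time, n_freq = len(spec), len(spec[0])
    out = [row[:] for row in spec]               # leave the input untouched
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for i in range(n_time):
        for j in range(n_freq):
            if f0 <= j < f0 + f or t0 <= i < t0 + t:
                out[i][j] = 0.0
    return out


def shallow_fusion_score(am_logprob: float, lm_logprob: float,
                         lm_weight: float = 0.3) -> float:
    """Acoustic-model log-probability plus a weighted LM log-probability."""
    return am_logprob + lm_weight * lm_logprob


def fit_length_normalization(lengths: Sequence[int],
                             scores: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit score ~ mu * length + beta, plus a residual scale sigma."""
    n = len(lengths)
    mean_l = sum(lengths) / n
    mean_s = sum(scores) / n
    var_l = sum((l - mean_l) ** 2 for l in lengths)
    mu = sum((l - mean_l) * (s - mean_s) for l, s in zip(lengths, scores)) / var_l
    beta = mean_s - mu * mean_l
    # Scale residuals by sqrt(length) before estimating their spread.
    resid = [(s - mu * l - beta) / math.sqrt(l) for l, s in zip(lengths, scores)]
    sigma = math.sqrt(sum(r * r for r in resid) / n)
    return mu, beta, sigma


def normalized_filter_score(score: float, length: int,
                            mu: float, beta: float, sigma: float) -> float:
    """Length-normalized score; requires sigma > 0 and length > 0."""
    return (score - mu * length - beta) / (sigma * math.sqrt(length))
```

Under this assumed form, a teacher transcript would be kept when its normalized score exceeds a chosen threshold, so that long utterances are not rejected merely because their raw fusion scores are lower.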
