
Retune parameters for CTC beam search decoder #218

@kuke

Description


The beam search decoder for deployment in PR #139 uses a trie as the data structure for prefix search and finite-state transducers for spelling correction, which speed up decoding and lower the WER. With a larger, well-trained acoustic model (compared with the model in #115), the decoder's parameters alpha and beta have been retuned on the LibriSpeech development set, as shown in the figure below.

[Figure: WER as a function of alpha and beta, tuned on the larger model]

  • alpha: language model weight
  • beta: word insertion weight
  • WER: word error rate
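
For reference, here is a minimal sketch of how these two weights typically enter the decoding score in DeepSpeech-style beam search. The function name and exact form are illustrative; the C++ decoder in PR #139 may differ in detail:

```python
def prefix_score(log_p_ctc, log_p_lm, word_count, alpha, beta):
    """Shallow-fusion score of one beam-search prefix (illustrative helper).

    log_p_ctc  -- log probability of the prefix under the CTC acoustic model
    log_p_lm   -- log probability of the prefix under the language model
    word_count -- number of complete words in the prefix
    """
    # alpha scales the LM contribution; beta rewards each inserted word,
    # counteracting the LM's bias toward shorter hypotheses.
    return log_p_ctc + alpha * log_p_lm + beta * word_count
```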

As usual, the WER is affected mainly by variation in alpha. The optimal parameter pair is (alpha, beta) = (2.15, 0.35), which yields a minimum WER of 7.87% on the LibriSpeech test set and reduces the WER by 0.8% compared to the prototype decoder in Python.
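
The tuning itself amounts to a grid search over (alpha, beta) on the dev set. A sketch of that loop, where `decode_dev_set_wer` is a hypothetical stand-in for running the decoder over the LibriSpeech dev set with the given weights and scoring the transcripts:

```python
import itertools

def tune_decoder(decode_dev_set_wer, alphas, betas):
    """Return (wer, alpha, beta) minimizing dev-set WER over the grid.

    decode_dev_set_wer(alpha, beta) is a hypothetical stand-in for decoding
    the whole LibriSpeech dev set and computing WER against the references.
    """
    return min(
        (decode_dev_set_wer(a, b), a, b)
        for a, b in itertools.product(alphas, betas)
    )

# Illustrative sweep ranges (not necessarily those used for the figure above):
alphas = [0.05 * i for i in range(20, 65)]  # 1.00, 1.05, ..., 3.20
betas = [0.05 * i for i in range(0, 17)]    # 0.00, 0.05, ..., 0.80
```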
