
2022-05-31

@comodoro comodoro released this 04 Jun 19:42

A new release with lower error rates, trained largely on the same data as the previous one.

Metrics*:

  1. Raw acoustic model (without a scorer):
  • Czech Common Voice 6.1 test dataset: WER: 0.405500, CER: 0.106870, loss: 15.227368
  • Vystadial 2016 test dataset: WER: 0.506131, CER: 0.195149, loss: 17.695986
  • Large Corpus of Czech Parliament Plenary Hearings test dataset: WER: 0.213377, CER: 0.052676, loss: 20.449242
  • ParCzech 3.0 test dataset: WER: 0.209651, CER: 0.061622, loss: 28.217770
  2. With the attached czech-large-vocab.scorer:
  • Czech Common Voice 6.1 test dataset: WER: 0.152865, CER: 0.067557, loss: 15.227368**
  • Vystadial 2016 test dataset: WER: 0.357435, CER: 0.201479, loss: 17.695986
  • Large Corpus of Czech Parliament Plenary Hearings test dataset: WER: 0.097380, CER: 0.036706, loss: 20.449242
  • ParCzech 3.0 test dataset: WER: 0.101289, CER: 0.045102, loss: 28.217770

Metrics for the quantized model are roughly one percentage point worse.
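For reference, the WER and CER figures above are standard edit-distance metrics: the Levenshtein distance between reference and hypothesis, computed over words (WER) or characters (CER), divided by the reference length. A minimal sketch (not the exact evaluation script used for this release):

```python
def levenshtein(a, b):
    # Edit distance between two sequences: minimum number of
    # insertions, deletions, and substitutions turning a into b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def wer(reference, hypothesis):
    # Word error rate: word-level edit distance / reference word count.
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    # Character error rate: character-level edit distance / reference length.
    return levenshtein(reference, hypothesis) / len(reference)
```

For example, `wer("dobry den vsem", "dobry dan vsem")` gives 1/3: one substituted word out of three reference words.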

*Any clips longer than thirty seconds were discarded.
**The better-than-expected results on the Common Voice set with the language model may be explained by a partial overlap between the test transcriptions and the language model sources, namely Wikipedia and Europarl v7.