A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Updated Nov 15, 2024 - Python
PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASSP 2020)
A curated list of awesome papers on contextualizing E2E ASR outputs
An efficient implementation of RNN-T Prefix Beam Search in C++/CUDA.
An implementation of RNN-Transducer loss in TF-2.0.
I'm building an end-to-end Vietnamese Speech Recognition System. I'll deploy it to production with the help of Flask, uWSGI, Nginx, and AWS ...
Pure PyTorch implementation of the loss described in "Online Segment to Segment Neural Transduction" https://arxiv.org/abs/1609.08194