Experimental Released Models
TTS Models | Dataset | Commit | Audio Sample | Details |
---|---|---|---|---|
Tacotron2 | LJSpeech | branch | --- | Details |
Tacotron2 DDC | LJSpeech | 72a6ac5 | voice samples | Trained with DDC; includes PyTorch, TensorFlow, and TFLite models. Check the Colab notebooks or the notebooks folder. |
Glow-TTS | LJSpeech | 08394e4 | --- | Details. Sample notebook |
Multi-Speaker-Tacotron2 | VCTK | 4873601 | Colab notebook | Multi-Speaker TTS model with Tacotron2. |
Multi-Speaker-Tacotron2 DDC | VCTK | 2136433 | Colab notebook | Multi-Speaker TTS model with Tacotron2 and Double Decoder Consistency. |
Tacotron2 with Dynamic Conv Attention | LJSpeech | 4132240 | Colab notebook | Tacotron2 with Dynamic Convolutional Attention. |
Glow-TTS | LJSpeech | 4132240 | Colab notebook | Glow-TTS as in the paper. |
Speaker Encoder Models | Dataset | Commit |
---|---|---|
Speaker-Encoder-iter25k | LibriSpeech | ... |
Speaker-Encoder by @mueller91 | LibriTTS + VCTK + VoxCeleb + CommonVoice | ... |
Vocoder Models | Dataset | Commit | Details |
---|---|---|---|
ParallelWaveGAN | LJSpeech | 72a6ac5 | Trained using TTS.vocoder. It produces better results than the MelGAN model but is slightly slower. Check the notebooks for testing. |
Multi-Band MelGAN | LJSpeech | 72a6ac5 | Trained using TTS.vocoder. It is the fastest vocoder model. Check the notebooks for testing. |
WaveRNN models | --- | --- | Go to the repo for the models. (Soon to be deprecated.) |
Full-Band MelGAN | LibriTTS | c514628 | Trained using TTS.vocoder. Generic vocoder that can sample any voice. Sampling rate: 24 kHz. To use a different sampling rate, follow this issue. |
Universal WaveGrad | LibriTTS | 2136433 | Trained using TTS.vocoder. Generic vocoder that can sample any voice. Original sampling rate: 24 kHz. To use a different sampling rate, follow this issue. |
Universal HifiGAN | LibriTTS | - | --- |
How to use:
- Create a fresh virtual environment with Python 3.6.
- Install the system dependencies: `apt-get install espeak libsndfile1`
- Install the model's Python package from the table below: `pip install python_package_url_from_table_below`
- Start the demo server: `python -m TTS.server.server`
- Open http://localhost:5002 in your browser.
Model | Dataset | Python package | nginx/uWSGI config files |
---|---|---|---|
Tacotron 2 + Forward Attention + PWGAN | LJSpeech | TTS-0.0.1+92aea2a-py3-none-any.whl | tts-nginx-uwsgi.zip |
The server is a Flask application. For deployment with multiple workers, see the nginx/uWSGI config files linked in the table above. Pass `--use_cuda 1` to use a GPU when available.
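Once the server is running, you can request synthesized speech over HTTP. Below is a minimal client sketch, assuming the demo server exposes a `/api/tts` endpoint that accepts a `text` query parameter and returns a WAV response; check `TTS/server/server.py` in your commit for the exact routes and parameters.

```python
# Minimal client sketch for the demo TTS server.
# Assumption: the server answers GET /api/tts?text=... with audio/wav bytes.
from urllib.parse import urlencode
from urllib.request import urlopen

SERVER = "http://localhost:5002"  # default address printed by TTS.server.server

def synthesize(text: str, out_path: str = "output.wav") -> str:
    """Send `text` to the running TTS server and save the returned WAV file."""
    query = urlencode({"text": text})
    with urlopen(f"{SERVER}/api/tts?{query}") as response:
        audio = response.read()  # raw WAV bytes from the server
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path

if __name__ == "__main__":
    print(synthesize("Hello from the experimental TTS server."))
```

Using only the standard library keeps the sketch dependency-free; any HTTP client works the same way.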
TTS Models | Dataset | Commit | Audio Sample | Details |
---|---|---|---|---|
Tacotron2 DDC | MAI-Labs | 48a40c4 | --- | Model Details and Colab Notebook. |
Tacotron2 DDC | MAI-Labs | f09defa | --- | Model Details and Colab Notebook. |
Model details by @Edresson.