Welcome to afrTTS! We implement two systems:
-NaiveTTS uses an existing, limited pronunciation dictionary and suffers from misalignment.
-G2PxTTS uses a grapheme-to-phoneme (G2P) conversion model to expand this dictionary, producing more coherent audio.
Installs
pip install librosa
pip install univoc
pip install tacotron
pip install omegaconf
pip install torch
Tacotron
A series of pretrained weights is available at https://github.com/JulianHerreilers/pantoffel_tacotron_models_storage; only the following two should be used:
-NaiveTTS: https://github.com/JulianHerreilers/pantoffel_tacotron_models_storage/releases/download/v0.190k-210k-230k-beta/model-230000.
-G2PxTTS: https://github.com/JulianHerreilers/pantoffel_tacotron_models_storage/releases/download/v1.120epoch/model-300000.pt
Tacotron can be trained with the following preprocessing and training steps. First, change the first argument on line 32 of utils/jsonmaker.py to metadata_incomplete.csv, then run:
python utils/jsonmaker.py
python preprocess.py afrZA datasets/afrZA
python train.py afrza afrZA/metadata_incomplete.csv datasets/afrZA
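The metadata CSV is assumed here to follow the common LJSpeech-style convention of pipe-separated `id|transcript` lines (an assumption; check utils/jsonmaker.py for the actual layout). A minimal sketch of reading such a file:

```python
import csv
import io

def read_metadata(f):
    """Parse pipe-separated metadata lines into (utterance_id, transcript) pairs.

    Assumes an LJSpeech-style `id|transcript` layout; adjust the delimiter
    and column handling if metadata_incomplete.csv differs.
    """
    reader = csv.reader(f, delimiter="|")
    return [(row[0], row[-1]) for row in reader if row]

# In-memory file standing in for metadata_incomplete.csv:
sample = io.StringIO("afr_0001|goeie more\nafr_0002|baie dankie\n")
entries = read_metadata(sample)
print(entries)  # [('afr_0001', 'goeie more'), ('afr_0002', 'baie dankie')]
```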
The G2P model can be trained, and then used to expand the pronunciation dictionary, entirely from the notebook at G2P/G2P_LSTM.ipynb.
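Conceptually, the expansion step keeps existing dictionary entries and falls back to the G2P model only for out-of-vocabulary words. A sketch of that logic, with a toy stand-in for the trained LSTM (all names here are illustrative, not the notebook's actual API):

```python
def expand_dictionary(words, pron_dict, g2p_predict):
    """Return a pronunciation dict covering every word in `words`.

    Known words keep their dictionary entry; out-of-vocabulary words are
    filled in by the G2P model (any callable mapping word -> phone list).
    """
    expanded = dict(pron_dict)
    for word in words:
        if word not in expanded:
            expanded[word] = g2p_predict(word)
    return expanded

# Toy stand-in for the trained G2P model: one phone per letter.
toy_g2p = lambda w: list(w)

pron_dict = {"kat": ["k", "a", "t"]}
expanded = expand_dictionary(["kat", "hond"], pron_dict, toy_g2p)
print(expanded["hond"])  # ['h', 'o', 'n', 'd']
```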
A demo notebook, afrTTS_demo.ipynb, can be used to test the two systems, provided that demo_utils.py, g2pmodel.py, and the two dictionaries (afr_za_dict.txt and rcrl_apd.1.4.1.txt) are in the same directory.
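Both dictionaries are assumed here to be plain-text files with one entry per line, the word followed by its phone sequence (verify against the actual files before relying on this). A minimal loader under that assumption:

```python
def load_pron_dict(lines):
    """Parse `word phone phone ...` lines into a {word: [phones]} mapping.

    Assumes whitespace-separated entries, one word per line; blank lines
    and lines starting with '#' are skipped.
    """
    pron_dict = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, *phones = line.split()
        pron_dict[word] = phones
    return pron_dict

entries = ["kat k a t", "hond h o n d"]
print(load_pron_dict(entries))  # {'kat': ['k', 'a', 't'], 'hond': ['h', 'o', 'n', 'd']}
```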
Further datasets and algorithms are available in utils/:
-split_num_letters.py converts all numbers in a sequence to their word equivalents.
-demo_sample_randomizer.ipynb was used to sort the demo samples for the subjective evaluation.
-check_valid_entries complete.py returns the subset of the dataset covered by the selected dictionary, whether afr_za_dict.txt or rcrl_apd.1.4.1.txt.
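As an illustration of the idea behind split_num_letters.py, here is a simplified digit-by-digit substitution using Afrikaans digit names. The real script likely handles full number grammar (e.g. "21" as "een-en-twintig" rather than "twee een"); this is only a sketch:

```python
# Afrikaans digit names (0-9). A sketch only: multi-digit numbers are
# spelled out digit by digit, not as grammatical Afrikaans numerals.
DIGITS = {
    "0": "nul", "1": "een", "2": "twee", "3": "drie", "4": "vier",
    "5": "vyf", "6": "ses", "7": "sewe", "8": "agt", "9": "nege",
}

def spell_out_digits(text):
    """Replace every digit in `text` with its Afrikaans word equivalent."""
    out = []
    for ch in text:
        out.append(DIGITS[ch] if ch.isdigit() and ch in DIGITS else ch)
    return "".join(out)

print(spell_out_digits("3 katte"))  # drie katte
```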
Acknowledgements:
-https://github.com/bshall/Tacotron
-https://github.com/bshall/UniversalVocoding
-Computations were performed using the University of Stellenbosch's HPC1 (Rhasatsha): http://www.sun.ac.za/hpc
Please note: this was my first proper exposure to PyTorch and deep models, and I would probably approach it very differently with the knowledge I now have (but hey, that's learning :)). I tried to document it as well as possible to explain my thought process, but use at your own risk.