Training recipes for thorsten dataset #1020

Merged
merged 3 commits into from
May 30, 2022

Conversation

noranraskin
Contributor

I basically just updated the recipes from https://github.com/coqui-ai/TTS-recipes to the new Coqpit-based way of writing training scripts.

@CLAassistant

CLAassistant commented Dec 15, 2021

CLA assistant check
All committers have signed the CLA.

@thorstenMueller
Contributor

Hi @noranraskin,
thanks for this PR. I'm not sure if @erogol accepts PRs directly to main instead of dev, but we'll see.

Just as a side note:
My new neutral dataset (with a more natural speech flow) is growing and could be released in early 2022.

@erogol erogol changed the base branch from main to dev December 20, 2021 09:44
@erogol
Member

erogol commented Dec 20, 2021

thanks for the ✨PR✨ @noranraskin

@erogol
Member

erogol commented Dec 20, 2021

How about using the new dataset downloader instead of the bash script?

def download_thorsten_de(path: str):

@stale

stale bot commented Jan 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Jan 19, 2022
@thorstenMueller
Contributor

Hi @noranraskin, what do you think about @erogol's suggestion?

@stale stale bot removed the wontfix This will not be worked on but feel free to help. label Jan 20, 2022
@erogol
Member

erogol commented Feb 11, 2022

@noranraskin are there more recipes 😄 on the way or should I just merge away?

@noranraskin
Contributor Author

noranraskin commented Feb 13, 2022

@erogol there are more recipes on the way. But I'm currently getting an error: additional parameter 'ignored_speakers:[] given. I also get this with the previous tacotron2-DDC script and all other model recipes; the vocoders work fine.
I pulled the latest changes from main a few days ago. Any ideas?

@erogol
Member

erogol commented Mar 6, 2022

Can you try the latest dev? (sorry for the delayed response)

@noranraskin noranraskin closed this Mar 8, 2022
@noranraskin noranraskin reopened this Mar 8, 2022
@noranraskin
Contributor Author

@erogol **kwargs was missing in the thorsten formatter. I fixed that issue. Now I'm getting

Traceback (most recent call last):
  File "train_tacotron_ddc.py", line 71, in <module>
    train_samples, eval_samples = load_tts_samples(dataset_config, eval_split=True)
  File "/data/nra/tts-finetuning/TTS/TTS/tts/datasets/__init__.py", line 112, in load_tts_samples
    meta_data_train = [{**item, **{"language": language}} for item in meta_data_train]
  File "/data/nra/tts-finetuning/TTS/TTS/tts/datasets/__init__.py", line 112, in <listcomp>
    meta_data_train = [{**item, **{"language": language}} for item in meta_data_train]
TypeError: 'list' object is not a mapping

I get this on both the latest main and dev. I'm not getting this error when I try LJspeech. I think this is caused by some problem with the formatter or data importer, as this dataset only has a single speaker.
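
The `**kwargs` fix mentioned at the top of this comment can be illustrated with a minimal sketch. Both formatter functions and the way `ignored_speakers` is passed are hypothetical stand-ins, not the actual 🐸TTS code; the point is only that a formatter whose signature lacks `**kwargs` rejects any extra keyword argument the loader hands it.

```python
def strict_formatter(root_path, meta_file):
    """Hypothetical formatter without **kwargs: extra keywords are rejected."""
    return []


def tolerant_formatter(root_path, meta_file, **kwargs):
    """Hypothetical formatter that accepts and ignores extra keyword arguments."""
    return []


# Assumption: the loader forwards an extra keyword such as ignored_speakers.
try:
    strict_formatter("data", "metadata.csv", ignored_speakers=[])
    strict_failed = False
except TypeError as err:
    strict_failed = True
    print(f"strict formatter failed: {err}")

# The tolerant version simply swallows the extra keyword.
samples = tolerant_formatter("data", "metadata.csv", ignored_speakers=[])
```

Accepting `**kwargs` keeps a formatter compatible when the loader grows new optional arguments, at the cost of silently ignoring options the formatter does not implement.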

@erogol
Member

erogol commented Mar 10, 2022

New formatters return dictionaries, not lists.
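
A small sketch of why this matters for the traceback above: the failing line unpacks every sample with `**item`, which only works for mappings. The sample layouts and dictionary keys below are assumptions for illustration, not the confirmed 🐸TTS schema.

```python
def merge_language(items, language):
    # Same unpacking pattern as the line shown in the traceback.
    return [{**item, **{"language": language}} for item in items]


# Old-style sample: a plain list (layout assumed for illustration).
old_style = [["Guten Tag.", "wavs/0001.wav", "thorsten"]]

# New-style sample: a dictionary (key names are assumptions).
new_style = [
    {"text": "Guten Tag.", "audio_file": "wavs/0001.wav", "speaker_name": "thorsten"}
]

try:
    merge_language(old_style, "de")
    old_style_error = None
except TypeError as err:
    # A list cannot be unpacked with ** because it is not a mapping.
    old_style_error = str(err)

merged = merge_language(new_style, "de")
```

So a formatter that still returns one list per sample would need to be changed to return one dictionary per sample before the loader can attach the language field.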

@a-froghyar
Contributor

Hey, just chipping in with a question here: these recipes won't work without a German text cleaner implemented first, no?

@@ -0,0 +1 @@
arabic-speech-corpus
Member

I don't think we need this in 🐸TTS, as you can simply move it to somewhere else in your system.

Contributor Author

Oh, sorry my mistake

@erogol
Member

erogol commented May 12, 2022

@noranraskin looks good to me. I can merge it when you are ready :)

@noranraskin
Contributor Author

I was still running into some torch errors. I'm updating and testing all the recipes again to make sure everything is working. I might also add a couple more...

@noranraskin
Contributor Author

noranraskin commented May 18, 2022

@erogol so I only got tacotron2_DDC to work.
GlowTTS throws a really long stacktrace with some nvrtc compilation of "default_program" failing.
And all the vocoders throw the same cuFFT error at epoch 0.
This could have something to do with my installation, since I'm running a nightly build of torch, as I couldn't downgrade cuda on my system.
All the recipes are copied from ljspeech so I'm running into the same issues there too. Maybe somebody else can check them and report back (haven't opened an issue yet because of the nature of my installation)

Plus:
I added a dataset check at the beginning of all the recipes that downloads the dataset if it can't be found.
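
The dataset check described above could look roughly like the following sketch. The helper `ensure_dataset`, the stand-in `fake_download`, and the folder name `thorsten-de` are hypothetical and only illustrate the pattern; in a real recipe, the `download_thorsten_de` function mentioned earlier in this thread could presumably be passed as `download`, assuming it downloads into the directory it is given.

```python
import os
import tempfile
from typing import Callable


def ensure_dataset(dataset_path: str, download: Callable[[str], None]) -> bool:
    """Call download() only if dataset_path is missing.

    Returns True if a download was triggered, False otherwise.
    """
    if os.path.exists(dataset_path):
        return False
    # Download into the parent directory of the expected dataset folder.
    download(os.path.dirname(os.path.abspath(dataset_path)))
    return True


def fake_download(target_dir: str) -> None:
    """Stand-in for a real downloader: it only creates an empty dataset folder."""
    os.makedirs(os.path.join(target_dir, "thorsten-de"), exist_ok=True)


with tempfile.TemporaryDirectory() as root:
    dataset_path = os.path.join(root, "thorsten-de")
    first_run = ensure_dataset(dataset_path, fake_download)   # folder missing
    second_run = ensure_dataset(dataset_path, fake_download)  # folder now exists
```

Passing the downloader in as a callable keeps the check testable without network access.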

@loganhart02
Contributor

loganhart02 commented May 19, 2022

@erogol so I only got tacotron2_DDC to work. GlowTTS throws a really long stacktrace with some nvrtc compilation of "default_program" failing. And all the vocoders throw the same cuFFT error at epoch 0. This could have something to do with my installation, since I'm running a nightly build of torch, as I couldn't downgrade cuda on my system. All the recipes are copied from ljspeech so I'm running into the same issues there too. Maybe somebody else can check them and report back (haven't opened an issue yet because of the nature of my installation)

Plus: I added a dataset check at the beginning of all the recipes that downloads the dataset if it can't be found.

I was able to train glow-tts and hifigan with the recipes past epoch 0. Which CUDA version are you using, and have you tried running it in a conda environment with a different CUDA version?

@noranraskin
Contributor Author

I was running CUDA 11.6 with the compatible PyTorch nightly build. But I realised I was always fetching from 'main', so I was working with pretty old code. I fixed my installation and tested everything again, and now it's working for me too.

@noranraskin
Contributor Author

The new recipes for align_tts, vits, wavegrad and wavernn are tested and working.
I'm still getting a RuntimeError for speedy_speech:
Calculated padded input size per channel: (7). Kernel size: (13). Kernel size can't be greater than actual input size.

If you don't know a fix right away I'd just discard this one for now.
Also, I'd be ready to merge now; I don't have anything more to add.
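
For context on the speedy_speech error above, here is a minimal sketch of the size check a 1D convolution performs. Only the padded size (7) and the kernel size (13) come from the error message; the helper name, the split between input length and padding, and the idea that a very short sequence is the trigger are assumptions, not a confirmed diagnosis.

```python
def conv1d_output_length(input_length, kernel_size, padding=0, dilation=1, stride=1):
    """Output length of a 1D convolution, or ValueError if the input is too short."""
    padded = input_length + 2 * padding
    effective_kernel = dilation * (kernel_size - 1) + 1
    if padded < effective_kernel:
        raise ValueError(
            f"padded input size ({padded}) is smaller than kernel size ({effective_kernel})"
        )
    return (padded - effective_kernel) // stride + 1


# Padded size 7 with kernel size 13, as in the error message (padding assumed 0).
try:
    conv1d_output_length(7, 13)
    too_short = False
except ValueError as err:
    too_short = True
    print(err)

# The same kernel fits once the padded input is at least 13 frames long.
minimal_fit = conv1d_output_length(13, 13)
with_padding = conv1d_output_length(7, 13, padding=6)
```

If short sequences are indeed the cause, filtering out very short samples or adding padding before the affected layer would be the usual directions to look at.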

@thorstenMueller
Contributor

Hi @noranraskin,
first of all, thanks for your efforts to add training recipes for my Thorsten dataset 👏.

When the new Thorsten models trained by @domcross and me are released (soon to happen), our next step is preparing the release of the dataset that the new models are based on. Maybe we can add the recipes to your existing structure then 😊.

Here's a comparison of a model based on the current and the new dataset.
https://www.thorsten-voice.de/2022/03/20/vergleich-thorsten-aktuell-mit-dem-neuen-modell/

@loganhart02
Contributor

The new recipes for align_tts, vits, wavegrad and wavernn are tested and working. I'm still getting a RuntimeError for speedy_speech: Calculated padded input size per channel: (7). Kernel size: (13). Kernel size can't be greater than actual input size.

If you don't know a fix right away I'd just discard this one for now. Also, I'd be ready to merge now; I don't have anything more to add.

I'll see if I can fix the speedy_speech error real quick and then merge :)

@erogol erogol merged commit a790df4 into coqui-ai:dev May 30, 2022
6 participants