Adding neural HMM TTS #2271

shivammehta25 · 2023-01-09T11:10:57Z

Neural HMM TTS (2021) is a parameter-efficient version of OverFlow. It is the probabilistic spectrogram-generation model that introduced neural HMMs into a TTS architecture, in "Neural HMMs are all you need (for high-quality attention-free TTS)". Because it has half the number of parameters, its synthesis quality is suboptimal (though comparable to Tacotron2 without the Postnet), but it learns to speak from less data and is significantly faster than attention-based methods.

Paper arXiv
Paper ICASSP 2022
Demo: Webpage

shivammehta25 and others added 30 commits November 26, 2022 22:09

Adding encoder

405bffe

currently modifying hmm

d607993

Adding hmm

a324920

Adding overflow

8628648

Adding overflow setting up flat start

6ec83c4

Removing runs

783a982

adding normalization parameters

10f15e0

Fixing models on same device

aff8b1f

Training overflow and plotting evaluations

62941d6

Adding inference

f448ea4

At the end of epoch the test sentences are coming on cpu instead of gpu

ff33837

Adding figures from model during training to monitor

3edb0d2

reverting tacotron2 training recipe

5fc800c

fixing inference on gpu for test sentences on config

427dfe5

moving helpers and texts within overflows source code

ecc12c6

renaming to overflow

b86f3f8

moving loss to the model file

995ee93

Fixing the rename

5b0fe46

Model training but not plotting the test config sentences's audios

5377f87

Formatting logs

bd5be6c

Changing model name to camelcase

755aa6f

Fixing test log

1350a4b

Fixing plotting bug

3c986fd

Adding some tests

4a5b1a0

Merge branch 'coqui-ai:dev' into dev

5b1dabc

Adding more tests to overflow

f43d7e3

Adding all tests for overflow

c3d0167

making changes to camel case in config

ddefe34

Adding information about parameters and docstring

c2df9f3

removing compute_mel_statistics moved statistic computation to the model instead

9927434

shivammehta25 and others added 26 commits December 23, 2022 10:21

Fixing test log

6e08e4f

Fixing plotting bug

9394ce0

Adding some tests

e115361

Adding more tests to overflow

7a541b9

Adding all tests for overflow

1dccc29

making changes to camel case in config

1b1bf1f

Adding information about parameters and docstring

916b98e

removing compute_mel_statistics moved statistic computation to the model instead

6eff37c

Added overflow in readme

8a8dd1d

Adding more test cases, now it doesn't saves transition_p like tensor and can be dumped as json

e738c0c

Handle espeak 1.48.15 (coqui-ai#2203)

479c0cf

Python API implementation (coqui-ai#2195)

4f02e2c

* Draft implementation
* Fix style
* Add api tests
* Fix lint
* Update docs
* Update tests
* Set env
* Fixup
* Fixup
* Fix lint
* Revert

Update README (coqui-ai#2204)

89b9868

Adding missing key to formatter (coqui-ai#2194)

684adb0

quick fix for coqui-ai#2156. added 'root_path' key.

Add Original YourTTS vocabulary for full transfer learning (coqui-ai#2206)

a0be902

uncommenting the approximation to stablize the training

f3fe409

Adding pre-trained Overflow model (coqui-ai#2211)

aedd795

* Adding pretrained Overflow model
* Stabilize HMM
* Fixup model manager
* Return `audio_unique_name` by default
* Distribute max split size over datasets
* Fixup eval_split_size
* Make style

Fixup overflow (coqui-ai#2218)

253b03f

* Update overflow config
* Pulling shuffle and drop_last from config
* Print training stats for overflow

Bump up to v0.10.0

c2ce4fb

Add Ukrainian LADA (female) voice

fd5ad8c

Merge branch 'coqui-ai:dev' into dev

1260c7f

Merge branch 'coqui-ai:dev' into dev

f73cd29

Merge branch 'dev' of github.com:shivammehta25/TTS into dev

2abbc97

Adding a config flag to train neural HMM TTS instead of overflow

790b846

Backwards compatibility: Fixing model zoo if the flag is not set, set it

a8d0b22
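The last two commits describe a config flag that selects neural HMM TTS instead of OverFlow, plus a fallback that sets the flag when an older config lacks it. The pattern can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in this PR: the class `SketchOverflowConfig`, the flag name `neural_hmm_only`, the `hidden_channels` field, and the helper `load_config` are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchOverflowConfig:
    """Hypothetical stand-in for a model config; not the Coqui TTS class."""

    hidden_channels: int = 512
    # Hypothetical flag name. False keeps the original OverFlow behaviour,
    # True would train neural HMM TTS instead.
    neural_hmm_only: bool = False


def load_config(raw: dict) -> SketchOverflowConfig:
    """Build a config from a dict, tolerating configs saved before the flag existed."""
    known = {f.name for f in fields(SketchOverflowConfig)}
    unknown = set(raw) - known
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    # Any key missing from `raw` falls back to the dataclass default, so a
    # config written before the flag was added still loads as OverFlow.
    return SketchOverflowConfig(**raw)


# An older config with no flag, and a newer one that sets it explicitly.
old = load_config({"hidden_channels": 512})
new = load_config({"hidden_channels": 512, "neural_hmm_only": True})
print(old.neural_hmm_only, new.neural_hmm_only)
```

Defaulting the flag to the pre-existing behaviour is what keeps previously released model-zoo configs loadable without edits; whether the PR uses a default value or an explicit check at load time is not visible from the commit titles.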

shivammehta25 commented Jan 9, 2023

TTS/VERSION

shivammehta25 commented Jan 9, 2023

TTS/tts/models/overflow.py

shivammehta25 closed this Jan 9, 2023

shivammehta25 mentioned this pull request Jan 9, 2023

Adding neural HMM TTS Model #2272

Merged
