A Generative Flow for Text-to-Speech via Monotonic Alignment Search

eirene-aisa/glow-tts-practice

Glow-TTS Official Repository

Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Multispeaker-enabled Glow-TTS

Glow-TTS with a Korean cleaner and multispeaker training enabled (with reference to several related issues).

This repo is recommended for use as a reference for multispeaker training.

_custom : run with Korean cleaners.

_custom_multi : run with Korean cleaners, for multispeaker training.

A single-speaker Korean demo with KSS is available. link

Korean cleaner

Solved issues

  • Due to the apex (commit: 37cdaf4) dependency, I used pytorch 1.3.0 (instead of 1.2.0).

  • For the multispeaker setting

    • The filelist should be in the following format.

      audio_path(*.wav)|speaker_id|transcript related issue

    • Adding n_speakers and gin_channels to the config is recommended. related issue

    • (TextMelLoader, TextMelCollate) should be replaced with (TextMelSpeakerLoader, TextMelSpeakerCollate) in init.py and train.py.

      Also, change (x, x_lengths, y, y_lengths) to (x, x_lengths, y, y_lengths, g).

    • The speaker information (g) should be passed explicitly to FlowGenerator. related issue

      generator(x=x, x_lengths=x_lengths, y=y, y_lengths=y_lengths, g=g, gen=False) (I do not know why)

  • 'Gradient overflow' might be caused by a problem in the data. related issue
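
The multispeaker filelist format above (audio_path|speaker_id|transcript) can be sanity-checked before training. The following is a minimal sketch, not code from this repo: the names `Utterance`, `parse_filelist_line`, and `parse_filelist` are hypothetical. Which data problems actually trigger 'Gradient overflow' is not stated here, so the checks only cover malformed entries.

```python
from typing import Iterable, List, NamedTuple


class Utterance(NamedTuple):
    """One filelist entry (hypothetical helper type, not part of the repo)."""
    audio_path: str
    speaker_id: int
    transcript: str


def parse_filelist_line(line: str) -> Utterance:
    """Parse 'audio_path(*.wav)|speaker_id|transcript' and validate each field."""
    # Split into at most three fields so a '|' inside the transcript is preserved.
    parts = line.rstrip("\n").split("|", 2)
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}: {line!r}")
    audio_path, speaker_id, transcript = (p.strip() for p in parts)
    if not audio_path.endswith(".wav"):
        raise ValueError(f"audio path is not a .wav file: {audio_path!r}")
    if not transcript:
        raise ValueError(f"empty transcript for {audio_path!r}")
    # int() raises ValueError if the speaker id is not an integer.
    return Utterance(audio_path, int(speaker_id), transcript)


def parse_filelist(lines: Iterable[str]) -> List[Utterance]:
    """Parse every non-blank line of a filelist."""
    return [parse_filelist_line(line) for line in lines if line.strip()]


if __name__ == "__main__":
    # Example paths and speaker ids are made up for illustration.
    sample = ["wavs/spk0_0001.wav|0|안녕하세요.", "wavs/spk1_0001.wav|1|반갑습니다."]
    for utt in parse_filelist(sample):
        print(utt.speaker_id, utt.audio_path, utt.transcript)
```

Running a check like this over the training and validation filelists catches missing fields, non-integer speaker ids, and empty transcripts early.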

1. Environments (edited)

  • Python==3.6.9
  • pytorch==1.3.0
  • cython==0.29.12
  • librosa==0.7.1
  • numpy==1.16.4
  • scipy==1.5.4
  • nltk==3.6.5

2. Pre-requisites

Please check the official repository.

3. Training Example

sh train_custom_multi_ddi.sh configs/base.json base
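
Before running the command above for multispeaker training, the config passed to the script needs the n_speakers and gin_channels entries mentioned under "Solved issues". The sketch below shows one way to add them to an already loaded config dictionary. Placing the keys under a "model" section is an assumption about the config layout, and the helper name `add_speaker_settings` and the values 4 and 256 are made up for illustration.

```python
import json
from typing import Any, Dict


def add_speaker_settings(config: Dict[str, Any], n_speakers: int, gin_channels: int) -> Dict[str, Any]:
    """Return a copy of `config` with multispeaker settings added.

    Assumption: the settings live under a "model" section of the config.
    """
    if n_speakers < 2:
        raise ValueError("multispeaker training needs at least 2 speakers")
    if gin_channels <= 0:
        raise ValueError("gin_channels must be positive")
    updated = dict(config)
    model = dict(updated.get("model", {}))  # copy so the input is not mutated
    model["n_speakers"] = n_speakers
    model["gin_channels"] = gin_channels
    updated["model"] = model
    return updated


if __name__ == "__main__":
    # Placeholder config; a real one would be read from the JSON file
    # given to the training script.
    base = {"model": {"hidden_channels": 192}}
    multi = add_speaker_settings(base, n_speakers=4, gin_channels=256)
    print(json.dumps(multi, indent=2))
```

Set n_speakers to cover every speaker_id that appears in the filelist.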

4. Inference Example

See inference.ipynb
