add hifi-gan #4

cassiotbatista · 2023-04-24T12:01:10Z

IIUC FastSpeech 2 is just the acoustic model that converts from text (or textual features including phonemes and other metadata) to spectrograms.

However, to actually synthesize speech one needs a vocoder to convert the spectrograms into a time-domain signal. IIRC HiFi-GAN is the one mentioned in the paper, so...
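
To make the two-stage split concrete, here is a rough sketch of how the pieces would fit together. Everything in it is a placeholder: `acoustic_model`, `vocoder`, `synthesize`, `N_MELS`, `HOP_LENGTH` and `FRAMES_PER_PHONEME` are hypothetical names and values picked for illustration, not this repo's API nor HiFi-GAN's. The stubs only return zeros, so the point is the shape contract between the stages, assuming the vocoder emits `HOP_LENGTH` samples per spectrogram frame.

```python
from typing import List

N_MELS = 80             # assumed number of mel bins
HOP_LENGTH = 256        # assumed hop size, in samples per spectrogram frame
FRAMES_PER_PHONEME = 5  # fixed here; a real acoustic model predicts durations


def acoustic_model(phoneme_ids: List[int]) -> List[List[float]]:
    """Stand-in for FastSpeech 2: phoneme IDs -> mel spectrogram [N_MELS][frames]."""
    n_frames = len(phoneme_ids) * FRAMES_PER_PHONEME
    return [[0.0] * n_frames for _ in range(N_MELS)]


def vocoder(mel: List[List[float]]) -> List[float]:
    """Stand-in for HiFi-GAN: mel [N_MELS][frames] -> waveform [frames * HOP_LENGTH]."""
    n_frames = len(mel[0])
    return [0.0] * (n_frames * HOP_LENGTH)


def synthesize(phoneme_ids: List[int]) -> List[float]:
    """Full pipeline: text features -> spectrogram -> time-domain signal."""
    mel = acoustic_model(phoneme_ids)
    return vocoder(mel)


phoneme_ids = [12, 7, 33, 4]  # arbitrary example IDs
mel = acoustic_model(phoneme_ids)
wav = synthesize(phoneme_ids)
print(len(mel), len(mel[0]), len(wav))  # 80 20 5120
```

Without the second stage, the output stops at `mel`, which is not something one can listen to.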

cassiotbatista added the enhancement New feature or request label Apr 24, 2023

cassiotbatista self-assigned this Apr 24, 2023

cassiotbatista mentioned this issue Apr 24, 2023

spectral similarity distance #5

Open
