HifiGAN training -- obvious harmonics in test files #339

Kristopher-Chen · 2022-03-14T03:00:54Z

I trained HifiGAN on the VCTK multi-speaker dataset at a 24 kHz sampling rate. I also normalize the input log-Mel spectrogram (with mean=-4, std=4), and I found obvious harmonics in the test files, as shown below.
Have you ever encountered this? Any suggestions? Thank you!
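
For readers following along, here is a minimal sketch of the kind of fixed-statistics normalization described above. The helper names are hypothetical and not taken from either repository; only mean=-4 and std=4 come from the post. One thing worth checking in a setup like this is that the same statistics are used at training and at inference time.

```python
# Hypothetical helpers; mean=-4 and std=4 are the values quoted in the post.
MEAN, STD = -4.0, 4.0

def normalize_logmel(logmel, mean=MEAN, std=STD):
    """Shift and scale every log-Mel value with fixed global statistics."""
    return [[(v - mean) / std for v in frame] for frame in logmel]

def denormalize_logmel(norm, mean=MEAN, std=STD):
    """Invert normalize_logmel to recover the original log-Mel values."""
    return [[v * std + mean for v in frame] for frame in norm]

# Two toy frames with three Mel bins each (made-up numbers).
frames = [[-8.0, -4.0, 0.0], [-6.0, -2.0, -4.0]]
normalized = normalize_logmel(frames)
restored = denormalize_logmel(normalized)
```

If the vocoder is trained on `normalized` features, whatever produces its inputs at test time would need to apply the identical transform.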

Kristopher-Chen · 2022-03-14T03:02:00Z

BTW, the discriminators' loss is quite small, which may suggest the discriminators are too strong?
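
As background for the question above: HifiGAN trains its discriminators with a least-squares GAN objective. The sketch below is a plain-Python illustration of that objective for a single discriminator, with made-up scores and hypothetical function names; it is not code from either repository. It shows why a loss close to zero means the discriminator is separating real from generated audio very confidently.

```python
def discriminator_loss(real_scores, fake_scores):
    """Least-squares GAN loss for one discriminator:
    push scores on real audio toward 1 and on generated audio toward 0."""
    real_term = sum((1.0 - r) ** 2 for r in real_scores) / len(real_scores)
    fake_term = sum(f ** 2 for f in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_adversarial_loss(fake_scores):
    """Generator side of the same objective: push scores on generated audio toward 1."""
    return sum((1.0 - f) ** 2 for f in fake_scores) / len(fake_scores)

# Made-up scores: a confident discriminator vs. an undecided one.
confident = discriminator_loss([0.95, 0.9], [0.05, 0.1])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
generator_view = generator_adversarial_loss([0.05, 0.1])
```

With these toy numbers the confident discriminator has a much smaller loss than the undecided one, while the generator's adversarial loss is large in the confident case.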

kan-bayashi · 2022-03-14T11:02:58Z

Did you use this repository? Or is this a general question about HifiGAN?

Kristopher-Chen · 2022-03-15T03:39:15Z

Hi, actually I referred to both your repository and the official version. I trained for several epochs with the official code and found the discriminators' losses to be around 0.1~0.2, but the output also had obvious harmonics. So I wonder whether this happens in the early training stages? Still, the discriminators' loss looks quite strange...

kan-bayashi · 2022-03-15T05:29:59Z

OK. How many iterations did you run? In my experiments, around 200k iterations are enough to generate reasonable speech.
I'm not familiar with the official implementation, but in my case the official optimizer setting did not work well.
The following issue may help you.
#278

Kristopher-Chen · 2022-04-06T09:55:57Z

There seems to be something wrong with the discriminators. The losses get smaller after more epochs. The normal values would be around 0.1~0.2 for each discriminator, but mine are as shown below.

MlWoo · 2023-02-22T09:56:19Z

@Kristopher-Chen have you resolved the problems?

Kristopher-Chen · 2023-05-11T01:43:37Z

When I referred to the original code, this problem was solved.

As for the discriminator losses, the 2nd and 3rd MSD losses easily become small, while the others look normal.

Moreover, the feature map loss keeps growing gradually. But interestingly, the generated samples sound natural...
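
For context on the last observation, here is a rough sketch of how a feature-matching style loss is typically computed: the mean absolute difference between discriminator feature maps for real and generated audio, summed over layers. The function name and the toy numbers are illustrative assumptions, and any loss weighting used in the actual training code is left out.

```python
def feature_matching_loss(real_feats, fake_feats):
    """Sum over layers of the mean absolute difference between the
    discriminator's feature maps for real and generated audio."""
    total = 0.0
    for real_layer, fake_layer in zip(real_feats, fake_feats):
        diffs = [abs(r - f) for r, f in zip(real_layer, fake_layer)]
        total += sum(diffs) / len(diffs)
    return total

# Toy feature maps for two layers (flattened, made-up numbers).
real = [[1.0, 2.0], [0.5, 0.5, 1.0]]
fake = [[1.5, 2.0], [0.5, 1.5, 1.0]]
loss = feature_matching_loss(real, fake)
identical = feature_matching_loss(real, real)
```

Because the features come from the discriminators, which are themselves still being trained, the value of this loss may drift over time without tracking perceived audio quality directly.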

kan-bayashi added the "question" label (Further information is requested) on Mar 14, 2022
