@sanchit-gandhi (Contributor) commented Sep 4, 2023

What does this PR do?

Fixes #24085 (comment). In short, the VITS initialisation test was flaky: some of the parameters initialised with uniform values would exceed the initialiser range [-1, 1].

These violating parameters always sat in the later stages of the HiFi GAN vocoder, so we can assume this was due to an accumulation of numerical errors.

This PR halves the size of the HiFi GAN vocoder, mitigating the accumulation of these errors. The test now passes over 20 iterations, but we should keep an eye on it in case it proves flaky over a larger number of runs.


result = model(input_ids, attention_mask=attention_mask)
- self.parent.assertEqual(result.waveform.shape, (self.batch_size, 11008))
+ self.parent.assertEqual((self.batch_size, 624), result.waveform.shape)
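For context on why the expected length changes: a HiFi GAN vocoder upsamples along the time axis with a stack of transposed convolutions, so the waveform length is the number of input frames times the product of the upsample rates. A minimal sketch with purely illustrative numbers (the test config's actual frame count and rates are not shown in this thread):

```python
from math import prod

def hifigan_output_length(num_frames, upsample_rates):
    # Each transposed-conv stage multiplies the time axis by its upsample rate
    return num_frames * prod(upsample_rates)

# Illustrative only: 43 frames upsampled by rates (8, 8, 2, 2), i.e. 256x,
# gives 43 * 256 = 11008 samples. Shrinking the vocoder changes this product,
# which is why the expected shape in the assertion changes too.
print(hifigan_output_length(43, (8, 8, 2, 2)))  # 11008
```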
Inline comment from @sanchit-gandhi (author):

Output dims change after modifying the HiFi GAN vocoder

@HuggingFaceDocBuilderDev commented Sep 4, 2023

The documentation is not available anymore as the PR was closed or merged.

@ydshieh (Collaborator) left a comment:

Works for me.

I ran it 100 times and got 8 failures.

Maybe add something like

@is_flaky(max_attempts=3, description="...(some comment)...")
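(transformers ships an is_flaky decorator in testing_utils that re-runs a failing test a few times before reporting a failure. As a rough sketch of the idea, not the actual transformers implementation, which differs in detail:)

```python
import functools

def is_flaky(max_attempts=5, description=None):
    """Sketch of a retry decorator for flaky tests.

    Illustrative only; the real decorator lives in transformers.testing_utils.
    `description` would typically be logged alongside each retry.
    """
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of retries: surface the last failure
        return wrapper
    return decorator
```

With an ~8% standalone failure rate, three attempts bring the chance of a spurious CI failure down to roughly 0.05%, while a genuinely broken test still fails after every attempt.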

@ylacombe (Contributor) commented Sep 4, 2023

Hi @sanchit-gandhi, I'm not sure how an accumulation of numerical errors could appear during initialization. Could you expand on this a bit?

Also, could the error come from init_weights's lack of a ConvTranspose1d initialization?
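(To make the concern concrete, here is a toy plain-Python sketch, with stand-in classes rather than the actual VITS modelling code, of an init hook that explicitly handles Conv1d but lets ConvTranspose1d fall through to the framework default:)

```python
# Stand-in classes; in the real model these would be torch.nn modules.
class Conv1d:
    pass

class ConvTranspose1d:
    pass

def init_weights(module):
    """Toy weight-init hook that misses ConvTranspose1d."""
    if isinstance(module, Conv1d):
        return "kaiming_normal"  # explicitly initialised
    # ConvTranspose1d (and anything else unhandled) falls through here
    return "framework_default"

print(init_weights(Conv1d()))           # kaiming_normal
print(init_weights(ConvTranspose1d()))  # framework_default
```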

@amyeroberts (Contributor) left a comment:

Thanks for fixing!

@sanchit-gandhi (author) commented:

The failing module is actually a vanilla conv1d layer (rather than a conv transpose):

self.convs1 = nn.ModuleList(

I haven't looked too deeply into it, but I presumed the error was occurring because the conv layers use the Kaiming-normal initialiser:

nn.init.kaiming_normal_(module.weight)

Feel free to dive into it more if you want to find the root cause! But based on the values we're getting, it just looks like an instance of flakiness (the mean weights are 1.03 instead of 1.0).
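(To illustrate how a Kaiming-normal initialisation can produce weights outside a [-1, 1] bound, here is a hedged numpy sketch; the shapes are hypothetical and this is not the transformers test itself. He/Kaiming normal draws from N(0, 2/fan_in), so small-fan-in layers occasionally sample weights with magnitude above 1, while large-fan-in layers essentially never do:)

```python
import numpy as np

def kaiming_normal(shape, fan_in, rng):
    # He / Kaiming normal: std = gain / sqrt(fan_in), gain = sqrt(2) for ReLU
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=shape)

rng = np.random.default_rng(0)

# Hypothetical conv weight shapes (out_channels, in_channels, kernel_size);
# for a Conv1d, fan_in = in_channels * kernel_size.
w_narrow = kaiming_normal((256, 3, 3), fan_in=3 * 3, rng=rng)    # std ~ 0.47
w_wide = kaiming_normal((256, 512, 3), fan_in=512 * 3, rng=rng)  # std ~ 0.036

print(np.abs(w_narrow).max())  # comfortably above 1
print(np.abs(w_wide).max())    # well below 1
```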

sanchit-gandhi and others added 2 commits September 4, 2023 13:19
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
@sanchit-gandhi merged commit d750eff into huggingface:main on Sep 4, 2023
@sanchit-gandhi deleted the vits-test branch on Sep 4, 2023 at 16:09
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
* [VITS] Fix init test

* add flaky decorator

* style

* max attempts

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* style

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>