Simplified upsampling #4
Hi @geneing, thanks for all your hard work! I was wondering why you decided to abandon the simplified upsampling in your model_simplification branch. Was the audio quality significantly worse?

@bshall Well, the main reason for the simplified upsampling was to improve data flow. The upsampling part contains a 5-tap convolution, which requires padding the input mels with at least 2 empty frames on each side. This adds a significant amount of work when doing parallel synthesis (splitting the input mels in time and synthesizing the pieces in parallel, since each piece has to be padded), and one has to be very careful when stitching the padded waveform pieces back together. It turned out that the network-based upsampling actually shifts the mels slightly in time, which simple interpolation wasn't doing; this resulted in slightly lower-quality speech. Keep in mind that upsampling is a tiny part of the overall timing; most of the work is done in the RNN and the post-net FC layers. I'm starting to think about implementing streaming synthesis for the C++ library (i.e. don't wait for all the mel frames to be ready, but generate as mel frames are added), so I may take another look at upsampling to avoid doing convolutions.
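For anyone following along, here is a minimal PyTorch sketch of the padding bookkeeping described above. Only the 5-tap kernel comes from the comment; the channel count, chunk size, and all variable names are assumptions for illustration, not code from the repository:

```python
import torch
import torch.nn.functional as F

# Hypothetical 5-tap convolution over the time axis, standing in for the
# upsampling network's conv. 80 mel channels and all shapes are assumed.
conv = torch.nn.Conv1d(in_channels=80, out_channels=80, kernel_size=5)

mels = torch.randn(1, 80, 1000)  # (batch, n_mels, frames)

# Full-utterance pass: pad 2 frames on each side so the output keeps
# the same number of frames as the input.
full = conv(F.pad(mels, (2, 2)))

# Chunked (parallel) pass: each chunk needs 2 frames of real context
# from its neighbours (or zero padding at the utterance edges), and the
# stitched result only matches the full pass if that context is handled
# exactly right -- the bookkeeping the comment refers to.
chunk_size = 250
chunks = []
for start in range(0, mels.size(-1), chunk_size):
    lo = max(0, start - 2)
    hi = min(mels.size(-1), start + chunk_size + 2)
    piece = F.pad(mels[:, :, lo:hi],
                  (2 - (start - lo), 2 - (hi - start - chunk_size)))
    chunks.append(conv(piece))
stitched = torch.cat(chunks, dim=-1)

assert torch.allclose(full, stitched)  # identical only with careful padding
```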
Thanks for the response @geneing. Yeah, streaming synthesis would be really cool. I was wondering whether simple "nearest" upsampling would be good enough to replace the upsampling network.
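A sketch of what that replacement might look like, assuming a hypothetical hop_length of 275 frames per mel step; repeat_interleave is just one way to express nearest-neighbour upsampling, and it needs no padding, so chunks can be synthesized independently:

```python
import torch

hop_length = 275                 # assumed value, not from the repository
mels = torch.randn(1, 80, 100)   # (batch, n_mels, frames)

# Nearest-neighbour upsampling: each mel frame is repeated hop_length
# times along the time axis, with no learned parameters.
upsampled = mels.repeat_interleave(hop_length, dim=-1)
print(upsampled.shape)           # torch.Size([1, 80, 27500])
```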