How to execute denoising? #35
Line 60 in 5ec3ab9
Is the denoising process the same as that of the autoencoder? Does it require training with the metric loss first and then fixing the weights to continue training?
Hi @bigpon
Lines 106 to 118 in 5ec3ab9
I suspect that the denoising process doesn't require adversarial parameters such as adv_train_max_steps or adv_batch_length, because I didn't find them in the configuration file config/denoise/symAD_vctk_48000_hop300.yaml.
AudioDec/config/denoise/symAD_vctk_48000_hop300.yaml Lines 174 to 180 in 5ec3ab9
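One way to check this kind of question without reading the whole file is to search a parsed config for keys with a given prefix. The following is only a minimal sketch: `find_keys` is a hypothetical helper, and `example_config` is a made-up stand-in for a parsed YAML file, not the contents of the actual AudioDec config.

```python
def find_keys(config, prefix="adv_"):
    """Return dotted paths of all keys in a nested dict/list that start with `prefix`."""
    found = []

    def walk(node, path):
        if isinstance(node, dict):
            for key, value in node.items():
                here = path + [str(key)]
                if str(key).startswith(prefix):
                    found.append(".".join(here))
                walk(value, here)
        elif isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, path + [str(index)])

    walk(config, [])
    return found


# Made-up stand-in for a parsed config (values are illustrative only).
example_config = {
    "train_max_steps": 200000,
    "adv_train_max_steps": 700000,
    "data": {"adv_batch_length": 9600},
}

print(find_keys(example_config))
```

An empty result for a real config would mean no key with that prefix is present, which is the situation described above.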
Hi,
Lines 44 to 54 in 9cc4e58
AudioDec/config/denoise/symAD_vctk_48000_hop300.yaml Lines 27 to 29 in 9cc4e58
Lines 239 to 255 in 9cc4e58
I executed stage 0 according to submit_denoise.sh. However, I found that, according to the configuration file, exp/autoencoder/symAD_vctk_48000_hop300/checkpoint-200000steps.pkl is loaded as the initial checkpoint during stage 0. Do I need to train this checkpoint in advance (for the new dataset)?
Hi @bigpon
Hi, in the first step, you have to train the decoder for another 500k iterations with GAN. In the final step, you should take the decoder from the one trained with GAN.
Hi @bigpon
Because of the phase misalignment issue (you can check our paper ScoreDec), AudioDec usually achieves a low PESQ score even when the input is clean speech. Using a multi-resolution mel loss can improve PESQ, but it still cannot reach a very high score. In terms of perceptual quality, the output should be OK even though the PESQ score is low. However, since updating only the encoder is a simple approach, it achieves only OK performance, which still falls behind the SOTA speech enhancement methods.
Hi @bigpon
Hi,
Hi @bigpon
Hi, in this case, we want the postfilter to do two things.
Therefore, the target speech is the clean speech without any processing (i.e., the ground truth). I have tried using only Type-I or Type-I + Type-II to train the postfilter. Therefore, I suggest you prepare both (Type-I, clean_speech) and (Type-II, clean_speech) pairs to train the postfilter.
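The pairing suggested above can be sketched as follows. This is only an illustration under stated assumptions: `build_postfilter_pairs` and all file names are hypothetical, and the three lists are assumed to be aligned by utterance (the i-th entry of each list is the same utterance).

```python
def build_postfilter_pairs(type1_files, type2_files, clean_files):
    """Pair every Type-I and Type-II input with its clean (ground-truth) target."""
    pairs = []
    for degraded_files in (type1_files, type2_files):
        if len(degraded_files) != len(clean_files):
            raise ValueError("input and clean lists must have the same length")
        # Each degraded input maps to the unprocessed clean speech.
        pairs.extend(zip(degraded_files, clean_files))
    return pairs


# Hypothetical file names, aligned by utterance.
clean = ["clean/utt1.wav", "clean/utt2.wav"]
type1 = ["type1/utt1.wav", "type1/utt2.wav"]
type2 = ["type2/utt1.wav", "type2/utt2.wav"]

for source, target in build_postfilter_pairs(type1, type2, clean):
    print(source, "->", target)
```

Both input types share the same clean target, so the resulting list has twice as many pairs as there are clean utterances.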
Hi @bigpon
Hi @bigpon
Hi @bigpon
@bigpon Hi
I'm trying to reproduce the denoising code.
https://github.com/facebookresearch/AudioDec?tab=readme-ov-file#bonus-track-denoising
In the paragraph "Prepare the noisy-clean corpus and follow the usage instructions in submit_denoise.sh to run the training and testing", you say to follow submit_denoise.sh, but the commands shown below it use submit_autoencoder.sh. May I ask which one should be used?