Assertion Error at len(mixture) == len(clean) == len(enhanced) #7

Open
mnabihali opened this issue May 28, 2020 · 30 comments

Comments

@mnabihali

When I run the code, it fails on this assertion:
assert len(mixture) == len(clean) == len(enhanced)
I printed the length of each signal and found that enhanced and clean have the same length, but mixture is longer than both.

I hope you can help me as soon as possible.

@Lerry123

Hello, have you solved this problem? I have the same problem.

@mnabihali
Author

mnabihali commented Jul 17, 2020 via email

@Lerry123

Are you Chinese? We could chat on QQ. My English is not good.

@Lerry123

I found that the code pads the mixture, but it does not pad the clean or enhanced signals.
Which dataset do you use, and how are the results?
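
The mismatch described here follows from simple arithmetic: if only the mixture is zero-padded up to a multiple of the model's input length, it ends up longer than the clean signal. A minimal sketch, where the helper name `padding_needed` and the numbers (a 16384-sample input length and a 40000-sample utterance) are illustrative assumptions and not values taken from the repository:

```python
def padding_needed(num_samples: int, sample_length: int) -> int:
    """Number of zeros to append so num_samples becomes a multiple of sample_length."""
    remainder = num_samples % sample_length
    return 0 if remainder == 0 else sample_length - remainder


# Hypothetical values, for illustration only.
sample_length = 16384
clean_len = 40000

pad = padding_needed(clean_len, sample_length)
mixture_len = clean_len + pad  # only the mixture is padded

print(pad)                       # 9152
print(mixture_len)               # 49152
print(mixture_len == clean_len)  # False, so the length assertion fails
```

Any utterance whose length is not already a multiple of the input length would trigger the assertion in the same way.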

@mnabihali
Author

mnabihali commented Jul 18, 2020 via email

@mnabihali
Author

mnabihali commented Jul 18, 2020 via email

@Lerry123

I am also using the VCTK dataset. I haven't found a solution yet. If I find one, I will tell you.

@Lerry123

I padded the clean, enhanced, and mixture signals, and that problem is solved, but now I have a new one. Computing STOI raises an error:
AttributeError: module 'numpy' has no attribute 'gcd'
I haven't found a solution, so I only computed PESQ.
# Metric
# stoi_c_n.append(compute_STOI(clean, mixture, sr=16000))
# stoi_c_e.append(compute_STOI(clean, enhanced, sr=16000))
pesq_c_n.append(compute_PESQ(clean, mixture, sr=16000))
pesq_c_e.append(compute_PESQ(clean, enhanced, sr=16000))
Did you have the same problem?
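
For context, `np.gcd` was added in NumPy 1.15, so this AttributeError usually means the installed NumPy is older than that, and upgrading NumPy is the likely fix. The sketch below shows the kind of calculation a gcd is typically needed for in a STOI pipeline: reducing the resampling ratio from the 16 kHz input rate to the 10 kHz rate at which STOI is defined. It uses the standard library's `math.gcd`; the function name `resample_ratio` is hypothetical, and whether `compute_STOI` does exactly this internally is an assumption.

```python
import math
from typing import Tuple


def resample_ratio(fs_in: int, fs_out: int) -> Tuple[int, int]:
    """Return the (up, down) integer factors for resampling fs_in -> fs_out."""
    g = math.gcd(fs_in, fs_out)
    return fs_out // g, fs_in // g


up, down = resample_ratio(16000, 10000)
print(up, down)  # 5 8
```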

@mnabihali
Author

mnabihali commented Jul 20, 2020 via email

@mnabihali
Author

@Lerry123
Can you tell me how you padded the signals? I can try to solve the NumPy issue.

Thanks

@Lerry123

trainer.py:

# sample_length (the fixed input length of the model) is assumed to be defined earlier in this method.
for i, (mixture, clean, name) in enumerate(self.validation_data_loader):
    assert len(name) == 1, "Only support batch size is 1 in enhancement stage."
    name = name[0]
    padded_length = 0

    mixture = mixture.to(self.device)  # [1, 1, T]
    clean = clean.to(self.device)

    # The input of the model should be fixed length, so zero-pad both the
    # mixture and the clean signal up to the next multiple of sample_length.
    if mixture.size(-1) % sample_length != 0:
        padded_length = sample_length - (mixture.size(-1) % sample_length)
        padding = torch.zeros(1, 1, padded_length, device=self.device)
        mixture = torch.cat([mixture, padding], dim=-1)
        clean = torch.cat([clean, padding], dim=-1)

    assert mixture.size(-1) % sample_length == 0 and mixture.dim() == 3
    mixture_chunks = list(torch.split(mixture, sample_length, dim=-1))

    enhanced_chunks = []
    for chunk in mixture_chunks:
        enhanced_chunks.append(self.model(chunk).detach().cpu())

    # The concatenated output already has the padded length, so it is left
    # untrimmed and all three signals end up with the same length.
    enhanced = torch.cat(enhanced_chunks, dim=-1)  # [1, 1, T + padded_length]

    enhanced = enhanced.reshape(-1).numpy()
    clean = clean.cpu().numpy().reshape(-1)
    mixture = mixture.cpu().numpy().reshape(-1)

@mnabihali
Author

mnabihali commented Jul 21, 2020 via email

@Lerry123

Thank you! It was solved by your suggestion.

@mnabihali
Author

mnabihali commented Jul 22, 2020 via email

@Lerry123

Can I get your email? Have you run the test? My results are very confusing.

@diff7

diff7 commented Aug 29, 2020

I think the problem is here:

enhanced = enhanced if padded_length == 0 else enhanced[:, :, :-padded_length]

should be:

if padded_length != 0:
    enhanced = enhanced[:, :, :-padded_length]
    mixture = mixture[:, :, :-padded_length]
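
The idea behind this fix is to pad only so that the model can process fixed-length chunks, and then to trim every padded signal back to the original length before computing metrics. A minimal sketch using plain Python lists in place of tensors so the lengths are easy to check; `enhance_in_chunks` and `identity_model` are hypothetical stand-ins, not code from the repository:

```python
from typing import Callable, List


def enhance_in_chunks(
    mixture: List[float],
    sample_length: int,
    model: Callable[[List[float]], List[float]],
) -> List[float]:
    """Zero-pad, run the model chunk by chunk, then trim back to the input length."""
    original_length = len(mixture)
    padded_length = (-original_length) % sample_length  # 0 if already a multiple
    padded = mixture + [0.0] * padded_length

    enhanced: List[float] = []
    for start in range(0, len(padded), sample_length):
        enhanced.extend(model(padded[start:start + sample_length]))

    # Trim by the original length rather than with [:-padded_length], which
    # would return an empty result when padded_length is 0.
    return enhanced[:original_length]


def identity_model(chunk: List[float]) -> List[float]:
    """Hypothetical stand-in for the network: returns the chunk unchanged."""
    return list(chunk)


mixture = [0.1] * 10
clean = [0.0] * 10
enhanced = enhance_in_chunks(mixture, sample_length=4, model=identity_model)
assert len(mixture) == len(clean) == len(enhanced)
```

Because the padded copy is kept local to the function, the mixture passed to the metrics is never longer than the clean signal.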

@Lerry123

That's the problem, as you said. When I use the VCTK dataset, the test results are very bad and the speech is distorted.

@diff7

diff7 commented Sep 9, 2020

@Lerry123 did you try the same dataset?
And what are the advantages of training on VCTK?

I am training on the same dataset, 500 epochs so far, and the quality is not great. PESQ is quite low at 1.75 and STOI is 0.85, and yes, the sound is distorted. I will tweak some parameters and see if that gives a boost.

@Lerry123

I set sr=16000 in waveform_dataset.py and waveform_dataset_enhancement.py, and the PESQ is now 2.63. That is my best result.

waveform_dataset.py:

line 65: mixture, _ = librosa.load(os.path.abspath(os.path.expanduser(mixture_path)), sr=16000)
line 66: clean, _ = librosa.load(os.path.abspath(os.path.expanduser(clean_path)), sr=16000)
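
One plausible reason this helps, stated as an assumption about the earlier setup: if the audio was previously loaded at a rate other than 16 kHz (`librosa.load` resamples to 22050 Hz when `sr` is omitted, and `sr=None` keeps the file's native rate) while the metrics are called with `sr=16000`, the signal is interpreted at the wrong rate and appears stretched in time. The arithmetic for a hypothetical 2-second clip:

```python
def num_samples(duration_s: float, sr: int) -> int:
    """Number of samples in a clip of the given duration at sample rate sr."""
    return int(round(duration_s * sr))


duration_s = 2.0   # hypothetical clip length
load_sr = 22050    # librosa.load default when sr is not passed
metric_sr = 16000  # rate passed to compute_PESQ / compute_STOI

loaded = num_samples(duration_s, load_sr)
print(loaded)              # 44100
print(loaded / metric_sr)  # 2.75625 seconds if read as 16 kHz audio
```

Loading with `sr=16000` keeps the data, the model input, and the metric calls on the same rate.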

@diff7

diff7 commented Sep 11, 2020

@Lerry123 got it, thanks!
PESQ = 2.63, is that with VCTK?

@Lerry123

Yes

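@meisanhai

meisanhai commented Jan 1, 2022

How did you change the parameters? I am using the same dataset as SEGAN, as in the original paper, and the output of the trained model is severely distorted. What is going on? Could you help explain? Thanks, @Lerry123 @diff7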

@mnabihali
Author

mnabihali commented Jan 1, 2022 via email

@mnabihali
Author

mnabihali commented Jan 1, 2022 via email

@mnabihali
Author

mnabihali commented Jan 1, 2022 via email

@meisanhai

@mnabihali
Thank you very much! I changed the sampling rate in waveform_dataset.py to 16 kHz, and the results got better (:-|

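@VaeFlashMe

I padded the clean, enhanced, and mixture signals, and that problem is solved, but now I have a new one. Computing STOI raises an error:
AttributeError: module 'numpy' has no attribute 'gcd'
I haven't found a solution, so I only computed PESQ.
# Metric
# stoi_c_n.append(compute_STOI(clean, mixture, sr=16000))
# stoi_c_e.append(compute_STOI(clean, enhanced, sr=16000))
pesq_c_n.append(compute_PESQ(clean, mixture, sr=16000))
pesq_c_e.append(compute_PESQ(clean, enhanced, sr=16000))
Did you have the same problem?

Hello, may I ask how you solved the error at the tenth epoch? I have been working on it for a long time without success.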

@andyye1999

I think the problem is here:

enhanced = enhanced if padded_length == 0 else enhanced[:, :, :-padded_length]

should be:

if padded_length != 0:
    enhanced = enhanced[:, :, :-padded_length]
    mixture = mixture[:, :, :-padded_length]

Thank you so much.

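@renxuezhang

Are you Chinese? We could chat on QQ. My English is not good.

Hello, I am a second-year master's student and I want to reproduce this code as the basis for a new contribution. I have run into some problems while reproducing it. Would you be able to take a look? Could we connect on QQ to discuss?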

7 participants