A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement

Performance

Model performance on a private dataset, for reference only.

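Each score column is labeled with its dB condition (5 dB or 10 dB) and the distance to the microphone (0.5 m, 1 m, or 2 m).

| Experiment | 5 dB, 0.5 m | 5 dB, 1 m | 5 dB, 2 m | 10 dB, 0.5 m | 10 dB, 1 m | 10 dB, 2 m | Average | Comment |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mixture | 2.33 | 2.17 | 1.85 | 2.44 | 2.27 | 1.94 | 2.167 | |
| LSTM | 2.62 | 2.49 | 2.02 | 2.71 | 2.53 | 2.13 | 2.417 | |
| Our implementation | 2.630 | 2.458 | 2.086 | 2.729 | 2.527 | 2.172 | 2.434 | |
| Our implementation (LN) | 2.703 | 2.461 | 1.961 | 2.796 | 2.548 | 2.181 | 2.442 | Replace all batch norm with layer norm |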
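
The Average column appears to be the arithmetic mean of the six per-condition scores in each row. The following is a minimal Python sketch that recomputes it from the table values. The names `scores` and `row_average` are illustrative and are not part of the repository.

```python
# Scores copied from the table above, in column order:
# 5 dB at 0.5 m, 1 m, 2 m, then 10 dB at 0.5 m, 1 m, 2 m.
scores = {
    "Mixture": [2.33, 2.17, 1.85, 2.44, 2.27, 1.94],
    "LSTM": [2.62, 2.49, 2.02, 2.71, 2.53, 2.13],
    "Our implementation": [2.630, 2.458, 2.086, 2.729, 2.527, 2.172],
    "Our implementation (LN)": [2.703, 2.461, 1.961, 2.796, 2.548, 2.181],
}


def row_average(values):
    """Arithmetic mean of one table row, rounded to three decimals."""
    return round(sum(values) / len(values), 3)


# Hypothetical helper output: one recomputed average per experiment.
averages = {name: row_average(values) for name, values in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.3f}")
```

Comparing the printed values with the Average column is a quick way to catch transcription slips when new rows are added.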

Dependencies

Python 3
torch==1.1.0
librosa==0.7.0
SoundFile==0.10.2
tensorboard==1.14.0
tensorboard==1.13.1 (for visualization only)
pypesq==1.0 (install with: pip install https://github.com/vBaiCai/python-pesq/archive/master.zip)
pystoi==0.2.2
matplotlib==3.1.0
tqdm==4.32.2
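
One way to confirm that an environment matches these pins is to compare them with the installed versions. The sketch below is an assumption about how that could be done and is not a script shipped with the repository. The names `PINS`, `parse_pins`, `installed_version`, and `mismatches` are hypothetical. The two tensorboard entries are left out because the list gives two different versions.

```python
import re

# Pins copied from the dependency list above (Python and tensorboard omitted).
PINS = """
torch==1.1.0
librosa==0.7.0
SoundFile==0.10.2
pypesq==1.0
pystoi==0.2.2
matplotlib==3.1.0
tqdm==4.32.2
"""


def parse_pins(text):
    """Turn 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z0-9_.\-]+)==([A-Za-z0-9_.\-]+)", line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # standard library from Python 3.8
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # setuptools fallback for older interpreters
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def mismatches(pins):
    """Map each pinned name to (wanted, found) where the two differ."""
    result = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        if found != wanted:
            result[name] = (wanted, found)
    return result


if __name__ == "__main__":
    for name, (wanted, found) in mismatches(parse_pins(PINS)).items():
        print(f"{name}: want {wanted}, found {found or 'not installed'}")
```

An empty output means every listed package is installed at the pinned version.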
