Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch
If you have any questions, please open an issue.
If you find this project helpful, please consider giving it a star.
Demo Pages: Results of pure speech separation model
- 2020-02-01: Read the paper "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation". Notes are available as a Zhihu article, "Reading Notes: Dual-path RNN for Speech Separation", and as a blog article, "Reading Notes: Dual-path RNN for speech separation". Both articles are interpretations of the paper. Questions and discussion are welcome.
- 2020-02-02: Completed the data preprocessing and dataset code. Dataset code: /data_loader/Dataset.py
- 2020-02-03: Completed the Conv-TasNet framework (updated /model/model.py, Trainer_Tasnet.py, Train_Tasnet.py).
- 2020-02-07: Completed the training code (updated /model/model_rnn.py). Test parameters and some details are still being adjusted.
- 2020-02-08: Fixed a bug in the code.
- 2020-02-11: Completed the testing code.
We use the WSJ0 dataset for training, validation, and testing. Below are the data download link and the audio mixing code for WSJ0.
- To train Conv-TasNet, first generate the scp file using the following command. Each line of the scp file contains "filename && path", that is, a filename and its path.
python create_scp.py
- Then you can modify the training and model parameters through "config/Conv_Tasnet/train.yml".
cd config/Conv_Tasnet
vim train.yml
- Then use the following command in the root directory to train the model.
python train_Tasnet.py --opt config/Conv_Tasnet/train.yml
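The scp step above can be sketched as follows. This is a minimal illustration and not the repository's create_scp.py: the function names `write_scp` and `read_scp` are hypothetical, and the use of a single space to separate the filename from the path is an assumption.

```python
import os


def write_scp(wav_dir, scp_path):
    """Write one 'filename path' line per .wav file found under wav_dir.

    Assumption: filename and path are separated by a single space.
    Returns the number of entries written.
    """
    lines = []
    for root, _dirs, files in os.walk(wav_dir):
        for name in files:
            if name.lower().endswith(".wav"):
                lines.append(f"{name} {os.path.join(root, name)}")
    lines.sort()  # deterministic order regardless of filesystem traversal
    with open(scp_path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return len(lines)


def read_scp(scp_path):
    """Read an scp file back into a {filename: path} dictionary."""
    table = {}
    with open(scp_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the first space only, so paths may contain spaces.
            name, path = line.split(" ", 1)
            table[name] = path
    return table
```

A data loader could then call `read_scp` once and look up each utterance's path by filename.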
- To train the Dual-path RNN, first generate the scp file using the following command. Each line of the scp file contains "filename && path", that is, a filename and its path.
python create_scp.py
- Then you can modify the training and model parameters through "config/Dual_RNN/train.yml".
cd config/Dual_RNN
vim train.yml
- Then use the following command in the root directory to train the model.
python train_rnn.py --opt config/Dual_RNN/train.yml
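Both training commands pass a YAML file through `--opt`. The sketch below shows one way such an entry point could read that option file. It is not the repository's code: `build_parser` and `load_options` are hypothetical names, and it assumes the third-party PyYAML package is installed.

```python
import argparse


def build_parser():
    """Build a parser that accepts the --opt flag used in the commands above."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--opt", type=str, required=True,
                        help="Path to the YAML option file, e.g. config/Dual_RNN/train.yml")
    return parser


def load_options(path):
    """Load the YAML option file into plain Python objects (usually a dict)."""
    import yaml  # PyYAML (third-party); imported here so the parser works without it

    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)
```

In a script, `load_options(build_parser().parse_args().opt)` would return the settings edited in train.yml.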
Before testing Conv-TasNet, modify the default parameters in test_tasnet.py, such as the test files and the model to test.
python test_tasnet.py
python test_tasnet_wav.py
Before testing the Dual-path RNN, modify the default parameters in test_dualrnn.py, such as the test files and the model to test.
python test_dualrnn.py
python test_dualrnn_wav.py
Final Results (Conv-TasNet): 15.8690, which is about 0.57 higher than the 15.3 reported in the paper.
Final Results (Dual-path RNN): 18.98, which is 0.18 higher than the 18.8 reported in the paper.
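The results above do not name their metric. Assuming they are scale-invariant signal-to-noise ratio improvements (SI-SNRi, in dB), the metric reported in the paper, the following pure-Python sketch shows how such a score can be computed for one utterance. The function names are hypothetical and this is not the repository's evaluation code.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between an estimated and a target signal."""
    n = len(target)
    if n == 0 or len(estimate) != n:
        raise ValueError("estimate and target must be non-empty and equal in length")
    # Remove the mean from both signals.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to obtain the "signal" part.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    projection = [dot / tgt_energy * t for t in tgt]
    # Whatever is left after the projection counts as noise.
    noise = [e - p for e, p in zip(est, projection)]
    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNR of the separated signal minus SI-SNR of the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Because the estimate is projected onto the target, rescaling the estimate leaves the score essentially unchanged, which is why the measure is called scale-invariant.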
- Luo Y, Chen Z, Yoshioka T. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation[J]. arXiv preprint arXiv:1910.06379, 2019.
- Conv-TasNet code and Dual-RNN code