Skip to content

Commit

Permalink
Merge pull request WenzheLiu-Speech#5 from hit-thusz-RookieCJ/master
Browse files Browse the repository at this point in the history
Update
  • Loading branch information
WenzheLiu-Speech authored Feb 11, 2022
2 parents b830c54 + 5043c6b commit b206744
Showing 1 changed file with 31 additions and 17 deletions.
48 changes: 31 additions & 17 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,28 +16,37 @@ https://github.com/topics/beamforming
- [Tools](#Tools)
- [Resources](#Resources)


## Speech_Enhancement

### Magnitude spectrogram
#### spectral masking
* 2014, On Training Targets for Supervised Speech Separation, Wang. [[Paper]](https://ieeexplore.ieee.org/document/6887314)
* 2018, A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement, [Valin](https://github.com/jmvalin). [[Paper]](https://ieeexplore.ieee.org/document/8547084/) [[RNNoise]](https://github.com/xiph/rnnoise) [[RNNoise16k]](https://github.com/YongyuG/rnnoise_16k)
* 2020, A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech, [Valin](https://github.com/jmvalin). [Paper](https://arxiv.org/abs/2008.04259) [[PercepNet]](https://github.com/jzi040941/PercepNet)
* 2020, Online Monaural Speech Enhancement using Delayed Subband LSTM, Li. [[Paper]](https://arxiv.org/abs/2005.05037)
* 2020, FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement, [Hao](https://github.com/haoxiangsnr). [[Paper]](https://arxiv.org/pdf/2010.15508.pdf) [[FullSubNet]](https://github.com/haoxiangsnr/FullSubNet)
* 2021, RNNoise-Ex: Hybrid Speech Enhancement System based on RNN and Spectral Features. [[Paper]](https://arxiv.org/abs/2105.11813) [[RNNoise-Ex]](https://github.com/CedArctic/rnnoise-ex)
* Other IRM-based SE repositories: [[IRM-SE-LSTM]](https://github.com/haoxiangsnr/IRM-based-Speech-Enhancement-using-LSTM) [[nn-irm]](https://github.com/zhaoforever/nn-irm) [[rnn-se]](https://github.com/amaas/rnn-speech-denoising) [[DL4SE]](https://github.com/miralv/Deep-Learning-for-Speech-Enhancement)

#### spectral mapping
* 2014, An Experimental Study on Speech Enhancement Based on Deep Neural Networks, [Xu](https://github.com/yongxuUSTC). [[Paper]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6665000)

* 2014, A Regression Approach to Speech Enhancement Based on Deep Neural Networks, [Xu](https://github.com/yongxuUSTC). [[Paper]](https://ieeexplore.ieee.org/document/6932438) [[sednn]](https://github.com/yongxuUSTC/sednn) [[DNN-SE-Xu]](https://github.com/yongxuUSTC/DNN-Speech-enhancement-demo-tool) [[DNN-SE-Li]](https://github.com/hyli666/DNN-SpeechEnhancement)

* Other DNN magnitude spectrum mapping-based SE repositories: [[SE toolkit]](https://github.com/jtkim-kaist/Speech-enhancement) [[TensorFlow-SE]](https://github.com/linan2/TensorFlow-speech-enhancement-Chinese) [[UNetSE]](https://github.com/vbelz/Speech-enhancement)

* 2015, Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR, Weninger. [[Paper]](https://hal.inria.fr/hal-01163493/file/weninger_LVA15.pdf)

* 2016, A Fully Convolutional Neural Network for Speech Enhancement, Park. [[Paper]](https://arxiv.org/abs/1609.07132) [[CNN4SE]](https://github.com/dtx525942103/CNN-for-single-channel-speech-enhancement)

* 2017, Long short-term memory for speaker generalizationin supervised speech separation, Chen. [[Paper]](http://web.cse.ohio-state.edu/~wang.77/papers/Chen-Wang.jasa17.pdf)

* 2018, A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement, [Tan](https://github.com/JupiterEthan). [[Paper]](https://web.cse.ohio-state.edu/~wang.77/papers/Tan-Wang1.interspeech18.pdf) [[CRN-Tan]](https://github.com/JupiterEthan/CRN-causal)

* 2018, Convolutional-Recurrent Neural Networks for Speech Enhancement, Zhao. [[Paper]](https://arxiv.org/pdf/1805.00579.pdf) [[CRN-Hao]](https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement)
* 2020, Online Monaural Speech Enhancement using Delayed Subband LSTM, Li. [[Paper]](https://isca-speech.org/archive/Interspeech_2020/pdfs/2091.pdf)
* 2020, FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement, [Hao](https://github.com/haoxiangsnr). [[Paper]](https://arxiv.org/pdf/2010.15508.pdf) [[FullSubNet]](https://github.com/haoxiangsnr/FullSubNet)


### Complex domain
* 2017, Complex spectrogram enhancement by convolutional neural network with multi-metrics learning, [Fu](https://github.com/JasonSWFu). [[Paper]](https://arxiv.org/pdf/1704.08504.pdf)
* 2017, Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising, Williamson. [[Paper]](https://ieeexplore.ieee.org/abstract/document/7906509)
Expand Down Expand Up @@ -120,30 +129,35 @@ https://github.com/topics/beamforming
* Tutorial speech separation, like awesome series [[Link]](https://github.com/gemengtju/Tutorial_Separation)
### NN-based separation
* 2015, Deep-Clustering:Discriminative embeddings for segmentation and separation, Hershey and Chen.[[Paper]](https://arxiv.org/abs/1508.04306)
[[Code]](https://github.com/JusperLee/Deep-Clustering-for-Speech-Separation)
[[Code]](https://github.com/simonsuthers/Speech-Separation)
[[Code]](https://github.com/funcwj/deep-clustering)
[[Code]](https://github.com/JusperLee/Deep-Clustering-for-Speech-Separation)
[[Code]](https://github.com/simonsuthers/Speech-Separation)
[[Code]](https://github.com/funcwj/deep-clustering)
* 2016, DANet:Deep Attractor Network (DANet) for single-channel speech separation, Chen.[[Paper]](https://arxiv.org/abs/1611.08930)
[[Code]](https://github.com/naplab/DANet)
[[Code]](https://github.com/naplab/DANet)
* 2017, Multitalker speech separation with utterance-level permutation invariant training of deep recurrent, Yu.[[Paper]](https://ai.tencent.com/ailab/media/publications/Multi-talker_Speech_Separation_with_Utterance-level.pdf)
[[Code]](https://github.com/funcwj/uPIT-for-speech-separation)
[[Code]](https://github.com/funcwj/uPIT-for-speech-separation)
* 2018, LSTM_PIT_Speech_Separation
[[Code]](https://github.com/pchao6/LSTM_PIT_Speech_Separation)
[[Code]](https://github.com/pchao6/LSTM_PIT_Speech_Separation)
* 2018, Tasnet: time-domain audio separation network for real-time, single-channel speech separation, Luo.[[Paper]](https://arxiv.org/abs/1711.00541v2)
[[Code]](https://github.com/mpariente/asteroid/blob/master/egs/whamr/TasNet)
[[Code]](https://github.com/mpariente/asteroid/blob/master/egs/whamr/TasNet)
* 2019, Conv-TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation, Luo.[(Paper)](https://arxiv.org/pdf/1809.07454.pdf)
[[Code]](https://github.com/kaituoxu/Conv-TasNet)
[[Code]](https://github.com/kaituoxu/Conv-TasNet)
* 2019, Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, Luo.[[Paper]](https://arxiv.org/abs/1910.06379v1)
[[Code1]](https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation)
[[Code2]](https://github.com/JusperLee/Dual-Path-RNN-Pytorch)
[[Code1]](https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation)
[[Code2]](https://github.com/JusperLee/Dual-Path-RNN-Pytorch)
* 2019, TAC end-to-end microphone permutation and number invariant multi-channel speech separation, Luo.[[Paper]](https://arxiv.org/abs/1910.14104)
[[Code]](https://github.com/yluo42/TAC)
[[Code]](https://github.com/yluo42/TAC)
* 2020, Continuous Speech Separation with Conformer, Chen.[[Paper]](https://arxiv.org/abs/2008.05773) [[Code]](https://github.com/Sanyuan-Chen/CSS_with_Conformer)
* 2020, Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation, Chen.[[Paper]](https://arxiv.org/abs/2007.13975) [[Code]](https://github.com/ujscjj/DPTNet)
* 2020, Wavesplit: End-to-End Speech Separation by Speaker Clustering, Zeghidour.[[Paper]](https://arxiv.org/abs/2002.08933)
* 2021, Attention is All You Need in Speech Separation, Subakan.[[Paper]](https://arxiv.org/abs/2010.13154) [[Code]](https://github.com/speechbrain/speechbrain/tree/develop/recipes/WSJ0Mix/separation)
* 2021, Ultra Fast Speech Separation Model with Teacher Student Learning, Chen.[[Paper]](https://www.isca-speech.org/archive/pdfs/interspeech_2021/chen21l_interspeech.pdf)
* sound separation(Google) [[Code]](https://github.com/google-research/sound-separation)
* sound separation: Deep learning based speech source separation using Pytorch [[Code]](https://github.com/AppleHolic/source_separation)
* music-source-separation
[[Code]](https://github.com/andabi/music-source-separation)
[[Code]](https://github.com/andabi/music-source-separation)
* Singing-Voice-Separation
[[Code]](https://github.com/Jeongseungwoo/Singing-Voice-Separation)
[[Code]](https://github.com/Jeongseungwoo/Singing-Voice-Separation)
* Comparison-of-Blind-Source-Separation-techniques[[Code]](https://github.com/TUIlmenauAMS/Comparison-of-Blind-Source-Separation-techniques)
### BSS/ICA method
* FastICA[[Code]](https://github.com/ShubhamAgarwal1616/FastICA)
Expand Down

0 comments on commit b206744

Please sign in to comment.