This repository contains the official PyTorch implementation for the 2025 paper:
- FlowSE: Flow Matching-based Speech Enhancement [1]
Presentation video [english], Presentation video [korean]
Speech examples are available on our [demo page](https://seongqjini.com/speech-enhancement-with-flow-matching-method/).

This repository builds upon previous great works:
- [SGMSE] https://github.com/sp-uhh/sgmse
- [SGMSE-CRP] https://github.com/sp-uhh/sgmse_crp
- [BBED] https://github.com/sp-uhh/sgmse-bbed
Please also check out our follow-up work with code available:
- Seonggyu Lee, Sein Cheong, Sangwook Han, Kihyuk Kim and Jong Won Shin, “Speech Enhancement based on cascaded two flows” in Proceedings of Interspeech, Aug. 2025 (accepted). [github]
- Create a new virtual environment with Python 3.10 (we have not tested other Python versions, but they may work).
- Install the package dependencies via pip install -r requirements.txt.
- W&B (Weights & Biases) is required.
Training is done by executing train.py. A minimal example with the default settings (as in our paper [1]) can be run with

    python train.py --base_dir <your_dataset_dir>

where your_dataset_dir should be a directory containing the subdirectories train/ and valid/ (optionally test/ as well).
Each subdirectory must itself have two subdirectories clean/ and noisy/, with the same filenames present in both. We currently only support training with .wav files.
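The layout described above can be checked before starting a long training run. The following is a minimal sketch, not part of this repository: the helper names check_split and check_dataset are hypothetical, and it only assumes the directory structure stated above (train/ and valid/, optionally test/, each with clean/ and noisy/ holding identically named .wav files).

```python
from pathlib import Path


def check_split(split_dir: Path) -> list[str]:
    """Return a list of problems found in one split directory (empty if none)."""
    problems = []
    clean_dir, noisy_dir = split_dir / "clean", split_dir / "noisy"
    for d in (clean_dir, noisy_dir):
        if not d.is_dir():
            problems.append(f"missing directory: {d}")
    if problems:
        return problems

    # Only .wav files are considered, since training supports .wav only.
    clean = {p.name for p in clean_dir.glob("*.wav")}
    noisy = {p.name for p in noisy_dir.glob("*.wav")}
    for name in sorted(clean - noisy):
        problems.append(f"{split_dir.name}: {name} has no noisy counterpart")
    for name in sorted(noisy - clean):
        problems.append(f"{split_dir.name}: {name} has no clean counterpart")
    return problems


def check_dataset(base_dir: str) -> list[str]:
    """Check train/ and valid/ (required) and test/ (only if present)."""
    base = Path(base_dir)
    problems = []
    for split in ("train", "valid"):
        problems += check_split(base / split)
    if (base / "test").is_dir():
        problems += check_split(base / "test")
    return problems
```

Calling check_dataset("<your_dataset_dir>") returns an empty list when the layout matches the description, and otherwise one message per missing directory or unpaired file.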
To create the WSJ0-CHiME3 training set, refer to https://github.com/sp-uhh/sgmse and run create_wsj0_chime3.py.
To see all available training options, run python train.py --help.
- wsj0-chime3: Download checkpoint of WSJ0-CHiME3
- voicebank-demand: Download checkpoint of VB-DMD
To evaluate on a test set, run

    python evaluate.py --test_dir <your_test_dataset_dir> --folder_destination <your_enh_result_save_dir> --ckpt <path_to_model_checkpoint> --N <num_of_time_steps>

your_test_dataset_dir should contain a subfolder test/, which in turn contains the subdirectories clean/ and noisy/. Both clean/ and noisy/ should contain .wav files.
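To compare several numbers of time steps, the evaluation command can be assembled programmatically. This is only a sketch: build_eval_command is a hypothetical helper, the paths and step counts below are placeholders, and the only flags used are the four shown in the command above.

```python
import shlex


def build_eval_command(test_dir, out_dir, ckpt, n_steps):
    """Assemble the evaluate.py call as an argument list (nothing is executed)."""
    return [
        "python", "evaluate.py",
        "--test_dir", str(test_dir),
        "--folder_destination", str(out_dir),
        "--ckpt", str(ckpt),
        "--N", str(n_steps),
    ]


# Placeholder paths and step counts, chosen for illustration only.
for n in (1, 5, 10):
    cmd = build_eval_command(
        "data/my_test_set", f"enhanced/N{n}", "checkpoints/model.ckpt", n
    )
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to subprocess.run(cmd, check=True) from the repository root. Writing each run to its own output folder keeps the enhanced files for different step counts from overwriting each other.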
[1] Seonggyu Lee, Sein Cheong, Sangwook Han, Jong Won Shin. FlowSE: Flow Matching-based Speech Enhancement, ICASSP, 2025.
@INPROCEEDINGS{10888274,
author={Seonggyu Lee and Sein Cheong and Sangwook Han and Jong Won Shin},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={FlowSE: Flow Matching-based Speech Enhancement},
year={2025},
doi={10.1109/ICASSP49660.2025.10888274}}