seongq/flowmse

FlowSE: Flow Matching-based Speech Enhancement

This repository contains the official PyTorch implementation of the 2025 paper:

  • FlowSE: Flow Matching-based Speech Enhancement [1]

Presentation videos: [English], [Korean]

Speech examples are available on our [demo page](https://seongqjini.com/speech-enhancement-with-flow-matching-method/).

This repository builds upon previous great works:

Follow-up work

Please also check out our follow-up work with code available:

  • Seonggyu Lee, Sein Cheong, Sangwook Han, Kihyuk Kim and Jong Won Shin, “Speech Enhancement based on cascaded two flows” in Proceedings of Interspeech, Aug. 2025 (accepted). [github]

Installation

  • Create a new virtual environment with Python 3.10 (we have not tested other Python versions, but they may work).
  • Install the package dependencies via pip install -r requirements.txt.
  • Weights & Biases (W&B) is required.
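
The installation steps above can be sanity-checked before starting a long training run. The following is a minimal sketch, not part of this repository: the helper name `check_environment` is hypothetical, and it assumes the dependencies are importable as `torch` and `wandb` (the usual import names for PyTorch and W&B). A Python version other than 3.10 is reported only as a note, since other versions are untested rather than known to fail.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 10), required_modules=("torch", "wandb")):
    """Return a list of notes about the current environment (empty if nothing was found).

    Hypothetical helper: it only checks the interpreter version and whether
    the listed modules can be located, without importing them.
    """
    notes = []
    current = tuple(sys.version_info[:2])
    if current != tuple(expected_python):
        notes.append(
            f"Python {current[0]}.{current[1]} is in use; "
            f"only {expected_python[0]}.{expected_python[1]} has been tested"
        )
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            notes.append(f"module '{name}' is not importable")
    return notes


if __name__ == "__main__":
    for note in check_environment():
        print("note:", note)
```

An empty result only means the interpreter and the two modules were found; it does not verify the versions pinned in requirements.txt.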

Training

Training is done by executing train.py. A minimal example with the default settings (as in our paper [1]) is

python train.py --base_dir <your_dataset_dir>

where your_dataset_dir should be a directory containing the subdirectories train/ and valid/ (optionally test/ as well).

Each subdirectory must itself have two subdirectories clean/ and noisy/, with the same filenames present in both. We currently only support training with .wav files.
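
The directory layout described above can be verified before training. This is a minimal sketch under stated assumptions, not code from this repository: the function name `check_dataset_layout` is hypothetical, and it checks only directory presence and matching .wav filenames, not audio content, sample rate, or length.

```python
from pathlib import Path


def check_dataset_layout(base_dir, splits=("train", "valid")):
    """Return a list of problems in a <base_dir>/<split>/{clean,noisy} layout.

    Hypothetical helper: an empty list means every requested split has both
    subdirectories and the same set of .wav filenames in each.
    """
    problems = []
    base = Path(base_dir)
    for split in splits:
        names = {}
        for kind in ("clean", "noisy"):
            folder = base / split / kind
            if folder.is_dir():
                names[kind] = {p.name for p in folder.glob("*.wav")}
            else:
                problems.append(f"missing directory: {folder}")
                names[kind] = None
        if names["clean"] is None or names["noisy"] is None:
            continue  # cannot compare filenames without both folders
        if not names["clean"] and not names["noisy"]:
            problems.append(f"{split}: no .wav files found")
        for name in sorted(names["clean"] - names["noisy"]):
            problems.append(f"{split}: {name} is in clean/ but not in noisy/")
        for name in sorted(names["noisy"] - names["clean"]):
            problems.append(f"{split}: {name} is in noisy/ but not in clean/")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py <your_dataset_dir>
    for problem in check_dataset_layout(sys.argv[1]):
        print(problem)
```

Passing `splits=("train", "valid", "test")` extends the same check to the optional test/ split.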

To create the WSJ0-CHIME3 training set, refer to https://github.com/sp-uhh/sgmse and run create_wsj0_chime3.py.

To see all available training options, run python train.py --help.

Checkpoints

Evaluation

To evaluate on a test set, run

python evaluate.py --test_dir <your_test_dataset_dir> --folder_destination <your_enh_result_save_dir> --ckpt <path_to_model_checkpoint> --N <num_of_time_steps>

your_test_dataset_dir should contain a subfolder test/, which in turn contains the subdirectories clean/ and noisy/. Both clean/ and noisy/ should contain .wav files.
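
To compare results for several values of --N, the evaluation command above can be assembled programmatically. The following is a minimal sketch, not part of this repository: the helper names, the per-run output folder naming (`N1`, `N5`, ...), and the example paths and step counts are all hypothetical. Only the four flags shown in the command above are used, and whether evaluate.py creates a missing output folder is not assumed.

```python
import subprocess
import sys
from pathlib import Path


def build_evaluate_command(test_dir, out_dir, ckpt, n_steps):
    """Build the argv list for one evaluate.py run with the documented flags."""
    return [
        sys.executable, "evaluate.py",
        "--test_dir", str(test_dir),
        "--folder_destination", str(out_dir),
        "--ckpt", str(ckpt),
        "--N", str(n_steps),
    ]


def sweep_time_steps(test_dir, out_root, ckpt, step_counts, dry_run=True):
    """Return one command per step count; run them only when dry_run is False."""
    commands = []
    for n in step_counts:
        # Hypothetical convention: one output folder per step count.
        out_dir = Path(out_root) / f"N{n}"
        cmd = build_evaluate_command(test_dir, out_dir, ckpt, n)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if evaluate.py fails
    return commands


if __name__ == "__main__":
    # Placeholder paths and step counts; replace them with your own.
    for cmd in sweep_time_steps("data", "results", "model.ckpt", [1, 5, 10]):
        print(" ".join(cmd))
```

With `dry_run=True` (the default) the commands are only printed, which makes it easy to inspect them before launching the runs from the repository root.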

Citations / References

[1] Seonggyu Lee, Sein Cheong, Sangwook Han, Jong Won Shin. FlowSE: Flow Matching-based Speech Enhancement, ICASSP, 2025.

@INPROCEEDINGS{10888274,
  author={Seonggyu Lee and Sein Cheong and Sangwook Han and Jong Won Shin},
  booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={FlowSE: Flow Matching-based Speech Enhancement}, 
  year={2025},
  doi={10.1109/ICASSP49660.2025.10888274}}
