forked from Visitor-W/MTDA

MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection


Harper812/MTDA

 
 


MTDA-HSED arxiv

The official implementation of MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection. (Submitted to ICASSP 2025)
Authors: Zehao Wang, Haobo Yue, Zhicheng Zhang, Da Mu, Jin Tang, Jianqin Yin

Issues 😊 | Lab 👏 | Contact 📫

Updating

Code will be released soon!

Introduction

Visualization of the M3A Module

Comparing the feature maps of the long-term and short-term audio adapters with the spectrograms of the input data shows that most of the time-frequency patterns modeled by the short-term audio adapter are temporally isolated and disjoint. In contrast, the patterns modeled by the long-term audio adapter connect with their neighbors into a continuous whole, forming a distinctive time-frequency representation.

Visualization of the DBMF Module

Comparing the feature map aggregated by the DBMF module with that of the baseline shows that the baseline feature map is blurred (second row of the figure). After the local and global features are aggregated, the information in the DBMF feature map is more prominent (third row of the figure).

Performance

MTDA-HSED is evaluated on the DESED and MAESTRO datasets.

| Model | PSDS1 $\uparrow$ | PSDS1 (sed score) $\uparrow$ | mpAUC $\uparrow$ |
| --- | --- | --- | --- |
| Baseline | 0.494 | 0.499 | 0.709 |
| ATST-SED | 0.297 | 0.301 | 0.554 |
| MONA | 0.497 | 0.507 | 0.709 |
| ADAPTER | 0.494 | 0.503 | 0.704 |
| ACT-NET | 0.308 | 0.316 | 0.696 |
| M3A (ours) | 0.503 | 0.511 | 0.753 |
| DBMF (ours) | 0.494 | 0.501 | 0.748 |
| MTDA-HSED (ours) | 0.503 | 0.514 | 0.757 |
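
To make the comparison easier to read, the absolute gain of each model over the baseline can be computed directly from the table. The snippet below is only a small illustrative sketch: the scores are copied from the table above, while the names `RESULTS`, `METRICS`, and `gains_over_baseline` are hypothetical helpers and not part of the (yet unreleased) MTDA-HSED code.

```python
# Scores copied from the table above, in the order given by METRICS.
# RESULTS, METRICS and gains_over_baseline are illustrative names only;
# they are not taken from the MTDA-HSED code base.
METRICS = ("PSDS1", "PSDS1 (sed score)", "mpAUC")

RESULTS = {
    "Baseline": (0.494, 0.499, 0.709),
    "ATST-SED": (0.297, 0.301, 0.554),
    "MONA": (0.497, 0.507, 0.709),
    "ADAPTER": (0.494, 0.503, 0.704),
    "ACT-NET": (0.308, 0.316, 0.696),
    "M3A": (0.503, 0.511, 0.753),
    "DBMF": (0.494, 0.501, 0.748),
    "MTDA-HSED": (0.503, 0.514, 0.757),
}


def gains_over_baseline(results, baseline="Baseline"):
    """Return the absolute score difference of every model to the baseline."""
    base = results[baseline]
    return {
        model: tuple(round(score - ref, 3) for score, ref in zip(scores, base))
        for model, scores in results.items()
        if model != baseline
    }


if __name__ == "__main__":
    for model, deltas in gains_over_baseline(RESULTS).items():
        cells = ", ".join(
            f"{name}: {delta:+.3f}" for name, delta in zip(METRICS, deltas)
        )
        print(f"{model:<10} {cells}")
```

Read this way, the table shows that the full MTDA-HSED model improves on the baseline in all three metrics, with the largest absolute gain in mpAUC.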

Reference

Citation

If this repository helped your work, please cite the paper below! 😘

@article{wang2024mtda,
  title={MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection},
  author={Wang, Zehao and Yue, Haobo and Zhang, Zhicheng and Mu, Da and Tang, Jin and Yin, Jianqin},
  journal={arXiv preprint arXiv:2409.06196},
  year={2024}
}
