Incorporating brain-inspired mechanisms for multimodal learning in artificial intelligence

Institute of Automation, Chinese Academy of Sciences, Beijing
*Equal contribution †Corresponding author

[arxiv] [paper] [code]


This repository contains the PyTorch implementation of our paper. If you find this work useful for your research, please cite our paper and star the repo.

Method

We propose an inverse effectiveness driven multimodal fusion (IEMF) method, which dynamically adjusts how the multimodal fusion module is updated based on the relationship between the strength of the individual modality cues and the strength of the fused multimodal signal.
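
The exact update rule is given in the paper; the snippet below is only a rough, illustrative sketch of the inverse-effectiveness idea, not the authors' formulation. It scales the fusion module's gradient by a factor that grows when the fused prediction is strong while the individual modality cues are weak. The confidence proxy and all names (modality_strength, fusion_update_scale) are our own assumptions.

    import torch

    def modality_strength(logits: torch.Tensor) -> torch.Tensor:
        # Hypothetical proxy for cue strength: mean max softmax confidence over the batch.
        return logits.softmax(dim=-1).max(dim=-1).values.mean()

    def fusion_update_scale(audio_logits, visual_logits, fused_logits, eps=1e-8):
        # Inverse effectiveness: when the unimodal cues are weak relative to the
        # fused prediction, the fusion module receives a larger update.
        s_a = modality_strength(audio_logits)
        s_v = modality_strength(visual_logits)
        s_f = modality_strength(fused_logits)
        return (s_f / (torch.maximum(s_a, s_v) + eps)).clamp(max=2.0)

    # Hypothetical usage inside a training step, after loss.backward():
    # scale = fusion_update_scale(audio_out, visual_out, fused_out)
    # for p in fusion_module.parameters():
    #     if p.grad is not None:
    #         p.grad.mul_(scale)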

Usage

+--- Audio Visual Classification 
+--- Audio Visual Continual Learning
\--- Audio Visual Question Answering

Each of the three folders corresponds to one task and contains detailed run scripts, plotting code, and instructions for downloading the corresponding dataset.

Well-trained model

We also provide the trained model weights and the training log files so that the results in the paper can be reproduced. You can find them at https://huggingface.co/xianghe/IEMF/tree/main.
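
As a quick sketch, the released files can be fetched with the huggingface_hub library; the repo id is taken from the link above, everything else is an assumption.

    from huggingface_hub import snapshot_download

    # Download the released weights and logs into the local cache and
    # return the path of the local directory that holds them.
    local_dir = snapshot_download(repo_id="xianghe/IEMF")
    print("Files downloaded to:", local_dir)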

Dataset Download

Instructions for downloading each dataset are provided in the folder of the corresponding task. In particular, because the Kinetics-Sounds dataset is complex to process, you can download our packaged raw video-audio dataset here (extraction code: bauh). In addition to the raw dataset, we also provide processed data in HDF5 format, ready to be fed to the network models, which you can access here (extraction code: jzbg).
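
If you want to check what a processed HDF5 file contains before training, a quick way to inspect it with h5py is sketched below; the file name is a placeholder, and the actual layout is described in each task's folder.

    import h5py

    # Print the path of every group and dataset stored in the file.
    with h5py.File("kinetics_sounds_processed.h5", "r") as f:
        f.visit(print)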

Citation

If our paper is useful for your research, please consider citing it:

@misc{he2025incorporatingbraininspiredmechanismsmultimodal,
      title={Incorporating brain-inspired mechanisms for multimodal learning in artificial intelligence}, 
      author={Xiang He and Dongcheng Zhao and Yang Li and Qingqun Kong and Xin Yang and Yi Zeng},
      year={2025},
      eprint={2505.10176},
      archivePrefix={arXiv},
      primaryClass={cs.NE},
      url={https://arxiv.org/abs/2505.10176}, 
}

Acknowledgements

The code for the three tasks builds on OGM_GE, AV-CIL_ICCV2023, and MUSIC_AVQA, respectively; thanks for their excellent work!

If you have any questions about using the code, or other feedback and comments, please feel free to contact us at hexiang2021@ia.ac.cn. Have a good day!
