[AAAI 2025] Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and A Fourier Kolmogorov-Arnold Framework

lif314/Fourier-ASR

Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and A Fourier Kolmogorov-Arnold Framework
[AAAI 2025]

Linfei Li · Lin Zhang* · Zhong Wang · Fengyi Zhang · Zelin Li · Ying Shen

NeAF teaser

What are Neural Amplitude Fields?

NeAF

Environment Configuration

Benchmark of Coordinate-MLPs in Audio Signal Representations

Overview

benchmark

Run

  • Testing on Bach, Counting, and Blues.
bash scripts/benchmark_MLPs_demo.sh
  • Testing on CSTR VCTK dataset.
bash scripts/benchmark_MLPs_vctk.sh
  • Testing on GTZAN dataset.
bash scripts/benchmark_MLPs_gtzan.sh

Fourier-ASR: A Fourier Kolmogorov-Arnold Framework

Overview

fourier

Run

  • Testing on Bach, Counting, and Blues.
bash scripts/benchmark_KANs_demo.sh
  • Testing on CSTR VCTK dataset.
bash scripts/benchmark_KANs_vctk.sh
  • Testing on GTZAN dataset.
bash scripts/benchmark_KANs_gtzan.sh

Ablation Experiments

Positional encoding is parameter-sensitive

  • RFF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_FFN_L.sh
  • RFF positional encoding is sensitive to the variance parameter $\sigma$.
bash scripts/benchmark_FFN_sigma.sh
  • NeRF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_NeRF_L.sh
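
To make the roles of $L$ and $\sigma$ concrete, here is a minimal sketch of the two encodings in their commonly published forms. It is not taken from this repository: the function names, the `seed` argument, and the normalized time coordinate `t` are assumptions for illustration, and the benchmark scripts may parameterize the encodings differently.

```python
import math
import random


def nerf_encoding(t, L):
    """NeRF-style encoding: sin/cos pairs at frequencies 2^k * pi, k = 0..L-1."""
    features = []
    for k in range(L):
        freq = (2.0 ** k) * math.pi
        features.append(math.sin(freq * t))
        features.append(math.cos(freq * t))
    return features


def rff_encoding(t, L, sigma, seed=0):
    """Random Fourier features: L frequencies drawn once from N(0, sigma^2)."""
    rng = random.Random(seed)  # hypothetical fixed seed so the mapping is repeatable
    freqs = [rng.gauss(0.0, sigma) for _ in range(L)]
    features = []
    for b in freqs:
        features.append(math.sin(2.0 * math.pi * b * t))
        features.append(math.cos(2.0 * math.pi * b * t))
    return features


t = 0.25  # a time coordinate, assumed here to be normalized
print(len(nerf_encoding(t, L=8)))              # 2 * L features
print(len(rff_encoding(t, L=16, sigma=10.0)))  # 2 * L features
```

In both cases $L$ sets the width of the encoded input, while $\sigma$ sets how high the sampled RFF frequencies reach, which is one plausible reason the reconstruction quality depends on both.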

Activation functions are parameter-sensitive

  • Gaussian-type activation functions are sensitive to the variance factor $a$.
bash scripts/benchmark_gaussian.sh
  • Sine-type activation functions are sensitive to the frequency factor $\omega$.
# Sine
bash scripts/benchmark_siren.sh

# Incode-Sine
bash scripts/benchmark_incode-sine.sh
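
As a rough illustration of what the factors $a$ and $\omega$ control, the sketch below evaluates a Gaussian-type and a sine-type activation on a single value. The Gaussian form `exp(-(a * x)^2)` is an assumption (other parameterizations exist), the function names are hypothetical, and the Incode-Sine variant is not covered.

```python
import math


def gaussian_activation(x, a):
    """Gaussian-type activation, assumed form exp(-(a * x)^2)."""
    return math.exp(-((a * x) ** 2))


def sine_activation(x, omega):
    """Sine-type activation sin(omega * x), as used in SIREN-style layers."""
    return math.sin(omega * x)


x = 0.1  # a hypothetical pre-activation value
for a in (1.0, 10.0, 30.0):
    print(f"a={a:5.1f}  gaussian={gaussian_activation(x, a):.4f}")
for omega in (1.0, 30.0, 300.0):
    print(f"omega={omega:6.1f}  sine={sine_activation(x, omega):+.4f}")
```

The same input produces very different outputs as $a$ or $\omega$ changes, which is consistent with these factors needing to be tuned per signal.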

Periodic activation functions are sensitive to initialization schemes

bash scripts/benchmark_sensitive_init.sh
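
The README does not say which initialization schemes the script compares. As one well-known example, the sketch below follows the uniform scheme proposed for SIREN, where the weight bound depends on the layer's fan-in and on $\omega$. The function name and the `seed` argument are hypothetical.

```python
import math
import random


def siren_init(fan_in, fan_out, omega, is_first, seed=0):
    """Sample a fan_out x fan_in weight matrix with the SIREN uniform scheme.

    First layer:   U(-1 / fan_in, 1 / fan_in)
    Hidden layers: U(-sqrt(6 / fan_in) / omega, sqrt(6 / fan_in) / omega)
    """
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    if is_first:
        bound = 1.0 / fan_in
    else:
        bound = math.sqrt(6.0 / fan_in) / omega
    weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
               for _ in range(fan_out)]
    return weights, bound


weights, bound = siren_init(fan_in=256, fan_out=256, omega=30.0, is_first=False)
print(f"hidden-layer bound: {bound:.6f}")
```

Because the hidden-layer bound shrinks as $\omega$ grows, changing either the scheme or $\omega$ alters the scale of the pre-activations, which may explain the sensitivity reported here.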

Fourier-ASR is parameter-insensitive

When model capacity is limited, a larger $\Omega$ in the input layer improves performance.

bash scripts/benchmark_Fourier_omega.sh
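
For readers unfamiliar with Fourier-based Kolmogorov-Arnold layers, the following is a generic sketch, not the repository's implementation. It assumes that $\Omega$ corresponds to the number of Fourier frequencies per input (`num_freqs` below); that reading, along with every name in the code, is an assumption. Each output is a sum of learnable cosine and sine series over the inputs.

```python
import math
import random


def fourier_kan_layer(x, cos_coef, sin_coef):
    """Compute y[j] = sum_i sum_k ( cos_coef[j][i][k-1] * cos(k * x[i])
                                  + sin_coef[j][i][k-1] * sin(k * x[i]) )."""
    out = []
    for cj, sj in zip(cos_coef, sin_coef):
        total = 0.0
        for xi, ci, si in zip(x, cj, sj):
            for k, (c, s) in enumerate(zip(ci, si), start=1):
                total += c * math.cos(k * xi) + s * math.sin(k * xi)
        out.append(total)
    return out


def random_coefs(out_dim, in_dim, num_freqs, rng):
    """Small Gaussian coefficients, shape [out_dim][in_dim][num_freqs]."""
    scale = 1.0 / (in_dim * num_freqs)
    return [[[rng.gauss(0.0, scale) for _ in range(num_freqs)]
             for _ in range(in_dim)]
            for _ in range(out_dim)]


rng = random.Random(0)
num_freqs = 8  # assumed stand-in for Omega in the input layer
cos_coef = random_coefs(4, 1, num_freqs, rng)
sin_coef = random_coefs(4, 1, num_freqs, rng)
y = fourier_kan_layer([0.25], cos_coef, sin_coef)  # one time coordinate in, 4 features out
print(len(y))
```

Under this assumption, raising `num_freqs` in the input layer lets a small network cover more frequencies of the signal directly.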

Citation

@article{Li_2025,
   title={Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and a Fourier Kolmogorov-Arnold Framework},
   volume={39},
   number={23},
   journal={Proceedings of the AAAI Conference on Artificial Intelligence},
   publisher={AAAI},
   author={Li, Linfei and Zhang, Lin and Wang, Zhong and Zhang, Fengyi and Li, Zelin and Shen, Ying},
   year={2025},
   month=apr,
   pages={24458--24466}
}