Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and A Fourier Kolmogorov-Arnold Framework
[AAAI 2025]
Linfei Li · Lin Zhang* · Zhong Wang · Fengyi Zhang · Zelin Li · Ying Shen
- Configure a Python environment and install the related dependencies.
pip install -r requirements.txt
- Download the required dataset from the following websites.
- Organize the dataset according to the following file structure.
--data
    --demo
        --gt_bach.wav
        --gt_counting.wav
        --gt_blues00000.wav  # from GTZAN dataset blues_00000.wav
    --gtzan
        --genres
            --blues
            ...
    --VCTK
        --wav48_silence_trimmed
            --p231
            ...
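As a quick sanity check before running the benchmark scripts, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name `missing_paths` and the default root `data` are assumptions made for illustration, and only the entries explicitly listed in the structure are checked.

```python
from pathlib import Path

# Entries copied from the file structure above (relative to the data root).
EXPECTED = [
    "demo/gt_bach.wav",
    "demo/gt_counting.wav",
    "demo/gt_blues00000.wav",
    "gtzan/genres/blues",
    "VCTK/wav48_silence_trimmed/p231",
]


def missing_paths(data_root="data"):
    """Return the expected entries that do not exist under data_root.

    Hypothetical helper, not part of the repository.
    """
    root = Path(data_root)
    return [p for p in EXPECTED if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing entries:")
        for p in missing:
            print("  " + p)
    else:
        print("All listed entries are in place.")
```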
- Testing coordinate-MLPs on Bach, Counting, and Blues.
bash scripts/benchmark_MLPs_demo.sh
- Testing coordinate-MLPs on the CSTR VCTK dataset.
bash scripts/benchmark_MLPs_vctk.sh
- Testing coordinate-MLPs on the GTZAN dataset.
bash scripts/benchmark_MLPs_gtzan.sh
- Testing KANs on Bach, Counting, and Blues.
bash scripts/benchmark_KANs_demo.sh
- Testing KANs on the CSTR VCTK dataset.
bash scripts/benchmark_KANs_vctk.sh
- Testing KANs on the GTZAN dataset.
bash scripts/benchmark_KANs_gtzan.sh
- RFF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_FFN_L.sh
- RFF positional encoding is sensitive to the variance parameter $\sigma$.
bash scripts/benchmark_FFN_sigma.sh
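To make the roles of $L$ and $\sigma$ concrete, here is a minimal sketch of a random Fourier feature (RFF) encoding for a scalar time coordinate. It assumes the common formulation in which $L$ frequencies are drawn from $\mathcal{N}(0, \sigma^2)$ and each yields a cosine and a sine feature; the function name `rff_encode`, the defaults, and the pure-Python implementation are illustrative assumptions and may differ from the code in this repository.

```python
import math
import random


def rff_encode(t, L=16, sigma=10.0, seed=0):
    """Map a scalar coordinate t to 2*L random Fourier features.

    Assumed form: [cos(2*pi*b_i*t), sin(2*pi*b_i*t)] with b_i ~ N(0, sigma^2).
    """
    rng = random.Random(seed)
    # Frequencies are drawn once; the fixed seed keeps them reproducible.
    freqs = [rng.gauss(0.0, sigma) for _ in range(L)]
    feats = []
    for b in freqs:
        phase = 2.0 * math.pi * b * t
        feats.append(math.cos(phase))
        feats.append(math.sin(phase))
    return feats


if __name__ == "__main__":
    # L sets the feature dimension, sigma sets how high the frequencies reach.
    print(len(rff_encode(0.25, L=8, sigma=1.0)))
    print(len(rff_encode(0.25, L=64, sigma=100.0)))
```

Under this formulation, a larger $\sigma$ spreads the sampled frequencies more widely, while $L$ controls how many of them the network sees, which is one plausible reason both parameters affect the fit.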
- NeRF positional encoding is sensitive to the dimension parameter $L$.
bash scripts/benchmark_NeRF_L.sh
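For comparison with the random variant, the following sketch shows a NeRF-style positional encoding, which the script name `benchmark_NeRF_L.sh` suggests is the encoding meant here. It assumes the usual deterministic form with frequencies $2^k \pi$ for $k = 0, \dots, L-1$; the function name `nerf_encode` is hypothetical and the repository's implementation may differ in scaling or ordering.

```python
import math


def nerf_encode(t, L=10):
    """Map a scalar coordinate t to 2*L features.

    Assumed form: [sin(2^k * pi * t), cos(2^k * pi * t)] for k = 0..L-1.
    """
    feats = []
    for k in range(L):
        phase = (2.0 ** k) * math.pi * t
        feats.append(math.sin(phase))
        feats.append(math.cos(phase))
    return feats


if __name__ == "__main__":
    # Each extra level doubles the highest frequency available to the MLP.
    for L in (4, 8, 16):
        print(L, len(nerf_encode(0.1, L=L)))
```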
- Gaussian-type activation functions are sensitive to the variance factor $a$.
bash scripts/benchmark_gaussian.sh
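The effect of the factor $a$ can be illustrated with a small sketch. It assumes the Gaussian activation takes the form $\exp(-(a x)^2)$, which is a common choice in implicit neural representations but is an assumption here; the function name `gaussian_activation` and the sample values are illustrative only.

```python
import math


def gaussian_activation(x, a=30.0):
    """Gaussian activation, assumed form exp(-(a * x)^2)."""
    return math.exp(-((a * x) ** 2))


if __name__ == "__main__":
    # A larger a narrows the bump around zero, so the same pre-activation
    # produces a much smaller response.
    for a in (1.0, 10.0, 30.0):
        print(a, gaussian_activation(0.1, a=a))
```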
- Sine-type activation functions are sensitive to the frequency factor $\omega$.
# Sine
bash scripts/benchmark_siren.sh
# Incode-Sine
bash scripts/benchmark_incode-sine.sh
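As a rough illustration of why $\omega$ matters for the plain sine case, the sketch below assumes the activation $\sin(\omega x)$ and counts how often it changes sign on $[-1, 1]$. The helper names are hypothetical, and the Incode-Sine variant is not covered by this sketch.

```python
import math


def sine_activation(x, omega=30.0):
    """Sine activation, assumed form sin(omega * x)."""
    return math.sin(omega * x)


def zero_crossings(omega, n=1000):
    """Count sign changes of sin(omega * x) on an n-point grid over [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    ys = [sine_activation(x, omega) for x in xs]
    return sum(1 for u, v in zip(ys, ys[1:]) if u * v < 0)


if __name__ == "__main__":
    # Larger omega means more oscillations over the same input range.
    for omega in (5.0, 30.0, 100.0):
        print(omega, zero_crossings(omega))
```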
bash scripts/benchmark_sensitive_init.sh
When model capacity is limited, larger
bash scripts/benchmark_Fourier_omega.sh
@article{Li_2025,
title={Representing Sounds as Neural Amplitude Fields: A Benchmark of Coordinate-MLPs and a Fourier Kolmogorov-Arnold Framework},
volume={39},
number={23},
journal={Proceedings of the AAAI Conference on Artificial Intelligence},
publisher={AAAI},
author={Li, Linfei and Zhang, Lin and Wang, Zhong and Zhang, Fengyi and Li, Zelin and Shen, Ying},
year={2025},
month=apr,
pages={24458–24466}
}