Reading list for research topics in Sound AI
Updated Aug 8, 2024
Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"
Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".
Tracking the state of the art and recent results (bibliography) on sound tasks.
Implementation of "Audio Retrieval with Natural Language Queries", INTERSPEECH 2021, PyTorch
Official codebase used to obtain the results in the ICASSP 2024 paper "A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval".
During the project for the Digital Signal Image Management course, I learned how to manage and process audio and image files. The aim of the project was to classify musical genres with machine learning and deep learning models, extracting specific audio features from the GTZAN dataset files with which to train the model…
This repository provides the code for "Improving Query-by-Vocal Imitation with Contrastive Learning and Audio Pretraining", presented at DCASE 2024. The paper addresses the challenge of audio retrieval using vocal imitations as queries, proposing a dual encoder architecture that leverages pretrained CNNs and an adapted NT-Xent loss for fine-tuning.
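For readers unfamiliar with the loss named above, here is a minimal sketch of a symmetric, cross-modal form of NT-Xent (normalized temperature-scaled cross entropy) over paired embeddings. It is not the repository's code and does not reproduce the paper's adaptation; the function names, the `temperature` default, and the toy 2-D embeddings are all assumptions made for illustration, and the encoders that would produce the embeddings are omitted.

```python
import math


def l2_normalize(vec):
    """Scale a (non-zero) vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_row(logits, target):
    """Numerically stable -log(softmax(logits)[target])."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target]


def nt_xent_loss(imitation_embs, recording_embs, temperature=0.1):
    """Symmetric cross-modal NT-Xent over a batch of paired embeddings.

    Pair i (imitation_embs[i], recording_embs[i]) is the positive; every
    other pairing in the batch acts as a negative. Hypothetical helper,
    not taken from the paper or its codebase.
    """
    queries = [l2_normalize(v) for v in imitation_embs]
    items = [l2_normalize(v) for v in recording_embs]
    n = len(queries)
    # Temperature-scaled cosine similarities: sim[i][j] = <q_i, r_j> / temperature
    sim = [
        [sum(a * b for a, b in zip(q, r)) / temperature for r in items]
        for q in queries
    ]
    # Imitation -> recording direction: each row should peak on the diagonal.
    loss_q2r = sum(cross_entropy_row(sim[i], i) for i in range(n))
    # Recording -> imitation direction: each column should peak on the diagonal.
    loss_r2q = sum(
        cross_entropy_row([sim[i][j] for i in range(n)], j) for j in range(n)
    )
    return (loss_q2r + loss_r2q) / (2 * n)


# Toy batch of two pairs: correctly aligned pairs should score a lower loss
# than the same embeddings with their pairing swapped.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In a dual encoder setup, the two input lists would come from the imitation encoder and the recording encoder respectively; a lower temperature sharpens the softmax and penalizes hard negatives more strongly.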