In out paper we analyse four datasets:
- FSL4: This dataset contains user-crontributed loops uploaded to Freesound. It has been built in-house by searching Freesound for sounds with the query terms loop and bpm, and then automatically parsing the returned sound filenames, tags and textual descriptions to identify tempo annotations made by users. For example, a sound containing the tag 120bpm is considered to have a ground truth of 120 BPM.
- APPL: This dataset is composed of the audio loops bundled in Apple's Logic Pro music production software. We parsed the metadata embedded in the audio files using source code available in a public repository, and extracted in this way tempo annotations for all the loops.
- MIXL: This dataset contains all the loops bundled with Acoustica's Mixcraft 7 music production software. Tempo annotations are provided in its loop browser and can be easily exported into a machine-readable format.
- LOOP: This dataset is composed of loops downloaded from Looperman, an online loop sharing community. It was previously used for research purposes in [1]. Tempo annotations are available as metadata provided by the site.
[1] Gerard Roma. Algorithms and representations for supporting online music creation with large-scale audio databases. PhD thesis, Universitat Pompeu Fabra, 2015.
Due to license restrictions, we can only directly provide the contents of FSL4 dataset. We provide two ways of getting the dataset, either by downloading a single zip file or running a script that gets the contents from Freesound. Instructions for both options can be found below.
Two of the other datasets, APPL and MIXL, are in general easy to reproduce if owning the corresponding music production software. You'll also find instructions below on how to recreate these datasets assuming you own the software.
The whole dataset can be downloaded from here.
It contains the audios for all sounds in their original format plus the metadata (including BPM annotations) for each file.
Once downloaded you can use the convert_audio_files_to_wav.py
script to convert all sounds to 44.1 kHz, mono, 16 bit PCM format:
python scripts/convert_audio_files_to_wav.py SOURCE_FOLDER --outdir OUTPUT_FOLDER
This conversion uses ffmpeg
command line utility, therefore you'll need to have it installed in your system.
Alternatively, FSL4 can be recreated running a script that downloads all audio files and metadata directly from Freesound. Be aware that using this script the resulting dataset might not be exactly the same as the one we used in our experiments as there exists the possibility that some files are missing or their descriptions have been altered.
The script is named create_freesound_loops_dataset.py
and can be found in the scripts
folder.
Before running the script you'll have to set the FREESOUND_API_KEY
and FREESOUND_ACCESS_TOKEN
parameters found in the first lines of the script.
To do that you'll need to create a Freesound account and follow these steps:
-
- Apply for Freesound APIv2 credentials accessing this url. The credentials are given instantly upon filling the form.
-
- Fill
FREESOUND_API_KEY
with the alphanumeric string of theClient secret/Api key
column that will be shown after filling the form in step 1.
- Fill
-
- Copy the
Client id
alphanumeric string (shorter thanClient secret/Api key
) that is also given after filling the form.
- Copy the
-
- Go to the following url:
https://www.freesound.org/apiv2/oauth2/authorize/?client_id=YOUR_GIVEN_CLIENT_ID&response_type=code
(replace YOUR_GIVEN_CLIENT_ID
with the contents of Client id
from step 3) and click Authorize!
. Introduce your Freesound account credentials and copy the authorization code that will be given to you.
-
- Run the following command replacing
YOUR_GIVEN_CLIENT_ID
(as in the previous step),YOUR_GIVEN_FREESOUND_APY_KEY
with the contents ofFREESOUND_API_KEY
, andAUTHORIZATION_CODE
with the code you got in step 4.
- Run the following command replacing
curl -X POST -d "client_id=YOUR_GIVEN_CLIENT_ID&client_secret=YOUR_GIVEN_FREESOUND_APY_KEY&grant_type=authorization_code&code=AUTHORIZATION_CODE" https://www.freesound.org/apiv2/oauth2/access_token/
This will return something similar to this:
{"access_token": "e22ce72c0b18bf76a7da9d41ad1d7e146f3af309", "token_type": "Bearer", "expires_in": 86399, "refresh_token": "31cbd0fd394f7381a70b73befd4b97e7a9afdad9", "scope": "read write read+write"}
-
- Fill
FREESOUND_ACCESS_TOKEN
with the contents of the alphanumeric stringaccess_token
of the response.
- Fill
By following this process you'll get an access token that will allow to download freesound audio files in their original format. This access token will be valid for 24 hours. If you run the script when the access token has expired, you'll need to follow again these steps to get a new token.
Once you have filled FREESOUND_API_KEY
and FREESOUND_ACCESS_TOKEN
you can now run the script to recreate the FSL4 dataset.
python scripts/create_freesound_loops_dataset.py --max_sounds 4000 path_where_you_want_to_store_the_dataset
Note that it might take a while to download all the sounds and metadata.
Note also that because of throttling limits of the Freesound API the script might not be able to download all the sounds in one day. By default Freesound API allows 2000 requests per day, meaning that no more than 2000 sounds can be downloaded per day in their original quality. The solutions around that are either to run the script in consecutive days (sounds already downloaded won't be downloaded again) or to ask for more permissive limits to the Freesound API maintainers (see API documentation).
Once sounds are downloaded the script will also convert the audio files to 44.1 kHz, mono, 16 bit PCM format.
This conversion uses ffmpeg
command line utility, therefore you'll need to have it installed in your system.
Although we can not release the other datasets, APPL and MIXL can be rather easily reproduced if owning Apple's Logic Pro and/or Mixcraft 7 music production software music production software. If that is the case, the following instructions should help you getting the dataset ready.
Apple's Logic Pro comes with a number of audio loops bundled with the software. These loops are in .caf
format and typically come with metadata embedded.
To create APPL dataset you can follow these steps:
-
- Install Logic Pro (only available for OSX)
-
- Create a folder for the db in your data directory
-
- Extract the metadata from the
.caf
files using the software hosted in this public repository. Apple loops files are typically located in/Library/Audio/Apple Loops/Apple/
.
- Extract the metadata from the
-
- Save the
.json
file created for each loop in a folder calledmetadata
inside the dataset folder.
- Save the
-
- Run the script
create_apple_loops_dataset.py
that you'll find in thescripts
folder. You'll probably have to tweak some bits of the script to provide the full path to the original audio files.
- Run the script
python scripts/create_apple_loops_dataset.py created_db_folder_in_step_1
The script will take care of creating 44.1 kHz, mono, 16 bit PCM format versions of the files and save them into the audio/wav
folder of the dataset.
After running it, the dataset should be ready for use.
Similarly to Logic Pro, Mixcraft 7 also comes with a number of loops bundled. To create the MIXL dataset you can follow these steps:
-
- Install and open Mixcraft 7 (only available for Windows)
-
- Create an empty project
-
- Click on the
Library
tab at the bottom
- Click on the
-
- Select the
Loops
library and wait untill all sounds are imported/downloaded. When finished you can close Mixcraft.
- Select the
-
- Create a folder for the db in your data directory and inside a folder create
audio/original
subfolders.
- Create a folder for the db in your data directory and inside a folder create
-
- Copy the contents of Mixcraft loops folder in
audio/original
. Mixcraft loops folder should be found inC:\ProgramData\Acoustica\Mixcraft\loops\
.
- Copy the contents of Mixcraft loops folder in
-
- Locate the file
Loops.mldb7
(also in the same Mixcraft loops folder) and copy it to the root db folder. Rename it tooriginal_metadata.csv
.
- Locate the file
-
- Run the script
create_mixcraft_loops_db.py
that you'll find in thescripts
folder.
- Run the script
python scripts/create_mixcraft_loops_db.py created_db_folder_in_step_1
The script will take care of creating 44.1 kHz, mono, 16 bit PCM format versions of the files and save them into the audio/wav
folder of the dataset.
After running it, the dataset should be ready for use.