SunnyCYC/dPLP

dPLP: A Differentiable Version of Predominant Local Pulse Estimation

This is the repository for the ISMIR paper "dPLP: A Differentiable Version of Predominant Local Pulse Estimation". Below, we describe the scripts for training the models, running inference, peak picking, and evaluation.

Organization of Folders

  • Datasets: as mentioned in the paper, this repo also includes 100 pop tracks from GTZAN[1] for the experiments, organized as follows:
    • dataset/gtzan:
      • downbeats: the beat/downbeat annotations
      • audio: place the GTZAN audio files in this folder, named as listed in ./dataset/gtzan/audio/audio_files.txt
      • train-info: the train/test/valid splits
  • Scripts: all .py scripts can be run directly from the project folder. The following sections describe their order and purpose.
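
Before training, it can help to confirm that the audio folder matches audio_files.txt. The following is a minimal sketch, not part of the repository. It assumes that audio_files.txt lists one file name per line; the helper name find_missing_audio is hypothetical.

```python
from pathlib import Path


def find_missing_audio(audio_dir):
    """Return names listed in audio_files.txt that are absent from audio_dir."""
    audio_dir = Path(audio_dir)
    listing = audio_dir / "audio_files.txt"
    # Assumption: one file name per line; blank lines are ignored.
    expected = [ln.strip() for ln in listing.read_text().splitlines() if ln.strip()]
    return [name for name in expected if not (audio_dir / name).is_file()]


if __name__ == "__main__":
    audio_dir = Path("dataset/gtzan/audio")
    if (audio_dir / "audio_files.txt").is_file():
        missing = find_missing_audio(audio_dir)
        print(f"{len(missing)} audio file(s) missing")
        for name in missing:
            print("  ", name)
    else:
        print("audio_files.txt not found; run this from the project folder")
```

Run it from the project folder; an empty list of missing files means the audio folder matches the listing.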

Training:

  • Run the following scripts in the order listed to train your own SFX, M1, M2, and M3 models. These scripts create a folder at ./experiments/ and save the trained models there. One version of the pretrained models is also provided in ./pretrained.
  • train-sfx.py: train the spectral flux model using the train/test/valid split in dataset/gtzan/train-info.
  • train-M1.py: train the M1 model using the same train/test/valid split mentioned above.
  • train-M2.py: train the M2 model using the same train/test/valid split mentioned above. Note that M2 requires a pretrained SFX model (its directory should be specified at line 328 of the script).
  • train-M3.py: train the M3 model using the same train/test/valid split mentioned above.
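
The training order above can be sketched as a small driver script. This is only an illustration under the assumption that each script runs without command-line arguments from the project folder; build_commands and run_all are hypothetical helpers, not part of the repository. train-M2.py still needs its pretrained SFX directory set by hand before it runs.

```python
import subprocess
import sys

# Script names are taken from the list above, in the order given there.
TRAINING_SCRIPTS = ["train-sfx.py", "train-M1.py", "train-M2.py", "train-M3.py"]


def build_commands(scripts=TRAINING_SCRIPTS, python=sys.executable):
    """Return one command (argument list) per training script, in order."""
    return [[python, script] for script in scripts]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, run it and stop on failure."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    # Dry run by default; pass dry_run=False to actually start training.
    run_all(build_commands(), dry_run=True)
```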

Inferencing:

  • The following scripts create a folder at dataset/gtzan/novelty and save the generated novelty functions there.
  • inference-dSFX.py: derive beat novelty functions using the initial/trained dSFX models.
  • inference-dPLP.py: derive beat novelty functions using the trained M1, M2, and M3 models.

Generate argmax and softmax PLP curves for comparison:

genPLP4comparison.py: This script reads the novelty functions generated by the trained SFX model (using inference-dSFX.py) and generates argmax PLP and softmax PLP curves. The generated PLP curves are saved in a newly created folder at ./datasets/gtzan/plp-curves.
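
To illustrate the difference between the two variants: conventional PLP selects, for each frame, the tempo with the largest tempogram magnitude (a hard argmax), whereas a softmax replaces that hard choice with smooth weights over all tempo bins. The sketch below shows only this general contrast on made-up numbers. It is not the code in genPLP4comparison.py, and the magnitudes and the temperature parameter are assumptions chosen for illustration.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax; a lower temperature approaches a one-hot argmax."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def argmax_one_hot(values):
    """Hard selection: weight 1 on the largest entry, 0 elsewhere."""
    best = max(range(len(values)), key=values.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(values))]


# Hypothetical tempogram magnitudes for one frame, one value per tempo bin.
magnitudes = [0.2, 0.5, 2.0, 0.4]

hard = argmax_one_hot(magnitudes)            # all weight on a single tempo bin
soft = softmax(magnitudes, temperature=0.5)  # smooth weights that sum to 1

print("argmax weights: ", hard)
print("softmax weights:", [round(w, 3) for w in soft])
```

The hard weights have zero gradient almost everywhere, which is why a smooth weighting is the usual route to a differentiable version.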

Peak picking to derive beat estimates:

  • The following scripts read novelty functions or PLP curves and apply peak picking to derive beat estimates.
  • peak-picking-plpcompare.py: This script reads all PLP curves saved at ./datasets/gtzan/plp-curves, applies peak picking, and saves the beat estimates at ./datasets/gtzan/beat_estimations_plpcompare.
  • peak-picking.py: This script reads all novelty functions saved in ./dataset/gtzan/novelty and applies peak picking to derive corresponding beat estimates. Estimates will be saved to ./dataset/gtzan/beat_estimations.
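
As a rough picture of what peak picking does, the following sketch selects local maxima of a curve and converts frame indices to times in seconds. It is a simplified stand-in, not the method used by peak-picking.py; the function name pick_peaks and the threshold, min_gap_s, and frame_rate values are assumptions for illustration.

```python
def pick_peaks(novelty, frame_rate, threshold=0.1, min_gap_s=0.2):
    """Return times (in seconds) of local maxima above threshold.

    Peaks closer than min_gap_s to the previously accepted peak are skipped,
    so the earlier of two nearby peaks wins (a deliberately simple rule).
    """
    times = []
    for i in range(1, len(novelty) - 1):
        is_local_max = novelty[i - 1] < novelty[i] >= novelty[i + 1]
        if not is_local_max or novelty[i] < threshold:
            continue
        t = i / frame_rate  # frame index -> seconds
        if times and t - times[-1] < min_gap_s:
            continue
        times.append(t)
    return times


# Toy curve sampled at 10 frames per second (assumed values).
curve = [0.0, 0.2, 0.9, 0.3, 0.1, 0.05, 0.6, 0.8, 0.2, 0.0]
beats = pick_peaks(curve, frame_rate=10.0)
print(beats)  # [0.2, 0.7]
```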

Evaluation

evaluate.py: This script reads the beat estimates and saves all results in ./evaluation. You may modify the directories in the script to evaluate the beat estimates of argmax/softmax PLP.
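
For orientation, a common beat tracking metric is the F-measure with a tolerance window of ±70 ms around each reference beat. The sketch below is a simplified version of that idea; which metrics evaluate.py actually computes is not stated here, and the function name beat_f_measure and the example beat times are assumptions.

```python
def beat_f_measure(reference, estimated, tolerance=0.07):
    """F-measure with one-to-one matching of beats within +/- tolerance seconds."""
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = hits = 0
    while i < len(ref) and j < len(est):
        diff = est[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1  # matched pair; each beat is used at most once
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # estimate is too early for every remaining reference beat
        else:
            i += 1  # no remaining estimate is close enough to this reference beat
    if hits == 0:
        return 0.0
    precision = hits / len(est)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


# Made-up beat times in seconds: three of the four estimates fall within 70 ms.
reference_beats = [0.5, 1.0, 1.5, 2.0]
estimated_beats = [0.52, 1.0, 1.62, 2.03]
print(beat_f_measure(reference_beats, estimated_beats))
```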

License

This project is licensed under the MIT License - see the LICENSE file for details.

References

[1] G. Tzanetakis and P. Cook, “Musical genre classification of audio signals,” IEEE Trans. Speech and Audio Processing, vol. 10, no. 5, pp. 293–302, 2002.
