"Whenever listeners have the impulsion to bob their heads in synchrony with music, the groove phenomenon is at work."[[24]](#24)

headbang-hud started off with the name "groove-dashboard", inspired by this paper[[17]](#17), which associates audio signal features (or MIR features) with human judgements of groove. The paper defines groove as follows:

>The experience of groove is associated with the urge to move to a musical rhythm

From this definition I was inspired to look towards the field of computer vision and pose estimation to track headbanging head motion. Strong beats are also associated with groove[[18]](#18), [[19]](#19), which ties in to the two beat tracking algorithms described previously.

## 2D motion with OpenPose
The last tool in the headbang.py project is `headbang-viz`. One of the best ways to verify beat tracking results is to sonify the beat annotations with clicks (demonstrated previously). This gets trickier with a consensus or ensemble algorithm that produces multiple candidate beat sequences, so a visual animation works better for comparing them side by side.
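As a reminder of what click sonification looks like in code, here is a minimal sketch assuming librosa and soundfile are available; the file names and the use of librosa's built-in beat tracker are placeholders for illustration, not the headbang.py pipeline:

```python
import librosa
import soundfile as sf

# load the track at its native sample rate
audio, sr = librosa.load("song.wav", sr=None)

# beat_times: beat locations in seconds, from whichever tracker is under test
# (librosa's default tracker is used here purely as a stand-in)
_, beat_times = librosa.beat.beat_track(y=audio, sr=sr, units="time")

# synthesize a click track aligned with the beats and mix it over the original
clicks = librosa.clicks(times=beat_times, sr=sr, length=len(audio))
sf.write("song_with_clicks.wav", audio + clicks, sr)
```

Listening to the mixed output makes it easy to judge whether one tracker's beats line up with perception, but comparing several trackers this way means juggling several click-annotated files at once.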
The chosen animation was to create bouncing numbers from 0-8 (representing the 6 individual beat trackers, the ConsensusBeatTracker, and the HeadbangBeatTracker) that bounce between two positions on the on- and off-beat locations. The implementation is as follows (a code sketch follows the list):

* Obtain beat locations (per algorithm), prepend 0, and append the end of the song (total duration). E.g., beats for a 3 second song = [0, 1.2, 1.5, 2.3, 2.6, 2.8, 3.0]
* Obtain off-beat locations by taking the midpoints between adjacent beat locations. E.g., off-beats for the above = [0.6, 1.35, 1.9, 2.45, 2.7, 2.9]
* Convert timestamps (in seconds) to frame indices for the total frames of the video
* Create an array of positions for all the frames of the video, initialized to NaN
* Set beat locations to 1 and off-beat locations to -1, e.g. [1, NaN, NaN, -1, NaN, NaN, 1, NaN, NaN, -1, ...]
* Use pandas interpolate to replace the NaN values with linear interpolations - e.g. [-1, NaN, NaN, 1] becomes [-1, -0.3333, 0.3333, 1]
* Draw the beat tracker's location at `center + positions[frame]*offset` to make the beat tracker bounce between `center-offset` and `center+offset`
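A minimal sketch of those steps, assuming numpy and pandas and hypothetical names (`bounce_positions`, `fps`, `total_duration`); the real headbang-viz code differs in detail:

```python
import numpy as np
import pandas as pd

def bounce_positions(beat_times, total_duration, fps):
    # prepend 0 and append the end of the song so interpolation covers every frame
    beats = np.concatenate(([0.0], np.asarray(beat_times, dtype=float), [total_duration]))

    # off-beats are the midpoints between adjacent beats
    off_beats = (beats[:-1] + beats[1:]) / 2.0

    total_frames = int(round(total_duration * fps))
    positions = np.full(total_frames, np.nan)

    # closest frame index for each timestamp
    beat_frames = np.clip(np.rint(beats * fps).astype(int), 0, total_frames - 1)
    off_frames = np.clip(np.rint(off_beats * fps).astype(int), 0, total_frames - 1)

    positions[beat_frames] = 1.0   # on-beat: one extreme of the bounce
    positions[off_frames] = -1.0   # off-beat: the other extreme

    # pandas fills the remaining NaN frames by linear interpolation
    return pd.Series(positions).interpolate(method="linear").to_numpy()

# per frame, the number is drawn at: y = center + positions[frame] * offset
```

Each tracker gets its own positions array, so all of the numbers can bounce independently in the same video.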
A potential use of `headbang-viz` is debugging particularly tricky songs - for example, Periphery's "Eureka" was shown above as a difficult case.
N. Degara, E. A. Rua, A. Pena, S. Torres-Guijarro, M. E. Davies, and M. D. Plumbley, "Reliability-informed beat tracking of musical signals," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 290–301, 2012. <https://www.eecs.qmul.ac.uk/~markp/2012/DegaraArgonesRuaPenaTDP12-taslp_accepted.pdf>
<a id="6">[6]</a>
Ellis, Daniel PW, "Beat tracking by dynamic programming," Journal of New Music Research 36.1 (2007): 51-60. <http://labrosa.ee.columbia.edu/projects/beattrack/>
<a id="7">[7]</a>
A. M. Stark, M. E. P. Davies and M. D. Plumbley, "Real-Time Beat-Synchronous Analysis of Musical Audio," Proceedings of the 12th International Conference on Digital Audio Effects (DAFx-09), Como, Italy, September 1-4, 2009. <https://www.eecs.qmul.ac.uk/~markp/2009/StarkDaviesPlumbley09-dafx.pdf>
<a id="8">[8]</a>
J. R. Zapata, A. Holzapfel, M. E. Davies, J. L. Oliveira, and F. Gouyon, "Assigning a confidence threshold on automatic beat annotation in large datasets," in International Society for Music Information Retrieval Conference (ISMIR’12), 2012. <http://mtg.upf.edu/system/files/publications/Jose_Zapata_et_al_157_ISMIR_2012.pdf>
<a id="9">[9]</a>
Fitzgerald, Derry. (2010). Harmonic/Percussive Separation using Median Filtering.

<a id="10">[10]</a>
Driedger, Jonathan & Müller, Meinard & Disch, Sascha. (2014). Extending Harmonic-Percussive Separation of Audio Signals. <https://www.audiolabs-erlangen.de/content/05-fau/assistant/00-driedger/01-publications/2014_DriedgerMuellerDisch_ExtensionsHPSeparation_ISMIR.pdf>
<a id="11">[11]</a>
Gier, H & White, P, "SPL Transient Designer, DUAL-CHANNEL, Model 9946, Manual". <https://spl.audio/wp-content/uploads/transient_designer_2_9946_manual.pdf>
<a id="12">[12]</a>
P. Masri and A. Bateman, "Improved modelling of attack transients in music analysis-resynthesis," in Proceedings of the International Computer Music Conference, 1996, pp. 100–103. <http://hans.fugal.net/comps/papers/masri_1996.pdf>

<a id="17">[17]</a>
Stupacher, Jan & Hove, Michael & Janata, Petr. (2016). Audio Features Underlying Perceived Groove and Sensorimotor Synchronization in Music. Music Perception. 33. 571-589. 10.1525/mp.2016.33.5.571. <https://www.researchgate.net/publication/291351443_Audio_Features_Underlying_Perceived_Groove_and_Sensorimotor_Synchronization_in_Music>
<a id="18">[18]</a>
Madison G, Gouyon F, Ullén F. "Musical groove is correlated with properties of the audio signal as revealed by computational modelling, depending on musical style." Proceedings of the SMC 2009—6th Sound and Music Computing Conference. 2009. p. 239–40. <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.487.1456&rep=rep1&type=pdf>
<a id="19">[19]</a>
Madison G, Gouyon F, Ullén F, Hörnström K. "Modeling the tendency for music to induce movement in humans: First correlations with low-level audio descriptors across music genres." J Exp Psychol Hum Percept Perform. 2011; 37:1578–1594. pmid:21728462. <https://www.researchgate.net/publication/51466595_Modeling_the_Tendency_for_Music_to_Induce_Movement_in_Humans_First_Correlations_With_Low-Level_Audio_Descriptors_Across_Music_Genres>
<a id="20">[20]</a>
Filip Potempski, Andrea Sabo, Kara K Patterson, "Quantifying music-dance synchrony with the application of a deep learning-based 2D pose estimator," bioRxiv 2020.10.09.333617; doi: https://doi.org/10.1101/2020.10.09.333617. <https://www.biorxiv.org/content/10.1101/2020.10.09.333617v1.full>
<a id="23">[23]</a>
Fabrizio Pedersoli, Masataka Goto, "Dance Beat Tracking from Visual Information Alone", ISMIR 2020. <https://program.ismir2020.net/poster_3-10.html>
<a id="24">[24]</a>
Senn, Olivier, Lorenz Kilchenmann, T. Bechtold and Florian Hoesl. "Groove in drum patterns as a function of both rhythmic properties and listeners' attitudes." PLoS ONE 13 (2018). <https://www.semanticscholar.org/paper/Groove-in-drum-patterns-as-a-function-of-both-and-Senn-Kilchenmann/725d3ff0530338ee264adc665377fbe966fd6723>
\large{Final project abstract. MUMT 621, March 30, 2021}\\
\large{Sevag Hanssian, 260398537}
\noindent\hrulefill
\vspace{2em}
Beat tracking is a rich field of music information retrieval (MIR). The audio beat tracking task has been a part of MIREX since 2006,\footnote{\url{https://www.music-ir.org/mirex/wiki/2006:Audio_Beat_Tracking}} and receives submissions every year. Most recently, \textcite{bock1, bock2} have achieved state-of-the-art results, and have released their algorithms in the open-source madmom Python library (\cite{madmom}).
The beat tracking algorithms in MIREX are evaluated against diverse and challenging beat tracking datasets (\cite{beatmeta}). However, in my personal experiments on my preferred genres of music -- mostly rhythmically complex progressive metal (\cite{meshuggah, periphery}) -- I noticed that in several cases the beat locations output by the best algorithms did not feel correct.
For the first goal of my final project, I propose to explore various beat tracking algorithms and pre-processing techniques to create improved (perceptually better) beat results in progressive metal songs. The name of the project is ``headbang.py''; the ``.py'' suffix is because it will be a code project written in Python, and ``headbang'' refers to the act of headbanging, where metal musicians or fans violently move their head up and down to the beat of a metal song.
\textcite{groove} state that ``whenever listeners have the impulsion to bob their heads in synchrony with music, the groove phenomenon is at work.'' Other recent papers have used 2D human pose and motion estimation to associate dance movements with musical beats (\cite{pose1, pose2}). For the second goal of headbang.py, I propose to analyze headbanging motion in metal videos with the OpenPose 2D human pose estimation library. The results of the headbanging motion analysis can be displayed alongside the results of audio beat tracking, to compare and contrast both phenomena.
A preferred method for evaluating beat tracking results is to overlay clicks on predicted beats over the original track, or to \textit{sonify} the beat annotations.\footnote{\url{https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_Sonification.html}} This helps a person to verify that the clicks line up with their own perception of beat locations in listening tests. However, sonic verification can get complicated when trying to compare competing beat trackers side by side. The final goal of headbang.py is to create a visual 2D animation to display and compare the outputs of multiple beat trackers simultaneously.