
Commit 606ef05

Add more content, improve docs again
1 parent f5cca25 commit 606ef05

8 files changed (+336 -43 lines changed)

docs/index.md (+59 -18)

@@ -216,6 +216,11 @@ Here's a table of some interesting outputs of headbang's algorithms:
     <td>{% include embed-audio.html src="eureka_dbn.wav" %}</td>
     <td>{% include embed-audio.html src="eureka_hbt.wav" %}</td>
   </tr>
+  <tr>
+    <td><a href="https://www.youtube.com/watch?v=8saKHKt1A5Q">Animals as Leaders - Lippincott</a></td>
+    <td>{% include embed-audio.html src="lippincott_dbn.wav" %}</td>
+    <td>{% include embed-audio.html src="lippincott_hbt.wav" %}</td>
+  </tr>
   </tbody>
 </table>

@@ -323,13 +328,13 @@ usage: headbang-hud [-h] [--keypoints KEYPOINTS]
 
 ## Groove
 
-headbang-hud started off as groove-dashboard, inspired by this paper[[17]](#17), which associates audio signal features or MIR features to human judgements of groove. The paper defines groove as follows:
+"Whenever listeners have the impulsion to bob their heads in synchrony with music, the groove phenomenon is at work."[[24]](#24)
 
->The experience of groove is associated with the urge to move to a musical rhythm
+headbang-hud started off with the name "groove-dashboard", inspired by this paper[[17]](#17), which associates audio signal features or MIR features to human judgements of groove. The paper defines groove as follows:
 
-From this definition I was inspired to look towards the field of computer vision and pose estimation to track the motion jointly with musical measures of groove.
+>The experience of groove is associated with the urge to move to a musical rhythm
 
-Strong beats are also associated with groove[[18]](#18), [[19]](#19), which ties in to the two beat tracking algorithms described previously.
+From this definition I was inspired to look towards the field of computer vision and pose estimation to track headbanging head motion. Strong beats are also associated with groove[[18]](#18), [[19]](#19), which ties in to the two beat tracking algorithms described previously.
 
 ## 2D motion with OpenPose
 
@@ -498,14 +503,47 @@ The two-pass design was chosen out of necessity; keeping all of the frames of th
 
 The last tool in the headbang.py project is `headbang-viz`. One of the best ways to verify beat tracking results is sonification of the beat annotations with clicks (demonstrated previously). This is trickier in a consensus or ensemble algorithm with multiple candidates. A visual animation would work better.
 
-The chosen animation was to create bouncing numbers from 0-8 (representing the 6 individual beat trackers, the ConsensusBeatTracker, and the HeadbangBeatTracker), that bounce between two positions on the on- and off-beat locations. The implementation is something like this:
-* Obtain beat locations (per algorithm), prepend 0 and append the end of the song (total duration). E.g., beats for a 3 second song = [0, 1.2, 1.5, 2.3, 2.6, 2.8, 3.0]
-* Obtain off-beat locations by taking midpoints between every beat location. E.g, off-beats for the above = [0.6, 1.35, 1.9, 2.45, 2.7, 2.9]
-* Convert timestamps (in seconds) to frame indices for the total frames of the video
-* Create an array of positions for all the frames of the video, initialized to NaN
-* Set beat locations to 1 and off-beat locations to -1, e.g. [1, NaN, NaN, -1, NaN, NaN, 1, NaN, NaN, -1, ...]
-* Use pandas interpolate to replace the NaN values with interpolations - e.g. [-1, NaN, NaN, 1] becomes [-1, -0.6667, -0.3333, 1]
-* Draw the beat tracker's location at `center + positions[frame]*offset` to make the beat tracker bounce between `center-offset` and `center+offset`
+The chosen animation was to create bouncing numbers from 0-8 (representing the 6 individual beat trackers, the ConsensusBeatTracker, and the HeadbangBeatTracker), which bounce between two positions at the on- and off-beat locations. The implementation is as follows:
+* Convert beat times (in seconds) to frames by finding the closest frame; prepend 0 and append the end of the song (a sketch of a possible `find_closest` helper follows this list):
+```python
+times_vector = numpy.arange(0, total_duration, frame_duration)
+
+on_beat_frames = numpy.concatenate((
+    numpy.zeros(1),
+    find_closest(times_vector, beat_times),
+    numpy.ones(1)*(total_frames-1),
+))
+```
+* Obtain off-beat locations by taking midpoints between every beat location:
+```python
+off_beat_frames = [
+    ((x[1:] + x[:-1]) / 2).astype(numpy.int) for x in on_beat_frames
+]
+```
+* Create an array of positions for all the frames of the video. Set beat locations to 1, off-beat locations to -1, and all other times to NaN:
+```python
+x = (
+    numpy.empty(
+        total_frames,
+    )
+    * numpy.nan
+)
+
+x[on_beat_frames[i]] = 1
+x[off_beat_frames[i]] = -1
+```
+* Use pandas interpolate to replace the NaN values with interpolations:
+```python
+a = pd.Series(x)
+positions = a.interpolate().to_numpy()
+```
+* Draw the beat tracker visual marker around a center point offset by the beat positions per frame:
+```python
+current_position = (
+    center[0],
+    int(center[1] + (box_height / 2 - 100) * positions[frame]),
+)
+```
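The `find_closest` helper in the first snippet is not defined in this hunk. Below is a minimal sketch of one possible nearest-frame implementation, together with a tiny end-to-end run of the steps above simplified to a single beat tracker; the helper and the example values are assumptions for illustration, not the project's actual code:

```python
import numpy
import pandas as pd


def find_closest(times_vector, beat_times):
    # assumed helper: index of the nearest frame timestamp for each beat time
    # (quadratic in memory, which is fine for a short example clip)
    diffs = numpy.abs(times_vector[None, :] - numpy.asarray(beat_times)[:, None])
    return diffs.argmin(axis=1)


# made-up example: a 3 second clip at 10 fps with beats at 1.2s and 2.4s
fps = 10
total_duration = 3.0
frame_duration = 1.0 / fps
total_frames = int(total_duration * fps)
beat_times = [1.2, 2.4]

times_vector = numpy.arange(0, total_duration, frame_duration)
on_beat_frames = numpy.concatenate((
    numpy.zeros(1),
    find_closest(times_vector, beat_times),
    numpy.ones(1) * (total_frames - 1),
)).astype(int)
off_beat_frames = ((on_beat_frames[1:] + on_beat_frames[:-1]) / 2).astype(int)

# 1 on the beats, -1 on the off-beats, NaN elsewhere, then linearly interpolated
positions = numpy.empty(total_frames) * numpy.nan
positions[on_beat_frames] = 1
positions[off_beat_frames] = -1
positions = pd.Series(positions).interpolate().to_numpy()

# marker position for one frame, mirroring the last bullet above
center, box_height, frame = (640, 360), 400, 15
current_position = (
    center[0],
    int(center[1] + (box_height / 2 - 100) * positions[frame]),
)
print(positions)
print(current_position)
```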
 
 A potential use of `headbang-viz` is debugging some particularly tricky songs - for example, Periphery - Eureka was shown above as a difficult case:
 
@@ -542,12 +580,12 @@ J. Zapata, M. Davies and E. Gómez, "Multi-feature beat tracker," IEEE/ACM Trans
 N. Degara, E. A. Rua, A. Pena, S. Torres-Guijarro, M. E. Davies, and M. D. Plumbley, "Reliability-informed beat tracking of musical signals," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 290–301, 2012. <https://www.eecs.qmul.ac.uk/~markp/2012/DegaraArgonesRuaPenaTDP12-taslp_accepted.pdf>
 
 <a id="6">[6]</a>
-Ellis, Daniel PW. “Beat tracking by dynamic programming.” Journal of New Music Research 36.1 (2007): 51-60. <http://labrosa.ee.columbia.edu/projects/beattrack/>
+Ellis, Daniel PW, "Beat tracking by dynamic programming," Journal of New Music Research 36.1 (2007): 51-60. <http://labrosa.ee.columbia.edu/projects/beattrack/>
 
 <a id="7">[7]</a>
-Real-Time Beat-Synchronous Analysis of Musical Audio, A. M. Stark, M. E. P. Davies and M. D. Plumbley. In Proceedings of the 12th International Conference on Digital Audio Effects (DAFx-09), Como, Italy, September 1-4, 2009. <https://www.eecs.qmul.ac.uk/~markp/2009/StarkDaviesPlumbley09-dafx.pdf>
+A. M. Stark, M. E. P. Davies and M. D. Plumbley, "Real-Time Beat-Synchronous Analysis of Musical Audio," Proceedings of the 12th International Conference on Digital Audio Effects (DAFx-09), Como, Italy, September 1-4, 2009. <https://www.eecs.qmul.ac.uk/~markp/2009/StarkDaviesPlumbley09-dafx.pdf>
 
-<a id="8">[8]></a>
+<a id="8">[8]</a>
 J. R. Zapata, A. Holzapfel, M. E. Davies, J. L. Oliveira, and F. Gouyon, "Assigning a confidence threshold on automatic beat annotation in large datasets," in International Society for Music Information Retrieval Conference (ISMIR’12), 2012. <http://mtg.upf.edu/system/files/publications/Jose_Zapata_et_al_157_ISMIR_2012.pdf>
 
 <a id="9">[9]</a>
@@ -557,7 +595,7 @@ Fitzgerald, Derry. (2010). Harmonic/Percussive Separation using Median Filtering
 Driedger, Jonathan & Müller, Meinard & Disch, Sascha. (2014). Extending Harmonic-Percussive Separation of Audio Signals. <https://www.audiolabs-erlangen.de/content/05-fau/assistant/00-driedger/01-publications/2014_DriedgerMuellerDisch_ExtensionsHPSeparation_ISMIR.pdf>
 
 <a id="11">[11]</a>
-Gier, H & Paul White, "SPL Transient Designer, DUAL-CHANNEL, Model 9946, Manual". <https://spl.audio/wp-content/uploads/transient_designer_2_9946_manual.pdf>
+Gier, H & White, P, "SPL Transient Designer, DUAL-CHANNEL, Model 9946, Manual". <https://spl.audio/wp-content/uploads/transient_designer_2_9946_manual.pdf>
 
 <a id="12">[12]</a>
 P. Masri and A. Bateman, “Improved modelling of attack transients in music analysis-resynthesis,” in Proceedings of the International Computer Music Conference, 1996, pp. 100–103. <http://hans.fugal.net/comps/papers/masri_1996.pdf>
@@ -578,10 +616,10 @@ Matthew E. P. Davies, Norberto Degara, and Mark D. Plumbley. "Evaluation Method
 Stupacher, Jan & Hove, Michael & Janata, Petr. (2016). Audio Features Underlying Perceived Groove and Sensorimotor Synchronization in Music. Music Perception. 33. 571-589. 10.1525/mp.2016.33.5.571. <https://www.researchgate.net/publication/291351443_Audio_Features_Underlying_Perceived_Groove_and_Sensorimotor_Synchronization_in_Music>
 
 <a id="18">[18]</a>
-Madison G, Gouyon F, Ullen F. Musical groove is correlated with properties of the audio signal as revealed by computational modelling, depending on musical style. In: Proceedings of the SMC 2009—6th Sound and Music Computing Conference. 2009. p. 239–40. <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.487.1456&rep=rep1&type=pdf>
+Madison G, Gouyon F, Ullen F. "Musical groove is correlated with properties of the audio signal as revealed by computational modelling, depending on musical style." Proceedings of the SMC 2009—6th Sound and Music Computing Conference. 2009. p. 239–40. <https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.487.1456&rep=rep1&type=pdf>
 
 <a id="19">[19]</a>
-Madison G, Gouyon F, Ullén F, Hörnström K. Modeling the tendency for music to induce movement in humans: First correlations with low-level audio descriptors across music genres. J Exp Psychol Hum Percept Perform. 2011; 37:1578–1594. pmid:21728462. <https://www.researchgate.net/publication/51466595_Modeling_the_Tendency_for_Music_to_Induce_Movement_in_Humans_First_Correlations_With_Low-Level_Audio_Descriptors_Across_Music_Genres>
+Madison G, Gouyon F, Ullén F, Hörnström K. "Modeling the tendency for music to induce movement in humans: First correlations with low-level audio descriptors across music genres." J Exp Psychol Hum Percept Perform. 2011; 37:1578–1594. pmid:21728462. <https://www.researchgate.net/publication/51466595_Modeling_the_Tendency_for_Music_to_Induce_Movement_in_Humans_First_Correlations_With_Low-Level_Audio_Descriptors_Across_Music_Genres>
 
 <a id="20">[20]</a>
 Quantifying music-dance synchrony with the application of a deep learning-based 2D pose estimator. Filip Potempski, Andrea Sabo, Kara K Patterson. bioRxiv 2020.10.09.333617; doi: https://doi.org/10.1101/2020.10.09.333617. <https://www.biorxiv.org/content/10.1101/2020.10.09.333617v1.full>
@@ -594,3 +632,6 @@ Schindler, Alexander. (2020). Multi-Modal Music Information Retrieval: Augmentin
 
 <a id="23">[23]</a>
 Fabrizio Pedersoli, Masataka Goto, "Dance Beat Tracking from Visual Information Alone", ISMIR 2020. <https://program.ismir2020.net/poster_3-10.html>
+
+<a id="24">[24]</a>
+Senn, Olivier, Lorenz Kilchenmann, T. Bechtold and Florian Hoesl. "Groove in drum patterns as a function of both rhythmic properties and listeners' attitudes." PLoS ONE 13 (2018). <https://www.semanticscholar.org/paper/Groove-in-drum-patterns-as-a-function-of-both-and-Senn-Kilchenmann/725d3ff0530338ee264adc665377fbe966fd6723>

docs/lippincott_dbn.wav (5.05 MB): Binary file not shown.

docs/lippincott_hbt.wav (5.05 MB): Binary file not shown.

headbang/hud_tool.py (+1 -1)

@@ -344,7 +344,7 @@ def process_second_pass(get_frame_fn, frame_time):
     print("run a gc, just in case...")
     gc.collect()
 
-    out_clip_tmp = VideoFileClip(tmp_mp4)
+    out_clip_tmp = VideoFileClip(tmp_mp4, fps_source="fps")
     out_clip2 = out_clip_tmp.fl(process_second_pass)
 
     audio_clip = AudioFileClip(video_path)

headbang/viz_tool.py (+1 -1)

@@ -77,7 +77,7 @@ def main():
         ((x[1:] + x[:-1]) / 2).astype(numpy.int) for x in all_beat_frames
     ]
 
-    all_positions = [] # []
+    all_positions = []
     for i in range(len(all_beat_frames)):
         x = (
             numpy.empty(
New file (+62 lines)

@@ -0,0 +1,62 @@
+\documentclass[letter,12pt]{report}
+%\setlength{\parindent}{0pt}
+\usepackage[left=2cm, right=2cm, top=2cm, bottom=2cm]{geometry}
+\usepackage[shortlabels]{enumitem}
+\usepackage{graphicx}
+\usepackage{amsmath}
+\usepackage{amssymb}
+\usepackage{verbatim}
+\usepackage{listings}
+\usepackage{minted}
+\usepackage{subfig}
+\usepackage{titling}
+\usepackage{caption}
+\setlength{\droptitle}{1cm}
+\usepackage{hyperref}
+\hypersetup{
+    colorlinks,
+    citecolor=black,
+    filecolor=black,
+    linkcolor=black,
+    urlcolor=black
+}
+\usepackage{setspace}
+\renewcommand{\topfraction}{0.85}
+\renewcommand{\textfraction}{0.1}
+\renewcommand{\floatpagefraction}{0.75}
+\usepackage[backend=biber,authordate]{biblatex-chicago}
+\addbibresource{citations.bib}
+\usepackage{titlesec}
+
+\titleformat{\chapter}[display]
+  {\normalfont\bfseries}{}{0pt}{\Huge}
+
+\begin{document}
+
+\noindent\Large{\textbf{headbang.py}}\\
+\large{Final project abstract. MUMT 621, March 30, 2021}\\
+\large{Sevag Hanssian, 260398537}
+
+\noindent\hrulefill
+
+\vspace{2em}
+
+Beat tracking is a rich field of music information retrieval (MIR). The audio beat tracking task has been a part of MIREX since 2006,\footnote{\url{https://www.music-ir.org/mirex/wiki/2006:Audio_Beat_Tracking}} and receives submissions every year. Most recently, \textcite{bock1, bock2} have achieved state of the art results, and have released their algorithms in the open-source madmom Python library (\cite{madmom}).
+
+The beat tracking algorithms in MIREX are evaluated against diverse and challenging beat tracking datasets (\cite{beatmeta}). However, in my personal experiments on my preferred genres of music -- mostly rhythmically-complex progressive metal (\cite{meshuggah, periphery}) -- I noticed that in several cases the beat locations output by the best algorithms did not feel correct.
+
+For the first goal of my final project, I propose to explore various beat tracking algorithms and pre-processing techniques to create improved (perceptually better) beat results in progressive metal songs. The name of the project is ``headbang.py''; the ``.py'' suffix is because it will be a code project written in Python, and ``headbang'' refers to the act of headbanging, where metal musicians or fans violently move their head up and down to the beat of a metal song.
+
+\textcite{groove} state that ``whenever listeners have the impulsion to bob their heads in synchrony with music, the groove phenomenon is at work.'' Other recent papers have used 2D human pose and motion estimation to associate dance movements with musical beats (\cite{pose1, pose2}). For the second goal of headbang.py, I propose to analyze headbanging motion in metal videos with the OpenPose 2D human pose estimation library. The results of the headbanging motion analysis can be displayed alongside the results of audio beat tracking, to compare and contrast both phenomena.
+
+A preferred method for evaluating beat tracking results is to overlay clicks on predicted beats over the original track, or to \textit{sonify} the beat annotations.\footnote{\url{https://www.audiolabs-erlangen.de/resources/MIR/FMP/B/B_Sonification.html}} This helps a person to verify that the clicks line up with their own perception of beat locations in listening tests. However, sonic verification can get complicated when trying to compare competing beat trackers side by side. The final goal of headbang.py is to create a visual 2D animation to display and compare the outputs of multiple beat trackers simultaneously.
+
+\vfill
+\clearpage
+
+\nocite{*}
+\printbibheading[title={\vspace{-3.5em}References},heading=bibnumbered]
+\vspace{-1.5em}
+\printbibliography[heading=none]
+
+\end{document}
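The abstract mentions overlaying clicks on predicted beats to sonify beat annotations. A minimal sketch of that evaluation step is shown below; it is not code from this commit, it assumes the librosa and soundfile packages, and the file names and beat times are hypothetical:

```python
import librosa
import soundfile

# load the original track (hypothetical file name)
y, sr = librosa.load("song.wav", sr=None)

# beat times in seconds, normally produced by a beat tracker
beat_times = [0.52, 1.04, 1.55, 2.07]

# synthesize clicks at the beat locations and mix them over the original audio
clicks = librosa.clicks(times=beat_times, sr=sr, length=len(y))
soundfile.write("song_with_clicks.wav", 0.8 * y + 0.5 * clicks, sr)
```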
