Commit 037c392

add audio examples
1 parent 45c7947 commit 037c392

1 file changed

+7
-54
lines changed

audio_examples.md

Lines changed: 7 additions & 54 deletions
@@ -1,6 +1,6 @@
 ---
 layout: page
-title: Listening Test Examples
+title: Audio Examples
 ---
 <head>
 <style>
@@ -40,75 +40,28 @@ title: Listening Test Examples


 <div class="page">
-<h2>Listening Test - Realism Evaluation</h2>
-<p>We conducted a <strong>realism evaluation</strong> to focusing on how well the pitch expression and overall timbre of the synthesized samples matched those of actual violin performances. </p>
+<h2>Vocoder Quality</h2>
+<p> Here, we provide examples to illustrate the quality of the Soundstream vocoder, which sets an upper bound on the quality of our generated performances. </p>

-<p>The evaluation was carried out using the MUSHRA <cite>{1}</cite> protocol, a standardized tool for subjective audio quality assessment. Participants rated multiple models, including:</p>
-
-<ul>
-<li><strong>ViolinDiff</strong>: Our Proposed model</li>
-<li><strong>NoBend Model</strong>: Against a version of DiffSynth trained without pitch bend information</li>
-<li><strong>Hawthorne et al.</strong> <cite>{2}</cite>: A multi-instrument diffusion-based model focused on synthesizing various instruments in polyphonic contexts.</li>
-<li><strong>Maman et al.</strong> <cite>{3}</cite>: A model using performer embeddings to better capture timbre and style in orchestral instrument synthesis.</li>
-<li><strong>GM Soundfont (Low Anchor)</strong>: The lowest quality baseline, serving as an anchor point for comparison.</li>
-</ul>
-
-<h3>References</h3>
-<ol>
-<li>ITU, “Method for the subjective assessment of intermediate quality level of audio systems,” BS.1534, 2014.</li>
-<li>Hawthorne et al.“Multi-instrument music synthesis with spectrogram diffusion,” in Proceedings of the ISMIR, 2022</li>
-<li>Maman et al. "Performance conditioning for diffusion-based multi-instrument music synthesis,” in ICASSP, 2024</li>
-</ol>
-
-<p>We would like to extend our gratitude to Ben Maman for providing the essential test files, which greatly contributed to this evaluation.</p>


 <section class="audio-comparison">
 <h2>Kayser_Op20-36</h2>
 <div class="audio-container">
 <div class="audio-block">
-<p>Original Audio</p>
-<audio controls>
-<source src="original_segment_Kayser_Op20-36_performer_2.wav">
-Your browser does not support the audio element.
-</audio>
-</div>
-<div class="audio-block">
-<p>ViolinDiff</p>
-<audio controls>
-<source src="bend_segment_Kayser_Op20-36_performer_2.wav">
-Your browser does not support the audio element.
-</audio>
-</div>
-<div class="audio-block">
-<p>NoBend Model</p>
-<audio controls>
-<source src="no_bend_segment_Kayser_Op20-36_performer_2.wav">
-Your browser does not support the audio element.
-</audio>
-</div>
-<div class="audio-block">
-<p>Hawth.</p>
+<p>Original</p>
 <audio controls>
-<source src="magenta_segment_Kayser_Op20-36_performer_2.wav">
+<source src="/vocoder/Kayser_Op20-36_org_7.wav">
 Your browser does not support the audio element.
 </audio>
 </div>
 <div class="audio-block">
-<p>Maman.</p>
+<p>Vocoder</p>
 <audio controls>
-<source src="perfdiff_segment_Kayser_Op20-36_performer_2.wav">
+<source src="/vocoder/Kayser_Op20-36_vocoder_7.wav">
 Your browser does not support the audio element.
 </audio>
 </div>
-<div class="audio-block">
-<p>GM</p>
-<audio controls>
-<source src="window_gm_segment_Kayser_Op20-36_performer_2.wav">
-Your browser does not support the audio element.
-</audio>
-</div>
-</div>
 </section>

 <section class="audio-comparison">
