
Mezza Giampiccolo Bernardini Sarti #5

Open
wants to merge 8 commits into main

Conversation

ilic-mezza

Alessandro Ilic Mezza, Riccardo Giampiccolo, Alberto Bernardini, Augusto Sarti

Dear SDX workshop committee,

I request a review for the following SDX submission (Abstract: 250 Words)

As a participant in the SDX challenge, please note that your talk will be automatically accepted after minimal prescreening. Hence, don't forget to provide us with the link to your team on the AIcrowd website.
You are also welcome to submit an unrelated abstract, in case you would like to present a method or idea for discussion with the participants.

Title: StemGMD: A Large-Scale Multi-Kit Audio Dataset for Deep Drums Demixing

Author(s): Alessandro Ilic Mezza, Riccardo Giampiccolo, Alberto Bernardini, and Augusto Sarti

Challenge submission (whether this submission is related to an SDX challenge entry):

  • MDX Leaderboard A
  • MDX Leaderboard B
  • MDX Leaderboard C
  • CDX Leaderboard A
  • CDX Leaderboard B

Workshop participation

  • Virtual
  • On-Site

ORGANISATION COMMITTEE (do not fill out)

  • Editor acknowledgment
  • Reviewer 1
  • Reviewer 2
  • Review 1 decision [accept/reject]
  • Review 2 decision [accept/reject]
  • Editor decision [accept/reject]

@faroit
Contributor

faroit commented Oct 9, 2023

👋 @ilic-mezza thanks for your submission. We are going to start reviewing at the end of this week!

@StefanUhlich-sony
Collaborator

StefanUhlich-sony commented Oct 18, 2023

Hello @ilic-mezza, thanks a lot for your submission, which looks interesting. For the abstract, it would be great if you could add the link to the StemGMD dataset and a link to your reference U-Net. Furthermore, could you please specify the nine stems that you consider and how you group them to five stems for the U-Nets that you trained?

Besides this, I have the following questions (not required for the abstract - I am just curious 😊):

  • Besides using sound fonts, there would also be the possibility to use a drum sample library to create such a dataset (e.g., https://www.toontrack.com/product/ezdrummer-3/) - did you compare the quality of your sound fonts to such a library?
  • Do the sound fonts already include reverberation? Did you add reverb yourself?
  • Did you add any compressor effect? How do you avoid clipping in the mix of the five/nine sources?
  • How did you train the stereo networks? Did you assume a specific drumset layout and then used a panning technique or is this stereo information already contained in your sound fonts?

@ilic-mezza
Author

Hello @StefanUhlich-sony, thanks for reviewing our submission.

In the updated paper.md file, we specified the nine stems in StemGMD (kick drum, snare, high tom, mid-low tom, floor tom, open hi-hat, closed hi-hat, ride cymbal, crash cymbal) and the five stems separated by the U-Nets (kick drum, snare, tom-toms, hi-hat, cymbals).
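
For illustration, the nine-to-five grouping could be expressed as a simple lookup. This is only a sketch: the stem names below are placeholders, not the dataset's actual file names.

```python
import numpy as np

# Hypothetical grouping of the nine StemGMD stems into the five
# stems separated by the U-Nets; key/stem names are placeholders.
STEM_GROUPS = {
    "kick": ["kick_drum"],
    "snare": ["snare"],
    "toms": ["high_tom", "mid_low_tom", "floor_tom"],
    "hihat": ["open_hihat", "closed_hihat"],
    "cymbals": ["ride_cymbal", "crash_cymbal"],
}

def group_stems(stems):
    """Sum equal-length per-instrument signals into five grouped stems."""
    return {
        group: sum(stems[name] for name in names)
        for group, names in STEM_GROUPS.items()
    }
```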

StemGMD is very large (more than 1 TB). We are currently working with Zenodo to accommodate hosting such a large dataset. The link will be made available soon, and we will update the abstract accordingly.

The U-Net code is available here: https://github.com/polimi-ispl/larsnet

Besides using sound fonts, there would also be the possibility to use a drum sample library to create such a dataset (e.g., https://www.toontrack.com/product/ezdrummer-3/) - did you compare the quality of your sound fonts to such a library?

Thank you for pointing out the difference between "sound fonts" and "drum sample libraries." It appears that we used the term "sound font" inappropriately. In fact, StemGMD was created using drum sample libraries from Drum Kit Designer shipped with Logic Pro X, whose quality is (arguably) on par with that of EZDrummer. The abstract was updated to clarify this aspect.

Do the sound fonts already include reverberation? Did you add reverb yourself?

Yes, all tracks are sent to a bus where room reverberation is applied using sampled IRs. The IR varies across different drum kits.
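
Convolving each track with a sampled IR can be sketched as follows. This is a minimal illustration of convolution reverb, not the actual Logic Pro X bus processing; a real pipeline would use FFT-based (partitioned) convolution for long IRs.

```python
import numpy as np

def apply_room_reverb(dry, ir):
    """Convolve a dry track with a sampled room impulse response.

    Output length is len(dry) + len(ir) - 1 (full convolution),
    so the reverb tail extends past the end of the dry signal.
    """
    return np.convolve(dry, ir)
```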

Did you add any compressor effect? How do you avoid clipping in the mix of the five/nine sources?

Compression was applied as a data augmentation strategy in training the neural networks but was not used in creating StemGMD. When exporting StemGMD's audio clips, the level of each track was kept at -3 dB to avoid clipping the master channel.
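
Keeping headroom to avoid clipping the master can be sketched as a peak-attenuation step. This is hypothetical and not the authors' actual export chain, which fixes each track level at -3 dB.

```python
import numpy as np

def mix_with_headroom(tracks, headroom_db=3.0):
    """Sum tracks, then attenuate so the mix peak stays headroom_db below 0 dBFS.

    Illustrative gain staging only; quiet mixes are left untouched.
    """
    mix = np.sum(tracks, axis=0)
    peak = np.max(np.abs(mix))
    target = 10.0 ** (-headroom_db / 20.0)  # e.g. -3 dB below full scale
    if peak > target:
        mix *= target / peak
    return mix
```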

That said, we cannot rule out that compression (or other effects) was applied to the drum sounds by the library's creators.

How did you train the stereo networks? Did you assume a specific drumset layout and then used a panning technique or is this stereo information already contained in your sound fonts?

The stereo information is already contained in the drum samples. However, we apply various data augmentation methods at training time, including doubling and L/R channel swap to mitigate the limitations of the fixed drumset layout that came with Drum Kit Designer.
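
A random L/R channel swap, one of the augmentations mentioned, might look like the sketch below. The actual training pipeline (and the doubling augmentation) is not shown in this thread.

```python
import numpy as np

def random_channel_swap(x, rng):
    """Randomly swap the L/R channels of a (2, n_samples) stereo signal.

    `rng` is any object with a `random()` method returning a float
    in [0, 1); with numpy's default_rng, about half the examples
    seen during training would be swapped.
    """
    if rng.random() < 0.5:
        return x[::-1, :].copy()  # reverse the channel axis: L <-> R
    return x
```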

Collaborator

@StefanUhlich-sony StefanUhlich-sony left a comment


@ilic-mezza Thanks a lot for your answer - everything looks fine.

Contributor

@faroit faroit left a comment


Hi @ilic-mezza, for the final version of the abstract, can you please add the dataset URL as a reference or footnote?

@ilic-mezza
Author

Hi @faroit

Sure, we will add the link to the final version of the abstract. For now, we have uploaded the dataset to Zenodo and been assigned a DOI (10.5281/zenodo.7860223). However, the repository is still private as it is being prepared, and it does not have a URL yet.

Our plan is to make the dataset public by the end of the week once we're sure everything is in order. Then, we will proceed to update the abstract accordingly—would that be ok?

paper.md Outdated
bank of parallel U-Nets that separates five stems (kick drum, snare, tom-toms, hi-hat, cymbals) from a stereo drum mixture through spectro-temporal soft masking.
Such a model is meant to serve as a baseline for future research and might complement existing music demixing models.

[^1] DOI: 10.5281/zenodo.7860223
Contributor

@faroit faroit Oct 27, 2023


It seems that the footnote doesn't render nicely here.

Maybe you could create a real reference and cite it here. Also, it would be nice if the DOI were a valid hyperlink directly linking to Zenodo.

Author


OK, I will submit a new version with a real reference and a working hyperlink to Zenodo. We are re-uploading the files to Zenodo as I write, as we noticed a problem with the zip files we had already uploaded. Our hope is to be done by the end of the day; as soon as the dataset is online, I'll commit the new abstract.

@faroit
Contributor

faroit commented Oct 27, 2023

Hi @ilic-mezza, as the workshop is coming up very soon, we are finalizing the program. Stay tuned for the definitive timetable for your presentations. In the meantime, we want to clarify the recording and broadcasting rights for the presentations, so please acknowledge the following:

I hereby authorize the right and permission to copyright and/or publish, reproduce or otherwise use my name, voice, and audio-visual recordings. I acknowledge and understand these materials about or of me may be used for both commercial and/or non-commercial purposes.

I understand that my image may be edited, copied, exhibited, published and/or distributed. There is no time limit on the validity of this release nor are there any geographic limitations on where these materials may be distributed.

I authorize that the video will be made available under the CC BY-SA 4.0 license on the conference website.

and comment on this by replying with acknowledge.

@ilic-mezza
Author

acknowledge

@ilic-mezza
Author

ilic-mezza commented Oct 27, 2023

Hi @faroit,

The dataset is finally up! We updated the abstract, adding a proper reference with a Zenodo URL and DOI (10.5281/zenodo.7860223). Let us know if the BibTeX entry type renders well. In particular, we might have to change "@dataset" to "@misc".
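
For reference, a "@misc" fallback for the entry could look like the hypothetical sketch below. Field values are taken from this thread (authors, title, Zenodo DOI); "@dataset" is a biblatex entry type that classic BibTeX styles may not recognize, whereas "@misc" is universally supported.

```bibtex
% Hypothetical fallback entry; the citation key is illustrative.
@misc{mezza2023stemgmd,
  author = {Mezza, Alessandro Ilic and Giampiccolo, Riccardo and
            Bernardini, Alberto and Sarti, Augusto},
  title  = {{StemGMD}: A Large-Scale Multi-Kit Audio Dataset for
            Deep Drums Demixing},
  year   = {2023},
  doi    = {10.5281/zenodo.7860223},
  url    = {https://doi.org/10.5281/zenodo.7860223}
}
```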

Best,
Alessandro

@faroit
Contributor

faroit commented Oct 27, 2023

@ilic-mezza just checked, looks good! Thanks

@faroit
Contributor

faroit commented Oct 31, 2023

Hi @ilic-mezza,

the workshop is approaching soon; here is some information regarding your presentation:

  • The program is up and you can find your presentation slot here: https://sdx-workshop.github.io/program
  • If your presentation is virtual, please show up in the Zoom webinar at least 10 minutes before your slot and use a distinct username so that we can identify you during the presentation
  • If your presentation is on site in Milano, please send us your pdf/pptx/web slides in advance and let us know if you need audio output. You can also present from your own device, but then test the setup before the workshop starts or during a coffee break. We don't know much about the projection setup right now, so bring adapters for standard HDMI/VGA in that case.
  • You have 15 minutes in total for your presentation, including questions, so we suggest preparing slides for about 12-13 minutes.
  • Please let us know if there is any problem with your presentation by tagging us here in the respective PR

See you soon!

@ilic-mezza
Author

Hi,

Thank you for accepting our submission. Looking forward to the workshop!

As you recently published the program, I noticed that the references in our abstract don't render as I'd expected. I will commit a new version of the bib file in case you'd like to update the PDF. However, it's nothing major, so no worries if it turns out to be too much of a hassle. Also, I will take the opportunity to fix a small typo in the paper.md file.

Take care!

@faroit
Contributor

faroit commented Oct 31, 2023

@ilic-mezza Sure, will update the pdf tomorrow

@ilic-mezza
Author

Hi @faroit, do you think you'll manage to update our abstract on the program page? I've checked the latest version, and the PDF renders as intended. Thank you very much, see you at the workshop!

@ilic-mezza
Author

Hi @faroit, @StefanUhlich-sony,

We finally released both part 1 and part 2 of our dataset. On Zenodo, we reference our SDX Workshop abstract (Mezza.pdf). Therefore, it would be very nice if the program on the website showed the latest version of the abstract.

Would you mind updating it with the latest version of the PDF, i.e., the one compiled after the last two commits?

Thank you very much!
