gabolsgabs edited this page Aug 8, 2018 · 13 revisions

Welcome to the DALI dataset: a large Dataset of synchronised Audio, LyrIcs and vocal notes. A detailed explanation of how DALI was created is given in:

G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. In ISMIR, Paris, France, 2018.


(C1) Corpus ID: corpus:MIR:DALI:Vocal:2018:version1.0


(A) Raw Corpus
(A1) Definition:
(A2) Type of media diffusion:


(B) Annotations
(B1) Origin:
(B21) Concepts definition:
(B22) Annotation rules:
(B31) Annotators:
(B32) Validation/reliability:
(B4) Annotation tools:


(C) Documents and Storing
(C1) Audio identifier and storage:
