Fluent speech commands
To download the Fluent Speech Commands dataset, first read the full license carefully in the Fluent Speech Commands Public License. Then download the data from here, extract it, and put it under the folder nnsp/python/wavs/ as shown in Table 1.

LibriSpeech 100-hour and 360-hour ASR corpus
The LibriSpeech ASR corpus is used to make out-of-vocabulary (OOV) data. The 100-hour and 360-hour clean datasets can be downloaded from here and here, respectively. Download and extract the data and put it under the folder nnsp/python/wavs/garb/en/ as shown in Table 1.

Qualcomm Keyword Speech Dataset
Please read the license carefully; it can be found here or here. The dataset provides 4 keywords, of which we use only Hi-Galaxy. Download and extract the data to a folder named qualcomm_keyword_speech_dataset and put it under the folder nnsp/python/wavs/kws/ as shown in Table 1.

MUSAN dataset
The MUSAN (A Music, Speech, and Noise Corpus) dataset can be downloaded from here. Download and extract the data to a folder named musan and put it under the folder nnsp/python/wavs/noise/ as shown in Table 1.

THCHS-30 dataset
THCHS-30 is an open Chinese speech database published by the Center for Speech and Language Technology (CSLT) at Tsinghua University. You can download the data from here. Extract it to a folder named data_thchs30 and put it under the folder nnsp/python/wavs/garb/cn/ as shown in Table 1.

nnsp/ # root
    evb/
    ns-nnsp/
    python/
        wavs/
            speakers/ # (1) Fluent speech commands dataset
            garb/
                en/LibriSpeech/
                    train-clean-100/ # (2)--LibriSpeech 100-hour ASR corpus
                    train-clean-360/ # (2)--LibriSpeech 360-hour ASR corpus
                cn/data_thchs30/ # (5)--THCHS-30 dataset
            kws/qualcomm_keyword_speech_dataset # (3)--Qualcomm Keyword Speech Dataset
            noise/musan/ # (4)--MUSAN dataset
    README.md
Table 1: Directory layout of NNSP with the datasets in place
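
Since the datasets are downloaded and extracted by hand, it can help to confirm that every folder from Table 1 is in place before going further. The following is a minimal sketch, not part of NNSP itself: the names EXPECTED_DIRS and find_missing_datasets are hypothetical, and it assumes the script is run from the directory that contains nnsp/ and that each archive extracts to the folder names listed in Table 1.

```python
from pathlib import Path

# Expected dataset folders relative to nnsp/python/wavs/, copied from Table 1.
# EXPECTED_DIRS is a hypothetical helper name, not something NNSP defines.
EXPECTED_DIRS = {
    "Fluent Speech Commands": "speakers",
    "LibriSpeech 100-hour": "garb/en/LibriSpeech/train-clean-100",
    "LibriSpeech 360-hour": "garb/en/LibriSpeech/train-clean-360",
    "Qualcomm Keyword Speech Dataset": "kws/qualcomm_keyword_speech_dataset",
    "MUSAN": "noise/musan",
    "THCHS-30": "garb/cn/data_thchs30",
}


def find_missing_datasets(wavs_root):
    """Return {dataset name: expected path} for every folder that is absent."""
    root = Path(wavs_root)
    return {
        name: root / rel
        for name, rel in EXPECTED_DIRS.items()
        if not (root / rel).is_dir()
    }


if __name__ == "__main__":
    # Assumption: the current working directory contains the nnsp/ root.
    missing = find_missing_datasets("nnsp/python/wavs")
    if missing:
        for name, path in missing.items():
            print(f"missing: {name} (expected at {path})")
    else:
        print("all dataset folders from Table 1 are present")
```

The check only looks at folder names; it does not verify that the contents of each dataset are complete.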