Fluent Speech Commands
Before downloading the Fluent Speech Commands dataset, please read the full license carefully in the Fluent Speech Commands Public License. Then download the data from here, extract it, and put it under the folder nnsp/python/wavs/ as shown in Table 1.
LibriSpeech 100-hour and 360-hour ASR corpus
The LibriSpeech ASR corpus is used to create out-of-vocabulary (OOV) data. The 100-hour and 360-hour clean datasets can be downloaded from here and here, respectively. Download and extract the data and put it under the folder nnsp/python/wavs/garb/en/ as shown in Table 1.
Qualcomm Keyword Speech Dataset
Please read the license carefully; it can be found here or here. The dataset provides four keywords, of which we use only Hi-Galaxy. Download the data, extract it to a folder named qualcomm_keyword_speech_dataset, and put that folder under nnsp/python/wavs/kws/ as shown in Table 1.
MUSAN dataset
The MUSAN (A Music, Speech, and Noise Corpus) dataset can be downloaded from here. Download the data, extract it to a folder named musan, and put that folder under nnsp/python/wavs/noise/ as shown in Table 1.
THCHS-30 dataset
THCHS-30 is an open Chinese speech database published by the Center for Speech and Language Technology (CSLT) at Tsinghua University. You can download the data from here. Extract it to a folder named data_thchs30 and put that folder under nnsp/python/wavs/garb/cn/ as shown in Table 1.

nnsp/ # root 
    evb/ 
    ns-nnsp/  
    python/   
        wavs/
            speakers/ # (1)--Fluent Speech Commands dataset
            garb/
                en/LibriSpeech/
                    train-clean-100/ # (2)--LibriSpeech 100-hour ASR corpus
                    train-clean-360/ # (2)--LibriSpeech 360-hour ASR corpus
                cn/data_thchs30/    # (5)--THCHS-30 dataset
            kws/qualcomm_keyword_speech_dataset # (3)--Qualcomm Keyword Speech Dataset
            noise/musan/ # (4)--MUSAN dataset
    README.md

Table 1: Directory layout of NNSP, showing where each dataset is placed
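
Once the datasets are extracted, the layout in Table 1 can be checked with a short script. The following is a minimal sketch and not part of NNSP: the names `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical, and the script assumes it is run from the directory that contains nnsp/. It only checks that the folders from Table 1 exist; it does not validate their contents.

```python
from pathlib import Path

# Folders expected under nnsp/python/wavs/, copied from Table 1.
# (Hypothetical helper for illustration; not shipped with NNSP.)
EXPECTED_DIRS = [
    "speakers",                             # (1) Fluent Speech Commands
    "garb/en/LibriSpeech/train-clean-100",  # (2) LibriSpeech 100-hour
    "garb/en/LibriSpeech/train-clean-360",  # (2) LibriSpeech 360-hour
    "kws/qualcomm_keyword_speech_dataset",  # (3) Qualcomm Keyword Speech
    "noise/musan",                          # (4) MUSAN
    "garb/cn/data_thchs30",                 # (5) THCHS-30
]


def missing_dataset_dirs(wavs_root):
    """Return the expected folders that do not exist under wavs_root."""
    root = Path(wavs_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Assumed location of the wavs folder relative to the current directory.
    missing = missing_dataset_dirs("nnsp/python/wavs")
    if missing:
        print("Missing dataset folders:")
        for d in missing:
            print("  " + d)
    else:
        print("All dataset folders from Table 1 are present.")
```

If any folder is reported as missing, re-check that the corresponding archive was extracted to the folder name given in its section above.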
