Evaluation

We evaluate our method on four datasets. The datasets directory contains the evaluation datasets, and the models directory contains the inference code and inference results for each model.

If you want to reproduce the results reported in the paper, run evaluate.py in this directory and comment out or uncomment the lines at the end of the file as needed. (The cache directory contains the audio length data, so you do not have to download the datasets just to reproduce the evaluation results.)
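
evaluate.py implements the metrics reported in the paper. Purely as an illustration of the kind of segment-level scoring involved, the self-contained sketch below computes a frame-level precision/recall/F1 from ground-truth and predicted laughter segments; it is a toy example and not necessarily the exact metric evaluate.py uses.

```python
# Illustrative only: a frame-level precision/recall/F1 over laughter segments.
# evaluate.py defines the actual metrics used in the paper, which may differ.

def to_frames(segments, duration, hop=0.01):
    """Convert (start, end) segments in seconds into a boolean frame sequence."""
    n = int(duration / hop)
    frames = [False] * n
    for start, end in segments:
        for i in range(int(start / hop), min(int(end / hop), n)):
            frames[i] = True
    return frames

def precision_recall_f1(gt_segments, pred_segments, duration):
    gt = to_frames(gt_segments, duration)
    pred = to_frames(pred_segments, duration)
    tp = sum(g and p for g, p in zip(gt, pred))
    fp = sum(p and not g for g, p in zip(gt, pred))
    fn = sum(g and not p for g, p in zip(gt, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(precision_recall_f1([(1.0, 2.5)], [(1.2, 2.4), (5.0, 5.5)], duration=10.0))
```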

If you want to evaluate your own model or run the inference from scratch, follow these steps:

  1. Get datasets
    • Some data is not included in this repository for licensing reasons, but you can download it as described below.
    • Petridis2013
      1. Download the 130 audio files for which the Spontaneous Laughter filter is Yes, and save them in evaluation/datasets/Petridis2013/audio.
      • The annotation data was created as follows. Since the ground-truth data is already included in the gt directory, you normally do not need to repeat these steps.
      1. Download the annotation data from here and here.
      2. Merge the VoicedLaughter, Speechlaughter, and PosedLaughter annotations into a single CSV without a header (see the Petridis2013 sketch after this list).
      3. Save it as annotations.csv in evaluation/datasets/Petridis2013/original_anotation_data.
      4. Run extract_laughter.py in the dataset directory.
    • McCowan2005
      • This is an evaluation dataset derived from the AMI Corpus. You need to download the audio yourself.
      1. Download the Headset mix audio for the meetings in the SC list (ES2004, ES2014, IS1009, TS3003, TS3007, EN2002) of the "Full-corpus partition of meetings" on this page. Download them from here and save them all directly in ./audio, without any directory hierarchy.
      • The annotation data was collected as follows. Since the ground-truth data is already included in the gt directory, you normally do not need to repeat these steps.
      1. Download "AMI manual annotations v1.6.2" from https://groups.inf.ed.ac.uk/ami/download/
      2. Extract the archive and copy the words directory to ./original_anotation_data.
      3. Run extract_laughter.py in the dataset directory (a rough sketch of this extraction appears after this list).
    • Gillick2021
      1. Download eval_segments.csv, balanced_train_segments.csv, and unbalanced_train_segments.csv from the AudioSet website.
      2. Download clean_distractor_annotations.csv and clean_laughter_annotations.csv from GitHub (alternatively, they are included automatically when the repository is cloned, as described below).
      3. Download the audio from YouTube. The video IDs are listed in the CSVs from GitHub, and the segment times are listed in the CSVs from AudioSet (see the Gillick2021 join sketch after this list). Alternatively, you can check the gt directory to see which audio files are needed, although the time information is not included there. For various reasons, some videos are no longer available.
      • The annotation data was created as follows. Since the ground-truth data is already included in the gt directory, you normally do not need to repeat these steps.
      1. Extract the laughter segmentation data from clean_laughter_annotations.csv and convert it to JSON format; for clips from clean_distractor_annotations.csv, simply generate an empty JSON file (see the conversion sketch after this list). See the gt directory for details.
    • Ours
      • This is our evaluation dataset, based on the Spotify Podcast Dataset. You need to download the audio.
      1. Download the audio from here, extract it, and save it to the audio directory. Make sure the audio directory contains the laugh and non_laugh subdirectories.
      • The annotation data was created manually. It contains 201 clips with laughter and 201 without. See the paper for details.
  2. Run inference with the models. Run infer.py in the models/{model_name} directory, commenting out or uncommenting the lines at the end of the file as needed. See the paper for details on each model. To run inference with the models from the previous study, you need jrgillick/laughter-detection: run git clone https://github.com/jrgillick/laughter-detection.git in the top directory (where requirements.txt is located).
  3. Run evaluate.py in this directory, commenting out or uncommenting the lines at the end of the file as needed.
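
A minimal sketch of the Petridis2013 annotation merge from step 1, assuming the VoicedLaughter, Speechlaughter, and PosedLaughter annotations were each downloaded as a separate, header-less CSV file (the input file names below are placeholders; the result is already provided in the gt directory):

```python
# Sketch of the Petridis2013 step: merge the three annotation categories into
# one header-less annotations.csv. The input file names are placeholders and
# are assumed to be header-less CSVs themselves; adjust header= if they differ.
import pandas as pd

parts = ["VoicedLaughter.csv", "Speechlaughter.csv", "PosedLaughter.csv"]
merged = pd.concat([pd.read_csv(p, header=None) for p in parts], ignore_index=True)
merged.to_csv(
    "evaluation/datasets/Petridis2013/original_anotation_data/annotations.csv",
    header=False,
    index=False,
)
```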
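
A rough sketch of the McCowan2005 laughter extraction, assuming the standard AMI NXT words files in which laughter appears as vocalsound elements with type="laugh" and starttime/endtime attributes; extract_laughter.py in the dataset directory is the script actually used to build the ground truth.

```python
# Sketch of the McCowan2005 extraction: pull laughter intervals out of the AMI
# manual-annotation words files. Assumes the NXT layout in which laughter is a
# <vocalsound type="laugh"/> element with starttime/endtime attributes;
# extract_laughter.py in the dataset directory is the authoritative script.
import glob
import os
import xml.etree.ElementTree as ET

for path in sorted(glob.glob("./original_anotation_data/words/*.words.xml")):
    laughs = []
    for elem in ET.parse(path).getroot():
        if elem.tag.endswith("vocalsound") and elem.get("type") == "laugh":
            start, end = elem.get("starttime"), elem.get("endtime")
            if start is not None and end is not None:
                laughs.append((float(start), float(end)))
    print(os.path.basename(path), laughs)
```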
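
For Gillick2021, one way to assemble the list of clips to download is to join the annotation CSVs from GitHub with the AudioSet segment CSVs. In the sketch below, the name of the column holding the YouTube ID in the GitHub CSVs (ID_COLUMN) is an assumption; check the actual headers before running it.

```python
# Sketch: join the GitHub annotation CSVs with the AudioSet segment CSVs to
# build a (YouTube ID, start, end) download list. ID_COLUMN is an assumption;
# check the real header of the GitHub CSVs and adjust it.
import csv

ID_COLUMN = "video_id"  # placeholder column name

# YTID -> (start_seconds, end_seconds) from the AudioSet segment CSVs.
segments = {}
for name in ("eval_segments.csv", "balanced_train_segments.csv",
             "unbalanced_train_segments.csv"):
    with open(name) as f:
        for row in csv.reader(f, skipinitialspace=True):
            if not row or row[0].startswith("#"):
                continue  # skip the comment/header lines
            segments[row[0]] = (float(row[1]), float(row[2]))

# Video IDs needed according to the GitHub annotation CSVs.
needed = set()
for name in ("clean_laughter_annotations.csv", "clean_distractor_annotations.csv"):
    with open(name) as f:
        for row in csv.DictReader(f):
            needed.add(row[ID_COLUMN])

with open("download_list.tsv", "w") as out:
    for ytid in sorted(needed):
        if ytid in segments:  # skip IDs missing from the AudioSet segment CSVs
            start, end = segments[ytid]
            out.write(f"{ytid}\t{start}\t{end}\n")
```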
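
The Gillick2021 ground-truth conversion could look roughly like this; the column names, the output file naming, and the JSON schema are assumptions, and the files already provided in the gt directory are authoritative, so compare against them before using the output.

```python
# Sketch of the Gillick2021 ground-truth conversion. The column names
# ("video_id", "start", "end"), the output file naming, and the JSON schema
# are assumptions; compare with the files already provided in the gt directory.
import csv
import json
import os
from collections import defaultdict

OUT_DIR = "generated_gt"  # placeholder; do not overwrite the provided gt directory
os.makedirs(OUT_DIR, exist_ok=True)

# One JSON file per clip, listing its laughter segments.
laughter = defaultdict(list)
with open("clean_laughter_annotations.csv") as f:
    for row in csv.DictReader(f):
        laughter[row["video_id"]].append(
            {"start": float(row["start"]), "end": float(row["end"])}
        )
for video_id, segs in laughter.items():
    with open(os.path.join(OUT_DIR, f"{video_id}.json"), "w") as out:
        json.dump(segs, out, indent=2)

# Distractor clips contain no laughter, so each one gets an empty JSON file.
with open("clean_distractor_annotations.csv") as f:
    for row in csv.DictReader(f):
        with open(os.path.join(OUT_DIR, f"{row['video_id']}.json"), "w") as out:
            json.dump([], out)
```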