> **Note:** This repository was archived by the owner on Aug 31, 2021, and is now read-only.

# Diagnostic Audio Question Answering (DAQA)

Code to generate the DAQA dataset as described in *Temporal Reasoning via Audio Question Answering*.

The generation process comprises two steps: (1) generate audio clips and descriptions; then (2) generate questions and answers.

## Setup

**Requirements.** Make sure the requirements listed in the parent directory are installed. See this tutorial for detailed instructions. You can set up a virtual environment as follows:

```sh
virtualenv -p python3 .env       # Create virtual environment
source .env/bin/activate        # Activate virtual environment
pip install -r requirements.txt  # Install dependencies
# Get things done.
deactivate                      # Exit virtual environment
```

**Audio Files.** There are two types of audio clips: (1) recorded audio; and (2) audio from AudioSet.

Download the recorded audio:

```sh
wget https://dl.fbaipublicfiles.com/daqa/daqa-audio.tar.gz
```

The YouTube IDs and metadata, as described in AudioSet, are listed in `daqa_sources.json`. Note that `daqa_sources.json` contains both the YouTube IDs and the list of recorded audio; the two are differentiated by the key `url` for YouTube IDs and `dir` for recorded audio.
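For example, entries in `daqa_sources.json` could be partitioned by source type with a short script. This is a sketch: only the `url`/`dir` keys are stated above; the assumption that the file is a JSON object mapping source names to metadata dicts is ours.

```python
import json

def partition_sources(path):
    """Split entries into AudioSet (YouTube) and recorded-audio sources.

    Assumes the file is a JSON object mapping a source name to a metadata
    dict containing either a `url` key (YouTube) or a `dir` key (recorded);
    this layout is an assumption, not documented above.
    """
    with open(path) as f:
        sources = json.load(f)
    youtube = [name for name, meta in sources.items() if "url" in meta]
    recorded = [name for name, meta in sources.items() if "dir" in meta]
    return youtube, recorded
```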

**Symlinks.** Set the following shell variables (or symlinks) to point to the generation code, the local audio files, and the output dataset directory:

```sh
daqa_gen=../daqa-gen
daqa_gen_local=../daqa-gen-local
daqa_dir=../daqa-dataset
```

## Generate Audio

The following three commands generate the audio clips for the training, validation, and test sets, respectively:

```sh
python3 $daqa_gen/generate_audio.py \
  --seed 0 \
  --num_audio 80000 \
  --set train \
  --dataset $daqa_gen/daqa.json \
  --events $daqa_gen_local/events \
  --backgrounds $daqa_gen_local/backgrounds \
  --output_audio_dir $daqa_dir/audio/train/ \
  --output_narrative_dir $daqa_dir/narratives/train/ \
  --output_narrative_file $daqa_dir/daqa_train_narratives.json

python3 $daqa_gen/generate_audio.py \
  --seed 1 \
  --num_audio 10000 \
  --set val \
  --dataset $daqa_gen/daqa.json \
  --events $daqa_gen_local/events \
  --backgrounds $daqa_gen_local/backgrounds \
  --output_audio_dir $daqa_dir/audio/val/ \
  --output_narrative_dir $daqa_dir/narratives/val/ \
  --output_narrative_file $daqa_dir/daqa_val_narratives.json

python3 $daqa_gen/generate_audio.py \
  --seed 2 \
  --num_audio 10000 \
  --set test \
  --dataset $daqa_gen/daqa.json \
  --events $daqa_gen_local/events \
  --backgrounds $daqa_gen_local/backgrounds \
  --output_audio_dir $daqa_dir/audio/test/ \
  --output_narrative_dir $daqa_dir/narratives/test/ \
  --output_narrative_file $daqa_dir/daqa_test_narratives.json
```
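After generation, a quick sanity check might compare the number of clips on disk against the requested `--num_audio` and confirm the narrative file parses. This is a sketch under assumptions: the `.wav` extension and the narrative file's internal structure are not documented above.

```python
import json
import os

def count_split(audio_dir, narrative_file):
    """Count generated audio clips and load the split's narrative file.

    The .wav extension is an assumption; adjust to the actual output format.
    """
    n_clips = sum(1 for f in os.listdir(audio_dir) if f.endswith(".wav"))
    with open(narrative_file) as f:
        narratives = json.load(f)
    return n_clips, narratives
```

For the training split above, `count_split` on `$daqa_dir/audio/train/` would be expected to report 80000 clips.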

## Generate Questions and Answers

The following three commands generate the questions and answers for the training, validation, and test sets, respectively:

```sh
python3 $daqa_gen/generate_questions_answers.py \
  --seed 0 \
  --dataset $daqa_gen/daqa.json \
  --input_narrative_file $daqa_dir/daqa_train_narratives.json \
  --set train \
  --num_questions_per_narrative 5 \
  --output_qa_file $daqa_dir/daqa_train_questions_answers_5.json

python3 $daqa_gen/generate_questions_answers.py \
  --seed 1 \
  --dataset $daqa_gen/daqa.json \
  --input_narrative_file $daqa_dir/daqa_val_narratives.json \
  --set val \
  --num_questions_per_narrative 10 \
  --output_qa_file $daqa_dir/daqa_val_questions_answers.json

python3 $daqa_gen/generate_questions_answers.py \
  --seed 2 \
  --dataset $daqa_gen/daqa.json \
  --input_narrative_file $daqa_dir/daqa_test_narratives.json \
  --set test \
  --num_questions_per_narrative 10 \
  --output_qa_file $daqa_dir/daqa_test_questions_answers.json
```
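With the settings above, the expected number of question-answer pairs per split follows directly from the narrative counts, assuming every narrative yields the full `--num_questions_per_narrative` (generation constraints may reduce this in practice):

```python
# Narrative counts and questions-per-narrative taken from the commands above.
splits = {
    "train": (80000, 5),
    "val": (10000, 10),
    "test": (10000, 10),
}
expected = {name: n * q for name, (n, q) in splits.items()}
print(expected)  # {'train': 400000, 'val': 100000, 'test': 100000}
```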

## License

Code is released under the CC-BY 4.0 license. See LICENSE for additional details.

## Citation

If you find this code useful in your research, please cite:

```bibtex
@inproceedings{fayek2019temporal,
  title = {Temporal Reasoning via Audio Question Answering},
  author = {Haytham M. Fayek and Justin Johnson},
  year = {2019},
}
```