Code for ACL 2021 paper "Controllable Open-ended Question Generation with A New Question Type Ontology".

launchnlp/open-ended_question_ontology

 
 

Controllable Open-ended Question Generation with A New Question Type Ontology

Raw data

Question type annotation

Reddit

Our Yahoo dataset is based on the Yahoo Answers L6 dataset. After obtaining a license for the L6 dataset, please email Shuyang (caoshuy@umich.edu) with proof of the license attached to obtain the Yahoo dataset.

Data preprocessing

Preprocessed binarized Reddit data can be downloaded from here.

For data preprocessing, please refer to the README in data_preprocess.


Run our models

Our experiments are based on PyTorch 1.7.0 and Fairseq at commit 0db28cd. Newer versions of Fairseq might also work.

Please download the generation models from here and put them under $MODEL/generation_models. The binarized dataset should be under $DATA/binarized_data.
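
The layout described above can be prepared ahead of time. The following is a minimal sketch, not part of the repository: the default root paths (`$PWD/models` and `$PWD/data`) are placeholders chosen for illustration, and `$DATA/output` is assumed only because the generation commands in this README write there.

```shell
# Placeholder roots (assumptions); keep any values already set in the environment.
export MODEL="${MODEL:-$PWD/models}"
export DATA="${DATA:-$PWD/data}"

# Create the directories this README refers to.
mkdir -p "$MODEL/generation_models" "$DATA/binarized_data" "$DATA/output"

# Report which directories still need the downloaded files copied in.
for dir in "$MODEL/generation_models" "$DATA/binarized_data"; do
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty: $dir (place the downloaded files here)"
    else
        echo "ok: $dir"
    fi
done
```

Exporting absolute paths means the variables stay valid after changing into gen_scripts or train_scripts.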

To convert the fairseq generation output to text, use convert_output.py:

python convert_output.py --generate-dir <result_dir>

JointGen

cd gen_scripts
./jointgen.sh $DATA/output/jointgen

ExplGen

cd gen_scripts
./explgen.sh $DATA/output/explgen

TplGen

cd gen_scripts
./tplgen_question_generation.sh $DATA/output/tplgen_question

ExplGen: conditioned on top 9 types

cd gen_scripts
./explgen_9types.sh $DATA/output/explgen_9types

TplGen: conditioned on top 9 types

cd gen_scripts
./tplgen_question_generation_9types.sh $DATA/output/tplgen_question_9types
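
To run all five generation scripts in one pass, a small wrapper along these lines may be convenient. This is a sketch under stated assumptions: `run_all_generation` is a hypothetical helper name, it is meant to be called from the repository root, and `$DATA` must already be set. The script names and output directories are the ones listed above.

```shell
# Hypothetical helper: run each generation script named in this README,
# writing to the same $DATA/output/<name> directories used above.
run_all_generation() {
    for pair in \
        "jointgen.sh:jointgen" \
        "explgen.sh:explgen" \
        "tplgen_question_generation.sh:tplgen_question" \
        "explgen_9types.sh:explgen_9types" \
        "tplgen_question_generation_9types.sh:tplgen_question_9types"
    do
        script="${pair%%:*}"   # text before the colon: script file name
        name="${pair##*:}"     # text after the colon: output directory name
        echo "running $script -> $DATA/output/$name"
        # The subshell keeps the caller's working directory unchanged.
        (cd gen_scripts && "./$script" "$DATA/output/$name") || return 1
    done
}
```

Calling `run_all_generation` runs the scripts in the listed order and stops at the first one that fails.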

Train our models

Please set BART_PATH as the path to the bart.large model, which can be downloaded here.

export BART_PATH=<path_to_bart_large_dir>/model.pt
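
Because every training script takes `$BART_PATH` as its first argument, it can help to confirm the checkpoint file exists before launching a multi-GPU job. The following is a minimal sketch; `check_checkpoint` is a hypothetical helper, not something provided by the repository or by Fairseq.

```shell
# Hypothetical helper: succeed only if the given path is a non-empty regular file.
check_checkpoint() {
    if [ -f "$1" ] && [ -s "$1" ]; then
        echo "found checkpoint: $1"
    else
        echo "missing or empty checkpoint: $1" >&2
        return 1
    fi
}

# Check the variable exported above; an unset BART_PATH is treated as missing.
check_checkpoint "${BART_PATH:-}" || echo "set BART_PATH before running the training scripts"
```
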

JointGen

cd train_scripts
CUDA_VISIBLE_DEVICES=0,1 ./jointgen.sh $BART_PATH $MODEL/jointgen

ExplGen

cd train_scripts
CUDA_VISIBLE_DEVICES=0,1 ./explgen.sh $BART_PATH $MODEL/explgen

TplGen: template generation

cd train_scripts
CUDA_VISIBLE_DEVICES=0,1 ./tplgen_template_generation.sh $BART_PATH $MODEL/tplgen_template_generation

TplGen: question generation

cd train_scripts
CUDA_VISIBLE_DEVICES=0,1 ./tplgen_question_generation.sh $BART_PATH $MODEL/tplgen_question_generation
