This repository contains the code and resources from our paper.
- bisect: BiSECT dataset of complex-simple pairs.
- metrics: Automatic evaluation metrics for the task that we used in our work.
- our_model: Code for our hybrid splitter model. Details on training the model and on generating output from the pretrained models are in the README file in that folder.
- outputs: System outputs.
- experiments: Code for running human evaluation, as well as MTurk output files.
Each folder contains a README with further details.
For the BERT-initialized Transformer baseline, you can refer to this repo. All the pretrained models are available here.
Please cite our paper if you use the above resources in your research:
@inproceedings{bisect2021,
title={BiSECT: Learning to Split and Rephrase Sentences with Bitexts},
author={Kim, Joongwon and Maddela, Mounica and Kriz, Reno and Xu, Wei and Callison-Burch, Chris},
booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year={2021}
}