Language generation models often differ in configuration, such as vocabulary, tokenization, and generation order, so they cannot simply be ensembled. Twist decoding combines such models at inference time, regardless of these differences and without any additional training or finetuning.
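For intuition, here is a minimal sketch of the idea, not the repository's implementation: the two models `f` and `g` take turns decoding, and each pass regularizes beam search toward the other model's latest output. `beam_search` is a hypothetical callable (standard beam search plus a distance term to a guide sequence, weighted by `lmd`); the exact update schedule in `twist/generate_twist.py` may differ.

```python
# Minimal, illustrative sketch of Twist decoding (NOT the repository's
# implementation; see twist/generate_twist.py for the real one).
# `beam_search` is a hypothetical callable: standard beam search whose
# hypothesis scores are regularized by a distance to `guide`, weighted by
# `lmd` (cf. the --lmd-f / --lmd-g options in the commands below).
def twist_decode(beam_search, f, g, src, max_updates, lmd_f, lmd_g):
    y = beam_search(f, src, guide=None, lmd=0.0)     # initial pass with f alone
    for _ in range(max_updates):
        y = beam_search(g, src, guide=y, lmd=lmd_g)  # g decodes toward f's output
        y = beam_search(f, src, guide=y, lmd=lmd_f)  # f decodes toward g's output
    return y
```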
We forked the fairseq library and incorporated the distance terms into its beam search implementation. Twist decoding can be incorporated into any implementation of beam search, but here we provide the codebase that we used for our paper. To run experiments, follow the fairseq instructions and run the following in this repository:
cd fairseq
pip install --editable .
python setup.py build_ext --inplace
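To sanity-check the installation (optional; this assumes the editable install above succeeded), you can print the installed fairseq version:
python -c "import fairseq; print(fairseq.__version__)"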
Any fairseq sequence-to-sequence model should work, but here we provide all the models we used in our experiments. See our paper for the training details.
| Models | | | | |
|---|---|---|---|---|
| DE-EN Generic¹ | DE-EN Medicine | DE-EN Law | DE-EN Koran | DE-EN Subtitles |
| ZH-EN L2R | ZH-EN R2L | EN-DE L2R | EN-DE R2L | |
| SciTLDR Abstract² | SciTLDR AIC² | | | |

¹: The WMT19 top-performing model. Downloaded from the fairseq repository.
²: Downloaded from the official repository of the SciTLDR dataset (Cachola et al., 2020).
| Datasets | | | | | |
|---|---|---|---|---|---|
| DE-EN Medicine³ | DE-EN Law³ | DE-EN Koran³ | DE-EN Subtitles³ | WMT20 ZH-EN⁴ | WMT20 EN-DE⁴ |

³: Downloaded from the official repository of Hu et al. (2019).
⁴: Downloaded from the official repository of the bidimensional leaderboards (Kasai et al., 2022).
Here are some example commands for machine translation. Run Twist decoding with `f`=Domain and `g`=Generic in the medical domain; the two models are separated by a colon in the options (`f:g`). Run Moses detokenization afterwards.
cd fairseq/
python twist/generate_twist.py --model-dirs <PATH>/trans-base_medicine-de-en/:<PATH>/wmt19.de-en.joined-dict/ --model-names model.pt:model.pt --out-file mt/domains/medicine/output/test.twist --r2l 0:0 --src-lang de --tgt-lang en --in-file mt/domains/medicine/src/emea-test.tok.de --batch-size 20 --max-updates 3 --lmd-g 0.3 --lmd-f 0.1
perl <PATH>/mosesdecoder/scripts/tokenizer/detokenizer.perl -l en < mt/domains/medicine/output/test.twist_update-2.out > mt/domains/medicine/output/test.twist_update-2.txt
Run Twist decoding with `f`=Generic and `g`=Domain in the legal domain.
python twist/generate_twist.py --model-dirs <PATH>/wmt19.de-en.joined-dict/:<PATH>/trans-base_law-de-en/ --model-names model.pt:model.pt --out-file mt/domains/law/output/test.twist --r2l 0:0 --src-lang de --tgt-lang en --in-file mt/domains/law/src/acquis-test.tok.de --batch-size 20 --max-updates 3 --lmd-g 3.0 --lmd-f 0.1
Run the reranking baseline.
python twist/generate_rerank.py --model-dirs <PATH>/trans-base_medicine-de-en/:<PATH>/wmt19.de-en.joined-dict/ --model-names model.pt:model.pt --out-file mt/domains/medicine/output/test.rerank.out --r2l 0:0 --src-lang de --tgt-lang en --in-file mt/domains/medicine/src/emea-test.tok.de --batch-size 20
Run Twist decoding with an R2L (right-to-left) `f` and an L2R (left-to-right) `g` on WMT20 ZH-EN. The command is similar, but we pass the `--r2l` option to mark which models decode right to left (`1:0` here marks `f` as R2L).
python twist/generate_twist.py --model-dirs <PATH>/trans-large-r2l_wmt20-zh-en/:<PATH>/trans-large-l2r_wmt20-zh-en/ --model-names model.pt:model.pt --out-file mt/wmt/zh-en/output/test.twist --r2l 1:0 --src-lang zh --tgt-lang en --in-file mt/wmt/zh-en/src/newstest2020.zh-en.src.tok.zh --max-updates 3 --lmd-g 3.0 --lmd-f 0.1 --batch-size 20
Here are some example commands for summarization on SciTLDR. Run Twist decoding with `f`=AIC (abstract, introduction, and conclusion) and `g`=Abstract.
python twist/generate_twist_tldr.py --checkpoint-dirs <PATH>/scitldr_catts-xsum.tldr-aic/:<PATH>/scitldr_bart.tldr-ao/ --data-dirs summ/scitldr/SciTLDR-AIC/ctrl:summ/scitldr/SciTLDR-A/ctrl --checkpoint-files scitldr_catts-xsum.tldr-aic.pt:scitldr_bart.tldr-ao.pt --max-updates 3 --batch-size 1 --split test --beam 5 --lmd-g 3.0 --lmd-f 0.3 --out-file summ/scitldr/output/test.twist
Run the reranking baseline.
python twist/generate_rerank_tldr.py --checkpoint-dirs <PATH>/scitldr_catts-xsum.tldr-aic/:<PATH>/scitldr_bart.tldr-ao --data-dirs summ/scitldr/SciTLDR-AIC/ctrl:summ/scitldr/SciTLDR-A/ctrl --checkpoint-files scitldr_catts-xsum.tldr-aic.pt:scitldr_bart.tldr-ao.pt --batch-size 1 --split test --beam 5 --out-file summ/scitldr/output/test.rerank.txt
Lastly, we provide evaluation tools: COMET for machine translation and ROUGE for summarization. Use the sacrebleu library to measure BLEU scores. For example:
cd eval/COMET/
bash run.sh ../../fairseq/mt/domains/medicine/src/emea-test.de ../../fairseq/mt/domains/medicine/output/test.twist_update-2.txt ../../fairseq/mt/domains/medicine/tgt/emea-test.en.jsonl ../../fairseq/mt/domains/medicine/output/test.twist_update-2.comet
cd fairseq/
sacrebleu mt/domains/medicine/tgt/emea-test.en -i mt/domains/medicine/output/test.twist_update-2.txt -m bleu -b -w 4 -l de-en
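The same BLEU score can also be computed through the sacrebleu Python API. A minimal sketch, mirroring the CLI call above (the file paths are taken from that example):

```python
# Compute BLEU with the sacrebleu Python API, mirroring the CLI call above.
import sacrebleu

with open("mt/domains/medicine/output/test.twist_update-2.txt") as f:
    hyps = [line.rstrip("\n") for line in f]
with open("mt/domains/medicine/tgt/emea-test.en") as f:
    refs = [line.rstrip("\n") for line in f]

# corpus_bleu expects the hypotheses and a list of reference streams.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
print(round(bleu.score, 4))
```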
cd eval/ROUGE/
bash run.sh ../../fairseq/summ/scitldr/output/test.twist_update-2.txt ../../fairseq/summ/scitldr/output/test.twist_update-2.txt ../../fairseq/summ/scitldr/tgt/test_refs.jsonl ../../fairseq/summ/scitldr/output/test.twist_update-2.rougeL rougeL
If you use this codebase in your research, please cite our paper:

@misc{kasai2022twist,
author = {Jungo Kasai and
Keisuke Sakaguchi and
Ronan Le Bras and
Hao Peng and
Ximing Lu and
Dragomir Radev and
Yejin Choi and
Noah A. Smith},
title = {Twist Decoding: Diverse Generators Guide Each Other},
year = {2022},
url = {https://arxiv.org/abs/2205.09273},
}