Implementation of our paper published at EMNLP 2022.
Transfer learning is a simple and powerful method to boost the model performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static, which simply transfer the knowledge from a parent model to a child model once and for all via parameter initialization. In this paper, we instead propose a novel transfer learning method for NMT, namely ConsistTL, which can continuously transfer parent knowledge during the whole training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs the semantically-equivalent instance for the parent model, and encourages the prediction consistency between the parent and child for this instance, which is equivalent to the child model learning each instance under the guidance of the parent model.
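Concretely, for each child training instance the parent model is run on the corresponding semantically-equivalent parent-side instance, and the child is trained with its usual cross-entropy loss plus a term that pulls its output distribution toward the parent's. The PyTorch sketch below illustrates one way such a consistency term can be written (the function names, the KL-based formulation, and the weighting are illustrative assumptions, not the exact loss used in this repository):

```python
import torch
import torch.nn.functional as F

def consistency_loss(child_logits, parent_logits):
    # Hypothetical sketch: KL(parent || child) over a shared target vocabulary.
    # The parent acts as a frozen teacher, so no gradient flows into it.
    with torch.no_grad():
        parent_probs = F.softmax(parent_logits, dim=-1)
    child_log_probs = F.log_softmax(child_logits, dim=-1)
    return F.kl_div(child_log_probs, parent_probs, reduction="batchmean")

def consist_tl_loss(child_logits, target, parent_logits, consist_weight=1.0):
    # Standard NMT cross-entropy on the child instance ...
    ce = F.cross_entropy(child_logits.view(-1, child_logits.size(-1)), target.view(-1))
    # ... plus consistency with the parent's prediction on the equivalent parent-side instance.
    return ce + consist_weight * consistency_loss(child_logits, parent_logits)
```

Because this term is computed at every training step, the parent's knowledge keeps guiding the child throughout training rather than only at initialization.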
# Requirements: python>=3.7
# PyTorch does not need to be installed separately; it is installed as a dependency of the command below.
cd ConsisTL
pip install --editable .
cd ..
# download and preprocess student data
mkdir tr_en
cd tr_en
# download tr-en from https://drive.google.com/file/d/1B23gkfQ3O430KSGVRCqTLyjPO01A5e6L/view?usp=sharing
# raw tr-en can be downloaded from https://opus.nlpl.eu/download.php?f=SETIMES/v2/moses/en-tr.txt.zip
cd ..
fairseq-preprocess -s tr -t en --trainpref tr_en/pack_clean/train --validpref tr_en/pack_clean/valid --testpref tr_en/pack_clean/test --srcdict tr_en/dict.tr.txt --tgtdict tr_en/dict.en.txt --workers 10 --destdir ${STUDENT_DATA}
# download and preprocess teacher data
mkdir de_en
cd de_en
# download de-en from https://drive.google.com/file/d/15CXWVj0NIMjDjxEfPCw2WktoYADUuX8O/view?usp=sharing
cd ..
fairseq-preprocess -s de -t en --trainpref de_en/pack_clean/train --validpref de_en/pack_clean/valid --testpref de_en/pack_clean/test --joined-dictionary --destdir ${TEACHER_DATA} --workers 10
cd full_process_scripts
# train two parent models
## train for en-de (the reversed teacher, used later to generate synthetic parent-side sources)
### path of binarized parent model training data
BIN_TEACHER_DATA=${BIN_TEACHER_DATA}
bash train_parent.sh en de $BIN_TEACHER_DATA
## train for de-en (the teacher used by ConsisTL)
bash train_parent.sh de en $BIN_TEACHER_DATA
# generate synthetic de-en data for tr-en: back-translate the English side of the child data with the reversed (en-de) teacher to obtain semantically-equivalent parent-side instances
## English sentences in child data
CHILD_EN=${CHILD_EN}
## path of trained reversed teacher checkpoint
REVERSED_TEACHER_CHECKPOINT=${REVERSED_TEACHER_CHECKPOINT}
## path of the binarized auxiliary (synthetic) source data
AUX_SRC_BIN=${AUX_SRC_BIN}
bash gen.sh $CHILD_EN $BIN_TEACHER_DATA $REVERSED_TEACHER_CHECKPOINT $AUX_SRC_BIN
# switch checkpoint: map the parent checkpoint's source dictionary to the child's source dictionary
## path of initialized checkpoint
INIT_CHECKPOINT=${INIT_CHECKPOINT}
## path of binarized student data
BIN_STUDENT_DATA=${BIN_STUDENT_DATA}
## path of teacher checkpoint
TEACHER_CHECKPOINT=${TEACHER_CHECKPOINT}
python ../ConsisTL/preprocessing_scripts/TM.py --checkpoint $TEACHER_CHECKPOINT --output $INIT_CHECKPOINT --parent-dict $BIN_TEACHER_DATA/dict.de.txt --child-dict $BIN_STUDENT_DATA/dict.tr.txt --switch-dict src
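For intuition, the `--switch-dict src` step rewrites the parent checkpoint so that its source-embedding table is indexed by the child dictionary. A common way to do this is to copy the rows of tokens shared between the two dictionaries and randomly initialize the rest; the sketch below is an illustrative assumption about that mapping (function name, `emb_key`, and the copy-or-random-init policy are hypothetical), not the actual TM.py script:

```python
import torch

def switch_source_embeddings(parent_state, parent_dict, child_dict, emb_key):
    # Illustrative only: parent_dict / child_dict map token -> row index.
    parent_emb = parent_state[emb_key]                 # (|V_parent|, d)
    # Start from random rows for the child vocabulary (assumed policy).
    child_emb = torch.normal(
        mean=0.0, std=parent_emb.std().item(),
        size=(len(child_dict), parent_emb.size(1)),
    )
    # Copy rows for tokens that also exist in the parent dictionary.
    for token, child_idx in child_dict.items():
        parent_idx = parent_dict.get(token)
        if parent_idx is not None:
            child_emb[child_idx] = parent_emb[parent_idx]
    parent_state[emb_key] = child_emb                  # now indexed by the child vocabulary
    return parent_state
```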
# train with TM-TL (the child model is initialized from the switched parent checkpoint and trained normally)
bash train.sh $STUDENT_DATA $INIT_CHECKPOINT
# train with ConsisTL (the parent model additionally guides the child throughout training)
bash ConsisTL.sh $PREFIX-bin $TEACHER_CHECKPOINT $TEACHER_DATA $STUDENT_DATA $INIT_CHECKPOINT