Code for the research paper "Aligning Cross-lingual Entities with Multi-Aspect Information" (EMNLP 2019).
Our code is built on top of GCN-Align.
We stored the demo embeddings in the directory demo_embd, and ZH-EN can be evaluated as follows:
python weighted_concat.py -d demo_embd/pairwise_dump.json -g demo_embd/zh_en_graph_embd.pkl -i data/zh_en/test
bash graph.sh 0 1
bash graph.sh 0 0
We use the package relogic to derive BERT-based embeddings. Please install relogic first:
git submodule init
git submodule update
Also, because relogic evolves fast, we suggest that you check out the commit d1b5046:
cd relogic
git checkout d1b5046
Note that the argument --local_rank in relogic indicates your GPU id.
bash train_bert.sh 0 zh_en
(Note that you need to stop the training manually.)
bash eval_bert.sh 0 zh_en
python weighted_concat.py --desc relogic/saves/pair_matching/zh_en/pairwise_dump.json --graph graph_ckpt/zh_en_graph_embd.pkl --ill data/zh_en/test
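For readers who want an intuition for what the final step combines, the following is a minimal sketch of weighted concatenation of two embedding views followed by a Hits@k check. It is not the code in weighted_concat.py: the function names, the weight value 0.5, the dot-product scoring, and the toy vectors are all assumptions made for illustration, and the real script may normalize, weight, or rank differently.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def weighted_concat(desc_vec, graph_vec, weight=0.5):
    """Concatenate two normalized views, scaled by weight and (1 - weight).

    `weight` is a hypothetical parameter chosen for this sketch.
    """
    desc_part = [weight * x for x in l2_normalize(desc_vec)]
    graph_part = [(1.0 - weight) * x for x in l2_normalize(graph_vec)]
    return desc_part + graph_part


def hits_at_k(src_vecs, tgt_vecs, k=1):
    """Fraction of source entities whose counterpart (same index) is among
    the k targets with the highest dot-product score."""
    hits = 0
    for i, src in enumerate(src_vecs):
        scores = [sum(a * b for a, b in zip(src, tgt)) for tgt in tgt_vecs]
        ranked = sorted(range(len(tgt_vecs)), key=lambda j: -scores[j])
        if i in ranked[:k]:
            hits += 1
    return hits / len(src_vecs)


# Toy data: the graph view cannot tell the two entities apart,
# while the description view can.
desc_src = [[1.0, 0.0], [0.0, 1.0]]
desc_tgt = [[1.0, 0.0], [0.0, 1.0]]
graph_src = [[1.0, 0.0], [1.0, 0.0]]
graph_tgt = [[1.0, 0.0], [1.0, 0.0]]

combined_src = [weighted_concat(d, g) for d, g in zip(desc_src, graph_src)]
combined_tgt = [weighted_concat(d, g) for d, g in zip(desc_tgt, graph_tgt)]

print("graph only Hits@1:", hits_at_k(graph_src, graph_tgt, k=1))
print("combined   Hits@1:", hits_at_k(combined_src, combined_tgt, k=1))
```

In this toy setup the description view breaks the tie that the graph view alone leaves unresolved, which is the general motivation for combining several aspects of an entity before ranking alignment candidates.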
@inproceedings{yang2019aligning,
title={Aligning Cross-Lingual Entities with Multi-Aspect Information},
author={Yang, Hsiu-Wei and Zou, Yanyan and Shi, Peng and Lu, Wei and Lin, Jimmy and Sun, Xu},
booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
year={2019}
}