This repository is a fork of Sun et al.'s implementation of four knowledge graph embedding models. Here we apply these models to a biomedical knowledge graph called MIND (MechRepoNet with DrugCentral indications). We report the results of our analysis in this preprint.
- Added code to output raw embeddings in order to extract predictions. This can be seen with the `--do_predict` flag in `codes/run.py`.
- Added a `Notebooks` folder that encapsulates the analysis done on the MIND dataset.
- Added methods in `Notebooks/score_utils.py` to process and translate raw embeddings into human-readable entities and relations (a short sketch of this workflow follows this list).
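For orientation, here is a minimal sketch of how the raw embeddings might be loaded and mapped back to human-readable names. It assumes the original repository's conventions (embeddings saved as `entity_embedding.npy`/`relation_embedding.npy` in the model output folder and tab-separated `entities.dict`/`relations.dict` files in the data folder); the paths below are placeholders, and the actual translation and scoring helpers live in `Notebooks/score_utils.py`.

```python
# Minimal sketch (not the score_utils.py implementation): load raw embeddings
# and map row indices back to human-readable entity/relation names.
# Assumes the original repo's layout: <model_dir>/entity_embedding.npy and
# tab-separated "<id>\t<name>" dictionary files in <data_dir>. Paths are placeholders.
import numpy as np

model_dir = "models/RotatE_MIND_0"  # placeholder output folder
data_dir = "data/MIND"              # placeholder data folder

entity_emb = np.load(f"{model_dir}/entity_embedding.npy")
relation_emb = np.load(f"{model_dir}/relation_embedding.npy")

def load_dict(path):
    """Read a '<id>\t<name>' mapping file into {id: name}."""
    with open(path) as f:
        return {int(idx): name for idx, name in
                (line.strip().split("\t") for line in f)}

id2entity = load_dict(f"{data_dir}/entities.dict")
id2relation = load_dict(f"{data_dir}/relations.dict")

print(entity_emb.shape, relation_emb.shape)  # (n_entities, dim), (n_relations, dim)
print(id2entity[0], id2relation[0])          # human-readable names for row 0
```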
- Please see the original PyTorch implementation's instructions.
- Download the MIND dataset to `./data`.
- Install the requirements into a Python virtual environment:

  ```bash
  # run in shell
  mamba create -f environment.yml
  mamba activate kge
  ```
- Train/Test:

  ```bash
  # run in shell
  bash run.sh train <model_name> <dataset_name> <gpu_num> <folder_out_name> <batch_size> <neg_sample_size> <dimensions> <gamma> <alpha> <learning_rate> <test_batch_size> <double_entities_emb> <double_relation_emb> <regularization>

  # or in python; for more parameters, please see codes/run.py lines 23-72
  python run.py --{do_train, do_valid, do_test, do_predict} --data_path <where/data/is> --model {TransE, DistMult, ComplEx, RotatE}
  ```
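As a concrete illustration, a `run.sh` call following the argument order above might look like the sketch below. Every value here is a placeholder chosen for illustration, not a setting used in the preprint; the trailing `-de` simply fills the `<double_entities_emb>` slot.

```bash
# Illustrative placeholder values only; see codes/run.py lines 23-72 for the
# full set of options and choose hyperparameters appropriate to MIND.
bash run.sh train RotatE MIND 0 rotate_mind_v1 1024 256 1000 24.0 1.0 0.0001 16 -de
```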
If you use this code, please cite the original paper by Sun et al.:
```
@inproceedings{sun2018rotate,
  title     = {RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space},
  author    = {Zhiqing Sun and Zhi-Hong Deng and Jian-Yun Nie and Jian Tang},
  booktitle = {International Conference on Learning Representations},
  year      = {2019},
  url       = {https://openreview.net/forum?id=HkgEQnRqYQ},
}
```