📰 Introduction | 🏗️ Architecture | ⚙️ Installation | 🚀 Quick Start | 📦 Datasets | 📖 Usage | 🔬 Reproducibility | 📜 License | 📬 Contact | 📑 Citation
| 🆕 Updates | 📅 Date | 📝 Description |
|---|---|---|
| 🎉 Code Release | - | HyDRA codebase and datasets now available |
Temporal Knowledge Graph Alignment in the Wild (TKGA-Wild) addresses a critical challenge in temporal knowledge graph integration. To the best of our knowledge, this is the first work to formally formulate and solve this problem. The task is difficult because of two properties that are common in real-world scenarios: Multi-Scale Temporal Elements (i.e., multi-granular temporal coexistence and temporal span disparity) and Asymmetric Temporal Structures (i.e., heterogeneous temporal structures and temporal structural incompleteness).
We introduce complete, high-quality TKGA-Wild benchmarks and propose HyDRA, a new paradigm based on multi-scale hypergraph retrieval-augmented generation that systematically addresses these challenges. HyDRA captures complex structural dependencies, models multi-granular temporal features, mitigates temporal disparities, and introduces a scale-weave synergy mechanism to coordinate information across different temporal scales.
| Feature | Icon | Description |
|---|---|---|
| Multi-Granularity Temporal Encoding | 🔄 | Captures temporal information at different scales (year, month, day) |
| Scale-Adaptive Entity Projection | 📐 | Adaptive entity projection across different graph scales and dimensions |
| Multi-Scale Hypergraph Retrieval | 🔍 | Efficient neural retrieval for hypergraph-based search |
| Scale-Weave Synergy | 🔗 | Coordinates information across different temporal scales |
| State-of-the-Art Performance | 📈 | Consistently outperforms 28 competitive baselines, with up to 43.3% improvement in Hits@1 |
HyDRA adopts a multi-scale hypergraph retrieval-augmented generation paradigm, comprising several key stages:
Stage 1: Encoding and Integration 🔄
Stage 2: Scale-Adaptive Entity Projection 📐
Stage 3: Multi-Scale Hypergraph Retrieval 🔍
Stage 4: Multi-Scale Fusion 🔗
📖 For detailed architecture descriptions and theoretical foundations, refer to the accompanying paper.
First, install dependencies:
pip install -r requirements.txt
| Package | Version | Purpose |
|---|---|---|
| 🐍 Python | >= 3.7 | Core language (tested on 3.8.10) |
| 🔥 PyTorch | >= 1.10.0 | Deep learning framework |
| 🔍 Faiss | >= 1.7.0 | Efficient similarity search (CPU/GPU) |
| 📊 NumPy | >= 1.21.0 | Numerical computing |
| 🐼 Pandas | >= 1.3.0 | Data manipulation |
| ⏳ Tqdm | >= 4.62.0 | Progress bars |
| 🌐 NetworkX | >= 2.6.0 | Graph analysis |
💡 Note: For GPU-accelerated FAISS, use `faiss-gpu` instead of `faiss-cpu`.
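The minimum versions in the table above can be checked before running the pipeline. The following is a minimal sketch, not part of the HyDRA codebase: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, the PyPI distribution names are assumptions, and `importlib.metadata` requires Python 3.8 or later. Faiss is left out of the default mapping because its distribution name depends on whether `faiss-cpu` or `faiss-gpu` is installed.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements table.
# The distribution names used as keys are assumptions.
MINIMUMS = {
    "torch": "1.10.0",
    "numpy": "1.21.0",
    "pandas": "1.3.0",
    "tqdm": "4.62.0",
    "networkx": "2.6.0",
}


def parse_version(text):
    """Turn a string such as '1.10.0+cu113' into the tuple (1, 10, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding with zeros."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment(minimums=MINIMUMS):
    """Return a {package: status} report for every listed requirement."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, `"1.9.1"` sorts after `"1.10.0"`, which would wrongly accept an outdated PyTorch.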
For our newly proposed TKGA-Wild scenario, we introduce two novel benchmark datasets: BETA and WildBETA.
| Dataset | Description | Fact Size |
|---|---|---|
| BETA | Benchmark dataset for TKGA-Wild | 362K+ |
| WildBETA | Extended benchmark dataset for TKGA-Wild | 563K+ |
🔐 Baidu Netdisk: Extraction Code: `pnax` | Password: `tkgawild`
Dataset Format:
Taking the dataset icews_wiki as an example, the folder `data/icews_wiki/` should contain:
- `ent_ids_1`: Entity IDs in the source KG
- `ent_ids_2`: Entity IDs in the target KG
- `triples_1`: Relation triples encoded by IDs in the source KG
- `triples_2`: Relation triples encoded by IDs in the target KG
- `rel_ids_1`: Relation IDs in the source KG
- `rel_ids_2`: Relation IDs in the target KG
- `time_id`: Time IDs in the source KG and the target KG
- `ref_ent_ids`: All aligned entity pairs, as a list of pairs like (e_s \t e_t)
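The file layout above can be read with a few lines of Python. This is a minimal sketch under stated assumptions, not the loader used by HyDRA: it assumes every file is plain UTF-8 text with tab-separated fields, that the ID files hold one `id<TAB>name` entry per line, and that the triple files hold integer IDs (possibly followed by extra time-ID columns). The function names are hypothetical.

```python
import os


def read_id_map(path):
    """Read lines of the assumed form 'id<TAB>name' into a {id: name} dict."""
    mapping = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            key, _, name = line.partition("\t")
            mapping[int(key)] = name
    return mapping


def read_int_rows(path):
    """Read whitespace-separated integer rows, e.g. triples or aligned pairs."""
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                rows.append(tuple(int(field) for field in fields))
    return rows


def load_dataset(folder):
    """Load every file listed in the dataset format into one dictionary."""
    data = {}
    for name in ("ent_ids_1", "ent_ids_2", "rel_ids_1", "rel_ids_2", "time_id"):
        data[name] = read_id_map(os.path.join(folder, name))
    for name in ("triples_1", "triples_2", "ref_ent_ids"):
        data[name] = read_int_rows(os.path.join(folder, name))
    return data
```

Under these assumptions, `load_dataset("data/icews_wiki")` would return the ID dictionaries together with lists of integer tuples, and `data["ref_ent_ids"]` would contain one `(e_s, e_t)` pair per aligned entity.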
Note: The representative datasets used in experiments are derived from Dual-AMN, JAPE, GCN-Align, BETA, DAEA, AGROLD, DOREMUS and related works.
git clone https://github.com/eduzrh/HyDRA.git
cd HyDRA
Download and extract datasets to ./data/
python HyDRA_main.py --data_dir data/WildBETA
| Metric | Description |
|---|---|
| Hits@1 | Proportion of correct alignments ranked first |
| Hits@10 | Proportion in top-10 candidates |
| MRR | Mean Reciprocal Rank |
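The three metrics in the table above follow directly from the rank that each test entity's correct counterpart receives in the candidate list. The sketch below shows the standard definitions. The function name `alignment_metrics` is hypothetical, and this is not the evaluation code shipped in the repository.

```python
def alignment_metrics(ranks, ks=(1, 10)):
    """Compute Hits@k and MRR from 1-based ranks of the correct alignments.

    ranks[i] is the position of the true target entity in the ranked
    candidate list of the i-th test source entity (1 means ranked first).
    """
    if not ranks:
        raise ValueError("ranks must contain at least one entry")
    if any(rank < 1 for rank in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")

    total = len(ranks)
    metrics = {}
    for k in ks:
        # Hits@k: share of test entities whose true match is in the top k.
        metrics[f"Hits@{k}"] = sum(rank <= k for rank in ranks) / total
    # MRR: mean of the reciprocal ranks.
    metrics["MRR"] = sum(1.0 / rank for rank in ranks) / total
    return metrics


if __name__ == "__main__":
    print(alignment_metrics([1, 2, 10, 20]))
```

For the ranks `[1, 2, 10, 20]`, one of four entities is ranked first (Hits@1 = 0.25), three of four fall within the top 10 (Hits@10 = 0.75), and the MRR is (1 + 0.5 + 0.1 + 0.05) / 4, roughly 0.4125.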
Run complete pipeline:
python HyDRA_main.py --data_dir data/WildBETA
Configure training parameters:
python HyDRA_main.py --data_dir data/WildBETA \
--cuda 0 \
--epochs 1500 \
--max_iterations 5 \
--min_kg1_entities 100
Parameter Descriptions:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `--data_dir` | str | Required | Path to dataset directory |
| `--cuda` | int | 0 | CUDA device ID for training |
| `--epochs` | int | 500 | Number of training epochs for encoding stage |
| `--max_iterations` | int | 3 | Maximum pipeline iterations |
| `--min_kg1_entities` | int | 50 | Minimum entities threshold for stopping |
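For readers who want to wrap or script the pipeline, the options above map naturally onto an `argparse` declaration. The following sketch mirrors the documented names, types, and defaults. It is an illustration of how such an interface could be declared, not a copy of the parser in `HyDRA_main.py`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(
        description="Sketch of the HyDRA command-line options"
    )
    parser.add_argument("--data_dir", type=str, required=True,
                        help="Path to dataset directory")
    parser.add_argument("--cuda", type=int, default=0,
                        help="CUDA device ID for training")
    parser.add_argument("--epochs", type=int, default=500,
                        help="Number of training epochs for encoding stage")
    parser.add_argument("--max_iterations", type=int, default=3,
                        help="Maximum pipeline iterations")
    parser.add_argument("--min_kg1_entities", type=int, default=50,
                        help="Minimum entities threshold for stopping")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(["--data_dir", "data/WildBETA", "--epochs", "1500"])
print(vars(args))
```

Options that are not passed keep the defaults from the table, so in this example `epochs` becomes 1500 while `max_iterations` stays at 3 and `min_kg1_entities` at 50.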
HyDRA supports multi-granularity temporal modeling (year and month levels) to handle Multi-Granular Temporal Coexistence. This feature can be enabled through the encoding stage configuration.
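To make multi-granular temporal coexistence concrete, the sketch below indexes timestamps of mixed precision at both the year and the month level. It is an illustration only and does not reproduce HyDRA's encoder: the timestamp format (`YYYY`, `YYYY-MM`, or `YYYY-MM-DD`, common-era dates only) and the helper names are assumptions.

```python
def temporal_keys(timestamp):
    """Split a timestamp into a year-level key and, if present, a month-level key."""
    parts = timestamp.split("-")
    year = int(parts[0])
    # A year-only timestamp has no month-level key.
    month = (year, int(parts[1])) if len(parts) > 1 else None
    return {"year": year, "month": month}


def build_time_vocab(timestamps):
    """Assign consecutive IDs to every distinct year and (year, month) pair."""
    year_ids, month_ids = {}, {}
    for timestamp in timestamps:
        keys = temporal_keys(timestamp)
        year_ids.setdefault(keys["year"], len(year_ids))
        if keys["month"] is not None:
            month_ids.setdefault(keys["month"], len(month_ids))
    return year_ids, month_ids


if __name__ == "__main__":
    years, months = build_time_vocab(["2014", "2014-03", "2014-03-15", "2015-01"])
    print(years)
    print(months)
```

With this scheme, a fact dated `2014` and a fact dated `2014-03-15` share the same year-level ID even though only the second has a month-level ID, which is one simple way for facts recorded at different granularities to remain comparable.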
We are committed to ensuring full reproducibility of our results. The following resources are provided:
- Hyperparameters: All hyperparameter settings are documented in the code and can be configured via command-line arguments
- Random Seeds: Seed configurations are embedded in the training scripts for reproducibility
- Environment: Tested on Python 3.8.10 with dependencies as specified in `requirements.txt`
To reproduce the main experimental results reported in the paper:
- Download datasets following the format described in the Datasets section
- Run the complete pipeline with default settings:
python HyDRA_main.py --data_dir data/WildBETA
- Evaluate results using the output files in `data/icews_wiki/message_pool/`
The codebase is organized into modular components for clarity:
- `encoding_and_integration/`: Multi-granularity temporal entity encoding and integration
- `scale_adaptive_entity_projection/`: Relation alignment and entity projection
- `multi_scale_hypergraph_retrieval/`: Neural retrieval and hypergraph decomposition
- `multi_scale_fusion/`: Multi-scale fusion and alignment refinement
- `HyDRA_main.py`: Main pipeline orchestrator
- Comprehensive inline code comments explaining key design decisions
- Clear module structure with standardized naming conventions
- This README with step-by-step usage instructions
We employ standard knowledge graph alignment metrics for transparency and comparability:
- Hits@1: Proportion of correct alignments ranked first
- Hits@10: Proportion of correct alignments in top-10 candidates
- MRR (Mean Reciprocal Rank): Average reciprocal rank of correct alignments
MIT License - Copyright notices preserved.
- Email: runhaozhao@nudt.edu.cn
- GitHub Issues: For technical concerns, create an Issue in the GitHub repository. Labels: `bug`, `enhancement`, `question`.

Responses are targeted within 2-3 business days.
If you find this work helpful for your research or applications, we would appreciate it if you could cite the following paper:
@article{DBLP:journals/corr/abs-2507-14475,
author = {Runhao Zhao and
Weixin Zeng and
Wentao Zhang and
Xiang Zhao and
Jiuyang Tang and
Lei Chen},
title = {Towards Temporal Knowledge Graph Alignment in the Wild},
journal = {CoRR},
volume = {abs/2507.14475},
year = {2025},
url = {https://doi.org/10.48550/arXiv.2507.14475},
doi = {10.48550/ARXIV.2507.14475},
eprinttype = {arXiv},
eprint = {2507.14475}
}

- Unsupervised Entity Alignment for Temporal Knowledge Graphs. Xiaoze Liu, Junyang Wu, Tianyi Li, Lu Chen, and Yunjun Gao. Proceedings of the ACM Web Conference (WWW), 2023.
- BERT-INT: A BERT-based Interaction Model for Knowledge Graph Alignment. Xiaobin Tang, Jing Zhang, Bo Chen, Yang Yang, Hong Chen, and Cuiping Li. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2020.
- Benchmarking Challenges for Temporal Knowledge Graph Alignment. Weixin Zeng, Jie Zhou, and Xiang Zhao. Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), 2024.
- Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks. Zhichun Wang, Qingsong Lv, Xiaohan Lan, and Yu Zhang. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
- Boosting the Speed of Entity Alignment 10×: Dual Attention Matching Network with Normalized Hard Sample Mining. Xin Mao, Wenting Wang, Yuanbin Wu, and Man Lan. Proceedings of the Web Conference (WWW), 2021.
- Wikidata: A Free Collaborative Knowledgebase. Denny Vrandecic and Markus Krötzsch. Communications of the ACM, 2014.
- Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets. Xuhui Jiang, Chengjin Xu, Yinghan Shen, Yuanzhuo Wang, Fenglong Su, Zhichao Shi, Fei Sun, Zixuan Li, Jian Guo, and Huawei Shen. Proceedings of the ACM Web Conference (WWW), 2024.
- Unlocking the Power of Large Language Models for Entity Alignment. Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, and Yuanzhuo Wang. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
- Bootstrapping Entity Alignment with Knowledge Graph Embedding. Zequn Sun, Wei Hu, Qingheng Zhang, and Yuzhong Qu. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2018.
- NetworkX: Network Analysis in Python. NetworkX Developers. GitHub Repository.
- Faiss: A Library for Efficient Similarity Search and Clustering of Dense Vectors. Facebook Research. GitHub Repository.
- DAEA: Enhancing Entity Alignment in Real-World Knowledge Graphs Through Multi-Source Domain Adaptation. Linyan Yang, Shiqiao Zhou, Jingwei Cheng, Fu Zhang, Jizheng Wan, Shuo Wang, and Mark Lee. Proceedings of the International Conference on Computational Linguistics (COLING), 2025.
- TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs. Julia Gastinger, Shenyang Huang, Mikhail Galkin, Erfan Loghmani, Ali Parviz, Farimah Poursafaei, Jacob Danovitch, Emanuele Rossi, Ioannis Koutis, Heiner Stuckenschmidt, Reihaneh Rabbany, and Guillaume Rabusseau. NeurIPS Track on Datasets and Benchmarks, 2024.
The following open source projects were partially referenced in this work. We sincerely appreciate their contributions:
Dual-AMN, JAPE, GCN-Align, Simple-HHEA, BETA, Dual-Match, Faiss, NetworkX, AdaCoAgentEA, DAEA, AGROLD, DOREMUS
This repository corresponds to the paper Towards Temporal Knowledge Graph Alignment in the Wild (under review at IEEE TPAMI), and is an extension of our previous work BETA.