✨ Towards Temporal Knowledge Graph Alignment in the Wild ✨

—————— Under Review at IEEE TPAMI ——————

📰 Latest News

🆕 Updates	📅 Date	📝 Description
🎉 Code Release	-	HyDRA codebase and datasets now available

📰 Introduction

Temporal Knowledge Graph Alignment in the Wild (TKGA-Wild) addresses a critical challenge in temporal knowledge graph integration. To the best of our knowledge, this is the first work to formally define and solve this problem. The task is difficult because of two properties that are common in real-world scenarios: Multi-Scale Temporal Elements (i.e., multi-granular temporal coexistence and temporal span disparity) and Asymmetric Temporal Structures (i.e., heterogeneous temporal structures and temporal structural incompleteness).

We introduce complete, high-quality TKGA-Wild benchmarks and propose HyDRA, a new paradigm based on multi-scale hypergraph retrieval-augmented generation that systematically addresses the unique challenges of TKGA-Wild. HyDRA captures complex structural dependencies, models multi-granular temporal features, mitigates temporal disparities, and introduces a scale-weave synergy mechanism to coordinate information across different temporal scales.

🔥 Key Features

Feature	Icon	Description
Multi-Granularity Temporal Encoding	🔄	Captures temporal information at different scales (year, month, day)
Scale-Adaptive Entity Projection	📐	Adaptive entity projection across different graph scales and dimensions
Multi-Scale Hypergraph Retrieval	🔍	Efficient neural retrieval for hypergraph-based search
Scale-Weave Synergy	🔗	Coordinates information across different temporal scales
State-of-the-Art Performance	📈	Consistently outperforms 28 competitive baselines, with up to 43.3% improvement in Hits@1

🏗️ Architecture

HyDRA adopts a multi-scale hypergraph retrieval-augmented generation paradigm, comprising several key stages:

Stage 1: Encoding and Integration 🔄

Stage 2: Scale-Adaptive Entity Projection 📐

Stage 3: Multi-Scale Hypergraph Retrieval 🔍

Stage 4: Multi-Scale Fusion 🔗

📖 For detailed architecture descriptions and theoretical foundations, refer to the accompanying paper.

⚙️ Installation

📋 Prerequisites

First, install dependencies:

pip install -r requirements.txt

📦 Main Dependencies

Package	Version	Purpose
🐍 Python	>= 3.7	Core language (tested on 3.8.10)
🔥 PyTorch	>= 1.10.0	Deep learning framework
🔍 Faiss	>= 1.7.0	Efficient similarity search (CPU/GPU)
📊 NumPy	>= 1.21.0	Numerical computing
🐼 Pandas	>= 1.3.0	Data manipulation
⏳ Tqdm	>= 4.62.0	Progress bars
🌐 NetworkX	>= 2.6.0	Graph analysis

💡 Note: For GPU-accelerated FAISS, use faiss-gpu instead of faiss-cpu.

📦 Datasets

For our newly proposed TKGA-Wild scenario, we introduce two novel benchmark datasets: BETA and WildBETA.

Dataset	Description	Fact Size
BETA	Benchmark dataset for TKGA-Wild	362K+
WildBETA	Extended benchmark dataset for TKGA-Wild	563K+

🔗 Download Links

🔐 Baidu Netdisk: Extraction Code: pnax | Password: tkgawild

Dataset Format:

Taking the icews_wiki dataset as an example, the folder data/icews_wiki/ should contain:

ent_ids_1: Entity IDs in source KG
ent_ids_2: Entity IDs in target KG
triples_1: Relation triples encoded by IDs in source KG
triples_2: Relation triples encoded by IDs in target KG
rel_ids_1: Relation IDs in the source KG
rel_ids_2: Relation IDs in the target KG
time_id: Time IDs in the source KG and the target KG
ref_ent_ids: All aligned entity pairs, one tab-separated pair per line (e_s \t e_t)
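
As a minimal sketch of how the files listed above might be read, the helper below loads one dataset folder into plain Python structures. It assumes every file is tab-separated UTF-8 text with one record per line, that ID files hold an integer ID followed by a name, and that triple files hold integer IDs only. The README does not spell out the column layout, so treat these as assumptions. The names `read_tsv` and `load_dataset` are illustrative and are not part of the HyDRA codebase.

```python
from pathlib import Path
from typing import Dict, List


def read_tsv(path: Path) -> List[List[str]]:
    """Read a tab-separated file into a list of rows, skipping blank lines."""
    rows = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                rows.append(line.split("\t"))
    return rows


def load_dataset(data_dir: str) -> Dict[str, object]:
    """Load the dataset files from one folder (hypothetical helper).

    Assumption: ID files contain "<id><TAB><name>" per line, and triple
    files contain only integer IDs. Adjust if the real layout differs.
    """
    root = Path(data_dir)
    data: Dict[str, object] = {}
    # ID-to-name mappings for entities, relations, and time stamps.
    for name in ("ent_ids_1", "ent_ids_2", "rel_ids_1", "rel_ids_2", "time_id"):
        data[name] = {int(row[0]): row[1] for row in read_tsv(root / name)}
    # Triples are kept as tuples of integers, whatever their arity.
    for name in ("triples_1", "triples_2"):
        data[name] = [tuple(int(x) for x in row) for row in read_tsv(root / name)]
    # Aligned pairs: (source entity ID, target entity ID).
    data["ref_ent_ids"] = [
        (int(row[0]), int(row[1])) for row in read_tsv(root / "ref_ent_ids")
    ]
    return data
```

Calling `load_dataset("data/icews_wiki")` would return a dictionary keyed by file name. Keeping triples as variable-length tuples avoids hard-coding how many time columns each line carries.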

Note: The representative datasets used in experiments are derived from Dual-AMN, JAPE, GCN-Align, BETA, DAEA, AGROLD, DOREMUS and related works.

🚀 Quick Start

Step 1: Clone the Repository 📥

git clone https://github.com/eduzrh/HyDRA.git

cd HyDRA

Step 2: Prepare Datasets 📦

Download and extract datasets to ./data/

Step 3: Run the Main Experiment ▶️

python HyDRA_main.py --data_dir data/WildBETA

Step 4: View Results 📊

Metric	Description
Hits@1	Proportion of correct alignments ranked first
Hits@10	Proportion in top-10 candidates
MRR	Mean Reciprocal Rank

📖 Usage

Basic Usage

Run complete pipeline:

python HyDRA_main.py --data_dir data/WildBETA

Advanced Options

Configure training parameters:

python HyDRA_main.py --data_dir data/WildBETA \
    --cuda 0 \
    --epochs 1500 \
    --max_iterations 5 \
    --min_kg1_entities 100

Parameter Descriptions:

Parameter	Type	Default	Description
`--data_dir`	str	Required	Path to dataset directory
`--cuda`	int	0	CUDA device ID for training
`--epochs`	int	500	Number of training epochs for encoding stage
`--max_iterations`	int	3	Maximum pipeline iterations
`--min_kg1_entities`	int	50	Minimum entities threshold for stopping
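
The table above maps naturally onto an `argparse` definition. The sketch below shows how such a parser could look; it is an assumption, not a copy of the parser in HyDRA_main.py. `build_parser` is a hypothetical name, and only the flag names, types, and defaults come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="HyDRA pipeline options (sketch)")
    parser.add_argument("--data_dir", type=str, required=True,
                        help="Path to dataset directory")
    parser.add_argument("--cuda", type=int, default=0,
                        help="CUDA device ID for training")
    parser.add_argument("--epochs", type=int, default=500,
                        help="Number of training epochs for encoding stage")
    parser.add_argument("--max_iterations", type=int, default=3,
                        help="Maximum pipeline iterations")
    parser.add_argument("--min_kg1_entities", type=int, default=50,
                        help="Minimum entities threshold for stopping")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the example runs without a shell.
    args = build_parser().parse_args(
        ["--data_dir", "data/WildBETA", "--epochs", "1500"]
    )
    print(args.data_dir, args.epochs, args.max_iterations)
```

Flags that are omitted fall back to the defaults in the table, so only `--data_dir` has to be supplied.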

Multi-Granularity Time Modeling

HyDRA supports multi-granularity temporal modeling (year and month levels) to handle Multi-Granular Temporal Coexistence. This feature can be enabled through the encoding stage configuration.
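
To make the idea of coexisting granularities concrete, the sketch below derives year-level and month-level keys from timestamps of mixed precision and assigns each level its own ID space. It is not HyDRA's encoder. It assumes timestamps are strings of the form `YYYY`, `YYYY-MM`, or `YYYY-MM-DD` with positive years, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def split_granularities(timestamp: str) -> Tuple[str, Optional[str]]:
    """Return (year_key, month_key) for 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'.

    month_key is None when the timestamp only carries year precision.
    """
    parts = timestamp.strip().split("-")
    year = parts[0]
    month = f"{year}-{parts[1]}" if len(parts) >= 2 else None
    return year, month


def build_time_ids(
    timestamps: Iterable[str],
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Assign separate integer IDs to year-level and month-level keys."""
    year_ids: Dict[str, int] = {}
    month_ids: Dict[str, int] = {}
    for ts in timestamps:
        year, month = split_granularities(ts)
        # setdefault keeps the first ID assigned to a key.
        year_ids.setdefault(year, len(year_ids))
        if month is not None:
            month_ids.setdefault(month, len(month_ids))
    return year_ids, month_ids


if __name__ == "__main__":
    years, months = build_time_ids(["2014-03-15", "2014", "2014-03", "2015-01-02"])
    print(years)
    print(months)
```

A fact stamped only with a year still receives a year-level ID, while finer-grained facts contribute to both levels, so coarse and fine timestamps can be compared at the scale they share.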

🔬 Reproducibility

We are committed to ensuring full reproducibility of our results. The following resources are provided:

📋 Experimental Configuration

Hyperparameters: All hyperparameter settings are documented in the code and can be configured via command-line arguments
Random Seeds: Seed configurations are embedded in the training scripts for reproducibility
Environment: Tested on Python 3.8.10 with dependencies as specified in requirements.txt
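
For readers who want to pin seeds in their own scripts, a common pattern looks like the sketch below. It is a generic helper, not the seeding code embedded in the HyDRA training scripts. The name `set_seed` and the default value 42 are assumptions, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def set_seed(seed: int = 42) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored when CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


if __name__ == "__main__":
    set_seed(123)
    first = random.random()
    set_seed(123)
    print(first == random.random())
```

Seeding alone does not guarantee bit-identical results on GPU, since some CUDA kernels are nondeterministic, so small run-to-run differences in the metrics can still occur.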

📊 Reproducing Main Results

To reproduce the main experimental results reported in the paper:

Download datasets following the format described in the Datasets section
Run the complete pipeline with default settings:

python HyDRA_main.py --data_dir data/WildBETA

Evaluate results using the output files in data/icews_wiki/message_pool/

🏗️ Code Organization

The codebase is organized into modular components for clarity:

encoding_and_integration/: Multi-granularity temporal entity encoding and integration
scale_adaptive_entity_projection/: Relation alignment and entity projection
multi_scale_hypergraph_retrieval/: Neural retrieval and hypergraph decomposition
multi_scale_fusion/: Multi-scale fusion and alignment refinement
HyDRA_main.py: Main pipeline orchestrator

📝 Documentation

Comprehensive inline code comments explaining key design decisions
Clear module structure with standardized naming conventions
This README with step-by-step usage instructions

📊 Evaluation Metrics

We employ standard knowledge graph alignment metrics for transparency and comparability:

Hits@1: Proportion of correct alignments ranked first
Hits@10: Proportion of correct alignments in top-10 candidates
MRR (Mean Reciprocal Rank): Average reciprocal rank of correct alignments
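
The three metrics can be computed from the rank of the correct target entity for each test pair. The sketch below assumes 1-based ranks and is not the repository's evaluation script; `alignment_metrics` is a hypothetical name.

```python
from typing import Dict, Sequence


def alignment_metrics(
    ranks: Sequence[int], ks: Sequence[int] = (1, 10)
) -> Dict[str, float]:
    """Compute Hits@k and MRR from 1-based ranks of the correct entity."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    n = len(ranks)
    # Hits@k: fraction of test pairs whose correct entity is ranked within k.
    metrics = {f"Hits@{k}": sum(r <= k for r in ranks) / n for k in ks}
    # MRR: mean of the reciprocal ranks.
    metrics["MRR"] = sum(1.0 / r for r in ranks) / n
    return metrics


if __name__ == "__main__":
    # Four test pairs whose correct targets were ranked 1st, 3rd, 12th and 1st.
    print(alignment_metrics([1, 3, 12, 1]))
```

For the ranks `[1, 3, 12, 1]`, Hits@1 is 2/4 = 0.5, Hits@10 is 3/4 = 0.75, and MRR is (1 + 1/3 + 1/12 + 1)/4 ≈ 0.604.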

📜 License

MIT License - Copyright notices preserved.

📬 Contact

Email: runhaozhao@nudt.edu.cn
GitHub Issues: For technical concerns, create an Issue in the GitHub repository. Labels: bug, enhancement, question.

Responses targeted within 2-3 business days.

📑 Citation

If you find this work helpful for your research or applications, we would appreciate it if you could cite the following paper:

@article{DBLP:journals/corr/abs-2507-14475,
  author       = {Runhao Zhao and
                  Weixin Zeng and
                  Wentao Zhang and
                  Xiang Zhao and
                  Jiuyang Tang and
                  Lei Chen},
  title        = {Towards Temporal Knowledge Graph Alignment in the Wild},
  journal      = {CoRR},
  volume       = {abs/2507.14475},
  year         = {2025},
  url          = {https://doi.org/10.48550/arXiv.2507.14475},
  doi          = {10.48550/ARXIV.2507.14475},
  eprinttype   = {arXiv},
  eprint       = {2507.14475}
}

🔗 References

Unsupervised Entity Alignment for Temporal Knowledge Graphs. Xiaoze Liu, Junyang Wu, Tianyi Li, Lu Chen, and Yunjun Gao. Proceedings of the ACM Web Conference (WWW), 2023.
BERT-INT: A BERT-based Interaction Model for Knowledge Graph Alignment. Xiaobin Tang, Jing Zhang, Bo Chen, Yang Yang, Hong Chen, and Cuiping Li. Journal of Artificial Intelligence Research, 2020.
Benchmarking Challenges for Temporal Knowledge Graph Alignment. Weixin Zeng, Jie Zhou, and Xiang Zhao. Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM), 2024.
Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks. Zhichun Wang, Qingsong Lv, Xiaohan Lan, and Yu Zhang. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
Boosting the Speed of Entity Alignment 10×: Dual Attention Matching Network with Normalized Hard Sample Mining. Xin Mao, Wenting Wang, Yuanbin Wu, and Man Lan. Proceedings of the Web Conference (WWW), 2021.
Wikidata: A Free Collaborative Knowledgebase. Denny Vrandecic and Markus Krötzsch. Communications of the ACM, 2014.
Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets. Xuhui Jiang, Chengjin Xu, Yinghan Shen, Yuanzhuo Wang, Fenglong Su, Zhichao Shi, Fei Sun, Zixuan Li, Jian Guo, and Huawei Shen. Proceedings of the ACM Web Conference (WWW), 2024.
Unlocking the Power of Large Language Models for Entity Alignment. Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, and Yuanzhuo Wang. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
Bootstrapping Entity Alignment with Knowledge Graph Embedding. Zequn Sun, Wei Hu, Qingheng Zhang, and Yuzhong Qu. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2018.
NetworkX: Network Analysis in Python. NetworkX Developers. GitHub Repository.
Faiss: A Library for Efficient Similarity Search and Clustering of Dense Vectors. Facebook Research. GitHub Repository.
DAEA: Enhancing Entity Alignment in Real-World Knowledge Graphs Through Multi-Source Domain Adaptation. Linyan Yang, Shiqiao Zhou, Jingwei Cheng, Fu Zhang, Jizheng Wan, Shuo Wang, and Mark Lee. COLING, 2025.
TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs. Julia Gastinger, Shenyang Huang, Mikhail Galkin, Erfan Loghmani, Ali Parviz, Farimah Poursafaei, Jacob Danovitch, Emanuele Rossi, Ioannis Koutis, Heiner Stuckenschmidt, Reihaneh Rabbany, and Guillaume Rabusseau. NeurIPS Track on Datasets and Benchmarks, 2024.

🙏 Acknowledgement

The following open source projects were partially referenced in this work. We sincerely appreciate their contributions:

Dual-AMN, JAPE, GCN-Align, Simple-HHEA, BETA, Dual-Match, Faiss, NetworkX, AdaCoAgentEA, DAEA, AGROLD, DOREMUS

This repository corresponds to the paper Towards Temporal Knowledge Graph Alignment in the Wild (under review at IEEE TPAMI), and is an extension of our previous work BETA.
