Trajectory–User Linking via Heterogeneous Preference Graph and Dual-Encoder Mutual Distillation (HPG-DEMD)
If you find our work useful, please cite:
@inproceedings{chen2022MainTUL,
title={Trajectory–User Linking via Heterogeneous Preference Graph and Dual-Encoder Mutual Distillation},
author={},
booktitle={ICDE},
year={2026}
}

Preprocessed datasets are available in data/. See our paper for detailed descriptions.
| Dataset | Foursquare_TKY_800 | Foursquare_TKY_400 | Foursquare_NYC_800 | Foursquare_NYC_400 | Weeplaces_800 | Weeplaces_400 |
|---|---|---|---|---|---|---|
| Duration (Days) | 320 | 320 | 317 | 317 | 2,761 | 2,643 |
| #Categories | 239 | 231 | 314 | 304 | 1,373 | 1,171 |
| #POIs | 39,698 | 24,526 | 4,929 | 4,290 | 24,649 | 18,482 |
| #Trajectories | 104,413 | 51,969 | 33,971 | 25,419 | 152,583 | 75,873 |
| Avg. Length | 3.08 | 3.15 | 2.92 | 3.17 | 2.58 | 2.62 |
| Density | 2.44 | 1.95 | 4.28 | 3.15 | 6.49 | 4.37 |
(1) Duration (Days): The total time span of the dataset, calculated as the number of days between the first and the last check-in recorded in the entire dataset. For instance, the Foursquare_NYC datasets span 317 days.
(2) #Categories: The total number of unique POI categories.
(3) #POIs: The total number of unique Points of Interest (POIs).
(4) #Trajectories: The total number of user check-in sequences.
(5) Avg. Length: The total number of check-ins divided by the total number of trajectories.
(6) Density: Formally, let $P_v$ be the set of POIs that have been visited by at least one user, and let $U_p$ be the set of distinct users who have visited a specific POI $p$. The density is then defined as:

$$ \text{Density} = \frac{1}{|P_v|} \sum_{p \in P_v} |U_p| $$

where $|P_v|$ is the total number of unique visited POIs, and $|U_p|$ is the number of unique users for POI $p$.
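
The Avg. Length and Density definitions above can be illustrated with a minimal sketch. It assumes trajectories are held in memory as `(user_id, [poi_id, ...])` pairs; that layout, the function name `dataset_stats`, and the toy data are hypothetical and are not taken from the files in data/ or from the project code.

```python
from collections import defaultdict


def dataset_stats(trajectories):
    """Compute (avg_length, density) for a list of (user_id, [poi_id, ...]) pairs.

    Hypothetical helper: the input layout is an assumption made for illustration.
    """
    if not trajectories:
        raise ValueError("at least one trajectory is required")

    users_per_poi = defaultdict(set)  # U_p: distinct users seen at POI p
    total_checkins = 0
    for user_id, pois in trajectories:
        total_checkins += len(pois)
        for poi in pois:
            users_per_poi[poi].add(user_id)

    if not users_per_poi:
        raise ValueError("trajectories contain no check-ins")

    # Avg. Length: total check-ins divided by the number of trajectories.
    avg_length = total_checkins / len(trajectories)
    # Density: mean of |U_p| over the visited POIs P_v.
    density = sum(len(users) for users in users_per_poi.values()) / len(users_per_poi)
    return avg_length, density


# Toy data: three trajectories from two users over three POIs.
toy = [("u1", ["a", "b", "c"]), ("u2", ["a", "b"]), ("u1", ["a"])]
avg_length, density = dataset_stats(toy)
print(avg_length, density)
```

In the toy data, POIs `a` and `b` are each visited by two distinct users and `c` by one, so the density is 5/3; repeated visits by the same user to the same POI do not increase it.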
pip install -r requirements.txt

Requirements (requirements.txt):
numpy==2.3.3
pandas==2.3.2
scikit_learn==1.7.2
torch==2.5.1+cu124 # Adjust based on your CUDA version
torch_geometric==2.6.1
tqdm==4.67.1

cd project
python main.py --dataset foursquare_tky_400

Supported datasets:
foursquare_tky_400, foursquare_tky_800, foursquare_nyc_400, foursquare_nyc_800, weeplaces_400, weeplaces_800
Additional parameters can be configured; see main.py for the available options.
Note: Adjust the torch version in requirements.txt according to your CUDA setup. Our experiments were conducted on RTX 3090 GPUs.
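
To see which CUDA build the installed torch wheel targets before editing requirements.txt, a small check along these lines may help. It is a sketch and not part of the project code; the helper name `describe_torch_install` is hypothetical, and it relies only on the standard `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()` attributes.

```python
import importlib.util


def describe_torch_install():
    """Return basic facts about the local PyTorch install, if there is one.

    Hypothetical helper for illustration; not part of this repository.
    """
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,           # e.g. a "+cuXXX" suffix on CUDA wheels
        "cuda_build": torch.version.cuda,             # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),  # True only if a usable GPU is visible
    }


print(describe_torch_install())
```

If `cuda_build` is `None` or `cuda_available` is `False` on a GPU machine, the installed wheel likely does not match the local CUDA setup, and the torch line in requirements.txt is the one to adjust.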
Additional resources are available in appendix.pdf, including:
- Computational cost analysis
- Extended experiments
For code details, please refer to the code comments.