Code for the paper "Temporal Causal-based Simulation for Realistic Time-series Generation", Gkorgkolis et al., 2025.
Problem: Existing works on generating time-series data and their corresponding causal graphs often assume overly simplistic or closed-world simulation settings. They also evaluate generated datasets using unoptimized or single-metric approaches (e.g., MMD), which can be highly misleading and fail to reflect true data quality.
Contributions:
- Demonstrate that relying on unoptimized metrics for data quality assessment leads to unreliable conclusions (see Figure 1 of our paper).
- Introduce a modular, model-agnostic pipeline for simulating realistic time-series data along with their time-lagged causal graphs.
- Propose a Min-max AutoML scheme that selects the best simulation configuration using optimized classifier two-sample tests (C2STs), by minimizing over configurations $c \in C$ and maximizing over discriminators $d \in D$ (illustrated in the main figure).
- Show that our method achieves comparable or superior generation across a diverse set of real, semi-synthetic, and synthetic time-series datasets.
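The selection step of the Min-max scheme can be sketched in a few lines. This is a minimal illustration, not the repository's implementation. It assumes that each configuration has already been scored by each discriminator, and that the score is a C2ST AUC where values closer to 0.5 mean the simulated data is harder to tell apart from the real data. The function name `minmax_select`, the configuration and discriminator names, and all numeric values are hypothetical.

```python
def minmax_select(auc):
    """Pick the configuration whose strongest discriminator is weakest.

    `auc` maps a configuration name to a dict that maps a discriminator
    name to the C2ST AUC that discriminator obtained (assumed layout).
    """
    # Inner max: for each configuration, keep the AUC of the discriminator
    # that separates real from simulated data best.
    worst_case = {c: max(scores.values()) for c, scores in auc.items()}
    # Outer min: choose the configuration with the lowest worst-case AUC.
    best = min(worst_case, key=worst_case.get)
    return best, worst_case[best]


# Hypothetical AUC values, for illustration only (not results from the paper).
auc = {
    "config_a": {"svm": 0.71, "lstm": 0.93},
    "config_b": {"svm": 0.62, "lstm": 0.68},
    "config_c": {"svm": 0.55, "lstm": 0.90},
}

best_config, best_auc = minmax_select(auc)
print(best_config, best_auc)
```

Note that `config_c` has the lowest score under a single discriminator (`svm`), yet it is not selected because another discriminator separates its output easily. This is the reason for maximizing over discriminators before minimizing over configurations.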
Create a virtual conda environment using
conda env create -f environment.yaml
conda activate TCS
Alternatively, you can install the dependencies from the requirements.txt file, either into your base environment or into an existing conda environment, using
pip install -r requirements.txt
Notebooks for reproducible experiments and demo scripts (running_examples.ipynb) are available in the code/notebooks/ folder. Experimental results are available in code/data/results/.
We provide various .ipynb notebooks not only for reproducing the experimental results of the paper but also for getting started with our codebase. Specifically:
- exp_0_increasing_density.ipynb contains experiments on the impact of the simulation's sparsity penalty on synthetic data as the number of edges increases.
- exp_1_dense_output.ipynb contains an experiment on using a dense graph from the 1st phase of TCS as input to the TCS algorithm. It also contains our experimental results on using the ground truth graph (oracle graph) with the sparsity penalty (see Figures 3a and 3b of our paper).
- exp_2_oracle_graph.ipynb illustrates the behavior of TCS given the oracle graph as the 1st phase's output.
- exp_3_vs_baselines.ipynb contains baseline comparisons between TCS, CausalTime, and non-causal simulators (CPAR, TVAE) (Table 3 of our paper).
- exp_4_cd_efficacy.ipynb corresponds to our CD Efficacy experiments (Table 2 of our paper).
- running_examples.ipynb presents two running examples of the TCS codebase: (i) a single TCS simulation configured with the PCMCI causal discovery algorithm, the ADDSTCN (TCDF) predictor, and spline noise estimators, and (ii) an optimized TCS simulation using our proposed Min-max selection scheme.
├── code
│   ├── CausalTime
│   │   ├── dataloader.py
│   │   ├── demo.py
│   │   ├── generate.py
│   │   ├── __init__.py
│   │   ├── models.py
│   │   ├── test.py
│   │   ├── tools.py
│   │   ├── train.py
│   │   ├── utilities.py
│   │   └── visualization.py
│   ├── cd_methods
│   │   ├── CausalPretraining
│   │   │   ├── helpers
│   │   │   │   ├── __init__.py
│   │   │   │   └── tools.py
│   │   │   ├── __init__.py
│   │   │   └── model
│   │   │       ├── conv.py
│   │   │       ├── gru.py
│   │   │       ├── informer.py
│   │   │       ├── __init__.py
│   │   │       ├── mlp.py
│   │   │       └── model_wrapper.py
│   │   ├── DynoTears
│   │   │   ├── causalnex
│   │   │   │   ├── __init__.py
│   │   │   │   ├── README.md
│   │   │   │   └── structure
│   │   │   │       ├── categorical_variable_mapper.py
│   │   │   │       ├── dynotears.py
│   │   │   │       ├── __init__.py
│   │   │   │       ├── notears.py
│   │   │   │       ├── structure_model.py
│   │   │   │       └── transformers.py
│   │   │   ├── __init__.py
│   │   │   └── utils.py
│   │   └── __init__.py
│   ├── notebooks
│   │   ├── exp_0_increasing_density.ipynb
│   │   ├── exp_1_dense_output.ipynb
│   │   ├── exp_2_oracle_graph.ipynb
│   │   ├── exp_3_vs_baselines.ipynb
│   │   ├── exp_4_cd_efficacy.ipynb
│   │   └── running_examples.ipynb
│   ├── PretrainedForecasters
│   │   ├── __init__.py
│   │   └── TimesFMForecaster.py
│   ├── RealNVP
│   │   ├── __init__.py
│   │   ├── RealNVP.py
│   │   └── RealNVP_pytorch.py
│   ├── simulation
│   │   ├── delong.py
│   │   ├── detection_lstm.py
│   │   ├── __init__.py
│   │   ├── simulation_configs.py
│   │   ├── simulation_extra.py
│   │   ├── simulation_metrics.py
│   │   ├── simulation_tools.py
│   │   └── simulation_utils.py
│   ├── TCDF
│   │   ├── depthwise.py
│   │   ├── forecaster.py
│   │   ├── __init__.py
│   │   ├── model.py
│   │   └── TCDF.py
│   ├── tempogen
│   │   ├── functional_utils.py
│   │   ├── __init__.py
│   │   ├── temporal_causal_structure.py
│   │   ├── temporal_node.py
│   │   ├── temporal_random_generation.py
│   │   └── temporal_scm.py
│   └── utils.py
├── data
│   ├── cp_style
│   │   └── increasing_edges_cp_1
│   │       ├── data
│   │       │   ├── (000)_cp_v10_l1_p95_ts.csv
│   │       │   ├── (001)_cp_v10_l1_p92_ts.csv
│   │       │   ├── (002)_cp_v10_l1_p89_ts.csv
│   │       │   ├── (003)_cp_v10_l1_p86_ts.csv
│   │       │   ├── (004)_cp_v10_l1_p83_ts.csv
│   │       │   ├── (005)_cp_v10_l1_p80_ts.csv
│   │       │   ├── (006)_cp_v10_l1_p77_ts.csv
│   │       │   ├── (007)_cp_v10_l1_p74_ts.csv
│   │       │   ├── (008)_cp_v10_l1_p71_ts.csv
│   │       │   ├── (009)_cp_v10_l1_p68_ts.csv
│   │       │   ├── (010)_cp_v10_l1_p65_ts.csv
│   │       │   ├── (011)_cp_v10_l2_p98_ts.csv
│   │       │   ├── (012)_cp_v10_l2_p95_ts.csv
│   │       │   ├── (013)_cp_v10_l2_p92_ts.csv
│   │       │   ├── (014)_cp_v10_l2_p89_ts.csv
│   │       │   ├── (015)_cp_v10_l2_p86_ts.csv
│   │       │   ├── (016)_cp_v10_l2_p83_ts.csv
│   │       │   ├── (017)_cp_v10_l2_p80_ts.csv
│   │       │   └── (018)_cp_v10_l2_p77_ts.csv
│   │       └── structure
│   │           ├── (000)_cp_v10_l1_p95_struct.pt
│   │           ├── (001)_cp_v10_l1_p92_struct.pt
│   │           ├── (002)_cp_v10_l1_p89_struct.pt
│   │           ├── (003)_cp_v10_l1_p86_struct.pt
│   │           ├── (004)_cp_v10_l1_p83_struct.pt
│   │           ├── (005)_cp_v10_l1_p80_struct.pt
│   │           ├── (006)_cp_v10_l1_p77_struct.pt
│   │           ├── (007)_cp_v10_l1_p74_struct.pt
│   │           ├── (008)_cp_v10_l1_p71_struct.pt
│   │           ├── (009)_cp_v10_l1_p68_struct.pt
│   │           ├── (010)_cp_v10_l1_p65_struct.pt
│   │           ├── (011)_cp_v10_l2_p98_struct.pt
│   │           ├── (012)_cp_v10_l2_p95_struct.pt
│   │           ├── (013)_cp_v10_l2_p92_struct.pt
│   │           ├── (014)_cp_v10_l2_p89_struct.pt
│   │           ├── (015)_cp_v10_l2_p86_struct.pt
│   │           ├── (016)_cp_v10_l2_p83_struct.pt
│   │           ├── (017)_cp_v10_l2_p80_struct.pt
│   │           └── (018)_cp_v10_l2_p77_struct.pt
│   ├── finance
│   │   ├── random-rels_20_1_3_returns30007000_header.csv
│   │   ├── random-rels_20_1A_returns30007000_header.csv
│   │   ├── random-rels_20_1B_returns30007000_header.csv
│   │   ├── random-rels_20_1C_returns30007000_header.csv
│   │   ├── random-rels_20_1D_returns30007000_header.csv
│   │   └── random-rels_20_1E_returns30007000_header.csv
│   ├── fMRI
│   │   ├── graphs
│   │   │   ├── sim19_gt_processed.csv
│   │   │   ├── sim20_gt_processed.csv
│   │   │   ├── sim5_gt_processed.csv
│   │   │   ├── sim6_gt_processed.csv
│   │   │   ├── sim7_gt_processed.csv
│   │   │   └── sim9_gt_processed.csv
│   │   └── timeseries
│   │       ├── timeseries19.csv
│   │       ├── timeseries20.csv
│   │       ├── timeseries5.csv
│   │       ├── timeseries6.csv
│   │       ├── timeseries7.csv
│   │       └── timeseries9.csv
│   ├── MvTS
│   │   ├── air_quality_mini
│   │   │   ├── air_quality_mini_boot_0.csv
│   │   │   ├── air_quality_mini_boot_1.csv
│   │   │   ├── air_quality_mini_boot_2.csv
│   │   │   ├── air_quality_mini_boot_3.csv
│   │   │   └── air_quality_mini_boot_4.csv
│   │   ├── AirQualityUCI
│   │   │   ├── AirQualityUCI_boot_0.csv
│   │   │   ├── AirQualityUCI_boot_1.csv
│   │   │   ├── AirQualityUCI_boot_2.csv
│   │   │   ├── AirQualityUCI_boot_3.csv
│   │   │   └── AirQualityUCI_boot_4.csv
│   │   ├── bike-usage
│   │   │   ├── bike-usage_boot_0.csv
│   │   │   ├── bike-usage_boot_1.csv
│   │   │   ├── bike-usage_boot_2.csv
│   │   │   ├── bike-usage_boot_3.csv
│   │   │   └── bike-usage_boot_5.csv
│   │   ├── ETTh1
│   │   │   ├── ETTh1_boot_0.csv
│   │   │   ├── ETTh1_boot_1.csv
│   │   │   ├── ETTh1_boot_2.csv
│   │   │   ├── ETTh1_boot_3.csv
│   │   │   └── ETTh1_boot_4.csv
│   │   ├── ETTm1
│   │   │   ├── ETTm1_boot_0.csv
│   │   │   ├── ETTm1_boot_1.csv
│   │   │   ├── ETTm1_boot_2.csv
│   │   │   ├── ETTm1_boot_3.csv
│   │   │   └── ETTm1_boot_4.csv
│   │   ├── outdoor
│   │   │   └── outdoor_original.csv
│   │   └── WTH
│   │       ├── WTH_boot_0.csv
│   │       ├── WTH_boot_1.csv
│   │       ├── WTH_boot_2.csv
│   │       ├── WTH_boot_3.csv
│   │       └── WTH_boot_4.csv
│   └── results
│       ├── dense_graph
│       │   ├── res_cp_vs_1.p
│       │   └── res_cp_vs_2.p
│       ├── figures
│       │   ├── sparsity_penalty_cp1.png
│       │   └── sparsity_penalty_cp1_short.png
│       ├── oracle_graph
│       │   ├── res_cp_just_1.p
│       │   ├── res_cp_ora_1.p
│       │   └── res_cp_vs_1.p
│       ├── sparsity_penalty
│       │   └── res_cp_vs_2.p
│       └── vs
│           ├── air_quality_mini_auc.json
│           ├── air_quality_mini_mmd.json
│           ├── AirQualityUCI_auc.json
│           ├── AirQualityUCI_mmd.json
│           ├── bike-usage_auc.json
│           ├── bike-usage_mmd.json
│           ├── cp_1_auc.json
│           ├── cp_1_mmd.json
│           ├── finance_auc.json
│           ├── finance_mmd.json
│           ├── fmri_auc.json
│           ├── fmri_mmd.json
│           ├── outdoor_auc.json
│           ├── outdoor_mmd.json
│           ├── WTH_auc.json
│           └── WTH_mmd.json
├── environment.yaml
├── LICENSE
├── README.md
└── requirements.txt
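In the cp_style folder listed above, each time-series file under data/ appears to have a same-numbered counterpart under structure/, with the `_ts.csv` suffix replaced by `_struct.pt`. The following sketch maps one to the other. It infers the naming convention from the listing alone, and the helper name `structure_path_for` is hypothetical (it is not part of the codebase). Paths are given relative to the dataset folder.

```python
from pathlib import PurePosixPath


def structure_path_for(ts_path):
    """Return the assumed `*_struct.pt` path for a `*_ts.csv` path.

    Assumes the layout <dataset>/data/<name>_ts.csv and
    <dataset>/structure/<name>_struct.pt, as seen in the listing.
    """
    ts = PurePosixPath(ts_path)
    suffix = "_ts.csv"
    if not ts.name.endswith(suffix):
        raise ValueError(f"not a time-series file: {ts.name}")
    # Strip the suffix, then swap the parent folder data/ for structure/.
    stem = ts.name[: -len(suffix)]
    return ts.parent.parent / "structure" / f"{stem}_struct.pt"


ts_file = "cp_style/increasing_edges_cp_1/data/(000)_cp_v10_l1_p95_ts.csv"
struct_file = structure_path_for(ts_file)
print(struct_file)
```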
CP Weights (to be optionally included in Phase 1 of TCS; see simulation_configs.py) are hosted outside GitHub due to size constraints, at the following Google Drive links:
If the codebase has proven useful, please consider citing the following article:
@misc{gkorgkolis2025temporal,
title={Temporal Causal-based Simulation for Realistic Time-series Generation},
author={Nikolaos Gkorgkolis and Nikolaos Kougioulis and MingXue Wang and Bora Caglayan and Andrea Tonon and Dario Simionato and Ioannis Tsamardinos},
year={2025},
eprint={2506.02084},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2506.02084},
}
Contributions are welcome! Feel free to:
- Open issues for bugs, questions, or feature requests
- Submit pull requests for improvements or new functionality
We follow standard GitHub practices for contributions; see our CONTRIBUTING file.
