gkorgkolis/TCS

Temporal Causal-based Simulation (TCS)

Python 3.10 PyTorch Scikit-learn License arXiv CodeFactor

Code for the paper "Temporal Causal-based Simulation for Realistic Time-series Generation", Gkorgkolis et al., 2025.

📌 Overview

  • Problem: Existing work on generating time-series data together with the corresponding causal graphs often assumes overly simplistic or closed-world simulation settings, and evaluates the generated datasets with unoptimized or single-metric approaches (e.g., MMD), which can be highly misleading and fail to reflect true data quality.

  • Contributions:

    • Demonstrate that relying on unoptimized metrics for data quality assessment leads to unreliable conclusions (see Figure 1 of our paper).
    • Introduce a modular, model-agnostic pipeline for simulating realistic time-series data along with their time-lagged causal graphs.
    • Propose a Min-max AutoML scheme that selects the best simulation configuration using optimized classifier two-sample tests (C2STs), by minimizing over configurations $c \in C$ and maximizing over discriminators $d \in D$ (illustrated in the main figure).
    • Show that our method achieves comparable or superior generation across a diverse set of real, semi-synthetic, and synthetic time-series datasets.
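
The Min-max selection described in the third contribution can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's implementation: the configuration names, discriminator names, and AUC values are made up, and each C2ST is assumed to be summarized by an AUC, where a value near 0.5 means the discriminator cannot tell real from simulated data.

```python
from typing import Dict, Tuple

# Hypothetical C2ST scores: auc[config][discriminator].
# An AUC near 0.5 means the discriminator cannot separate real from simulated data.
AucTable = Dict[str, Dict[str, float]]


def minmax_select(auc: AucTable) -> Tuple[str, str, float]:
    """Return (best config, its strongest discriminator, that discriminator's AUC)."""
    best = None
    for config, scores in auc.items():
        # Inner max: the discriminator that detects this configuration best.
        disc, worst_case = max(scores.items(), key=lambda kv: kv[1])
        # Outer min: keep the configuration whose worst case is least detectable.
        if best is None or worst_case < best[2]:
            best = (config, disc, worst_case)
    if best is None:
        raise ValueError("empty AUC table")
    return best


# Made-up numbers for illustration only.
example = {
    "config_a": {"svm": 0.62, "lstm": 0.91},
    "config_b": {"svm": 0.74, "lstm": 0.70},
    "config_c": {"svm": 0.55, "lstm": 0.83},
}
print(minmax_select(example))  # ('config_b', 'svm', 0.74)
```

In this toy table, judging by the svm discriminator alone would favor config_c (AUC 0.55), even though the lstm discriminator detects it easily (AUC 0.83). Taking the maximum over discriminators before minimizing over configurations guards against that kind of misleading single-metric conclusion.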

Installation

🐍 Using Conda

Create and activate a conda environment from environment.yaml:

  • conda env create -f environment.yaml
  • conda activate TCS

Install requirements directly

Alternatively, install the dependencies listed in requirements.txt, either into your base environment or into an existing conda environment:

pip install -r requirements.txt
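
After either installation route, a quick sanity check can confirm that the interpreter and core packages are visible. The sketch below is a hypothetical helper, not part of the repository; the module names `torch` and `sklearn` and the minimum Python version are assumptions taken from the badges at the top of this README, so adjust them to match requirements.txt.

```python
import importlib.util
import sys

# Import names assumed from the README badges (PyTorch, scikit-learn);
# extend the tuple with whatever requirements.txt actually lists.
REQUIRED_MODULES = ("torch", "sklearn")


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 10)):
    """Report whether the interpreter version and each module are available."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```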

🧪 Quick Start

Notebooks for reproducing the experiments, along with a demo notebook (running_examples.ipynb), are available in the code/notebooks/ folder. Experimental results are available in data/results/.
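
To browse the stored results programmatically, a small loader along these lines may help. It is a sketch under assumptions: the function name is hypothetical, it relies only on the `<dataset>_<metric>.json` naming visible in the results folder (for example air_quality_mini_auc.json), and it makes no claim about the structure inside each JSON file, which this README does not document.

```python
import json
from pathlib import Path


def index_results(results_dir):
    """Map (dataset, metric) -> parsed JSON for files named <dataset>_<metric>.json."""
    index = {}
    for path in sorted(Path(results_dir).glob("*.json")):
        # Split on the last underscore so dataset names such as
        # "air_quality_mini" keep their own underscores.
        dataset, sep, metric = path.stem.rpartition("_")
        if not sep:
            continue  # not in <dataset>_<metric> form; skip it
        with path.open() as fh:
            index[(dataset, metric)] = json.load(fh)
    return index


# Example call (the directory is passed in, nothing is hard-coded):
# index = index_results("path/to/results/vs")
# print(sorted(index))
```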

📔 Available Notebooks

We provide several .ipynb notebooks, both for reproducing the experimental results of the paper and for getting started with our codebase. Specifically:

  • exp_0_increasing_density.ipynb contains experiments on synthetic data that measure the impact of the sparsity penalty in the simulation as the number of edges increases
  • exp_1_dense_output.ipynb contains an experiment in which the 1st phase of TCS outputs a dense graph that is then passed as input to the rest of the algorithm. It also contains our experimental results on using the ground-truth graph (oracle graph) with the sparsity penalty (see Figures 3a and 3b of our paper)
  • exp_2_oracle_graph.ipynb illustrates the behavior of TCS when the oracle graph is given as the 1st phase's output
  • exp_3_vs_baselines.ipynb contains baseline comparisons between TCS, CausalTime, and non-causal simulators (CPAR, TVAE) (Table 3 of our paper)
  • exp_4_cd_efficacy.ipynb corresponds to our CD Efficacy experiments (Table 2 of our paper)
  • running_examples.ipynb contains two running examples of the TCS codebase: (i) a single TCS simulation configured with the PCMCI causal discovery algorithm, the ADDSTCN (TCDF) predictor, and spline noise estimators, and (ii) an optimized TCS simulation using our proposed Min-max selection scheme.
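
The two running examples differ in how the simulation configuration is chosen: one fixed configuration versus a search over many. The sketch below illustrates that idea only. `SimulationConfig`, its field names, and the option lists are hypothetical and are not taken from the TCS code; PCMCI, ADDSTCN, and spline come from the running example above, while the other options are guessed from folder names in the repository.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SimulationConfig:
    """Hypothetical container; field names are not taken from the TCS code."""
    cd_method: str        # causal discovery algorithm (1st phase)
    predictor: str        # forecasting model, the "predictor"
    noise_estimator: str  # model of the noise distribution


# Illustrative option lists (see the assumptions in the lead-in).
CD_METHODS = ("PCMCI", "DynoTears", "CP")
PREDICTORS = ("ADDSTCN", "TimesFM")
NOISE_ESTIMATORS = ("spline", "RealNVP")


def config_space():
    """Enumerate the Cartesian product of component choices."""
    return [SimulationConfig(*combo)
            for combo in product(CD_METHODS, PREDICTORS, NOISE_ESTIMATORS)]


single = SimulationConfig("PCMCI", "ADDSTCN", "spline")  # running example (i)
space = config_space()                                   # candidates for (ii)
print(len(space), single in space)  # 12 True
```

A frozen dataclass is used so that configurations are hashable and comparable, which makes them convenient as dictionary keys when recording one score per configuration.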

📁 Structure

├── code
│   ├── CausalTime
│   │   ├── dataloader.py
│   │   ├── demo.py
│   │   ├── generate.py
│   │   ├── __init__.py
│   │   ├── models.py
│   │   ├── test.py
│   │   ├── tools.py
│   │   ├── train.py
│   │   ├── utilities.py
│   │   └── visualization.py
│   ├── cd_methods
│   │   ├── CausalPretraining
│   │   │   ├── helpers
│   │   │   │   ├── __init__.py
│   │   │   │   └── tools.py
│   │   │   ├── __init__.py
│   │   │   └── model
│   │   │       ├── conv.py
│   │   │       ├── gru.py
│   │   │       ├── informer.py
│   │   │       ├── __init__.py
│   │   │       ├── mlp.py
│   │   │       └── model_wrapper.py
│   │   ├── DynoTears
│   │   │   ├── causalnex
│   │   │   │   ├── __init__.py
│   │   │   │   ├── README.md
│   │   │   │   └── structure
│   │   │   │       ├── categorical_variable_mapper.py
│   │   │   │       ├── dynotears.py
│   │   │   │       ├── __init__.py
│   │   │   │       ├── notears.py
│   │   │   │       ├── structure_model.py
│   │   │   │       └── transformers.py
│   │   │   ├── __init__.py
│   │   │   └── utils.py
│   │   └── __init__.py
│   ├── notebooks
│   │   ├── exp_0_increasing_density.ipynb
│   │   ├── exp_1_dense_output.ipynb
│   │   ├── exp_2_oracle_graph.ipynb
│   │   ├── exp_3_vs_baselines.ipynb
│   │   ├── exp_4_cd_efficacy.ipynb
│   │   └── running_examples.ipynb
│   ├── PretrainedForecasters
│   │   ├── __init__.py
│   │   └── TimesFMForecaster.py
│   ├── RealNVP
│   │   ├── __init__.py
│   │   ├── RealNVP.py
│   │   └── RealNVP_pytorch.py
│   ├── simulation
│   │   ├── delong.py
│   │   ├── detection_lstm.py
│   │   ├── __init__.py
│   │   ├── simulation_configs.py
│   │   ├── simulation_extra.py
│   │   ├── simulation_metrics.py
│   │   ├── simulation_tools.py
│   │   └── simulation_utils.py
│   ├── TCDF
│   │   ├── depthwise.py
│   │   ├── forecaster.py
│   │   ├── __init__.py
│   │   ├── model.py
│   │   └── TCDF.py
│   ├── tempogen
│   │   ├── functional_utils.py
│   │   ├── __init__.py
│   │   ├── temporal_causal_structure.py
│   │   ├── temporal_node.py
│   │   ├── temporal_random_generation.py
│   │   └── temporal_scm.py
│   └── utils.py
├── data
│   ├── cp_style
│   │   └── increasing_edges_cp_1
│   │       ├── data
│   │       │   ├── (000)_cp_v10_l1_p95_ts.csv
│   │       │   ├── (001)_cp_v10_l1_p92_ts.csv
│   │       │   ├── (002)_cp_v10_l1_p89_ts.csv
│   │       │   ├── (003)_cp_v10_l1_p86_ts.csv
│   │       │   ├── (004)_cp_v10_l1_p83_ts.csv
│   │       │   ├── (005)_cp_v10_l1_p80_ts.csv
│   │       │   ├── (006)_cp_v10_l1_p77_ts.csv
│   │       │   ├── (007)_cp_v10_l1_p74_ts.csv
│   │       │   ├── (008)_cp_v10_l1_p71_ts.csv
│   │       │   ├── (009)_cp_v10_l1_p68_ts.csv
│   │       │   ├── (010)_cp_v10_l1_p65_ts.csv
│   │       │   ├── (011)_cp_v10_l2_p98_ts.csv
│   │       │   ├── (012)_cp_v10_l2_p95_ts.csv
│   │       │   ├── (013)_cp_v10_l2_p92_ts.csv
│   │       │   ├── (014)_cp_v10_l2_p89_ts.csv
│   │       │   ├── (015)_cp_v10_l2_p86_ts.csv
│   │       │   ├── (016)_cp_v10_l2_p83_ts.csv
│   │       │   ├── (017)_cp_v10_l2_p80_ts.csv
│   │       │   └── (018)_cp_v10_l2_p77_ts.csv
│   │       └── structure
│   │           ├── (000)_cp_v10_l1_p95_struct.pt
│   │           ├── (001)_cp_v10_l1_p92_struct.pt
│   │           ├── (002)_cp_v10_l1_p89_struct.pt
│   │           ├── (003)_cp_v10_l1_p86_struct.pt
│   │           ├── (004)_cp_v10_l1_p83_struct.pt
│   │           ├── (005)_cp_v10_l1_p80_struct.pt
│   │           ├── (006)_cp_v10_l1_p77_struct.pt
│   │           ├── (007)_cp_v10_l1_p74_struct.pt
│   │           ├── (008)_cp_v10_l1_p71_struct.pt
│   │           ├── (009)_cp_v10_l1_p68_struct.pt
│   │           ├── (010)_cp_v10_l1_p65_struct.pt
│   │           ├── (011)_cp_v10_l2_p98_struct.pt
│   │           ├── (012)_cp_v10_l2_p95_struct.pt
│   │           ├── (013)_cp_v10_l2_p92_struct.pt
│   │           ├── (014)_cp_v10_l2_p89_struct.pt
│   │           ├── (015)_cp_v10_l2_p86_struct.pt
│   │           ├── (016)_cp_v10_l2_p83_struct.pt
│   │           ├── (017)_cp_v10_l2_p80_struct.pt
│   │           └── (018)_cp_v10_l2_p77_struct.pt
│   ├── finance
│   │   ├── random-rels_20_1_3_returns30007000_header.csv
│   │   ├── random-rels_20_1A_returns30007000_header.csv
│   │   ├── random-rels_20_1B_returns30007000_header.csv
│   │   ├── random-rels_20_1C_returns30007000_header.csv
│   │   ├── random-rels_20_1D_returns30007000_header.csv
│   │   └── random-rels_20_1E_returns30007000_header.csv
│   ├── fMRI
│   │   ├── graphs
│   │   │   ├── sim19_gt_processed.csv
│   │   │   ├── sim20_gt_processed.csv
│   │   │   ├── sim5_gt_processed.csv
│   │   │   ├── sim6_gt_processed.csv
│   │   │   ├── sim7_gt_processed.csv
│   │   │   └── sim9_gt_processed.csv
│   │   └── timeseries
│   │       ├── timeseries19.csv
│   │       ├── timeseries20.csv
│   │       ├── timeseries5.csv
│   │       ├── timeseries6.csv
│   │       ├── timeseries7.csv
│   │       └── timeseries9.csv
│   ├── MvTS
│   │   ├── air_quality_mini
│   │   │   ├── air_quality_mini_boot_0.csv
│   │   │   ├── air_quality_mini_boot_1.csv
│   │   │   ├── air_quality_mini_boot_2.csv
│   │   │   ├── air_quality_mini_boot_3.csv
│   │   │   └── air_quality_mini_boot_4.csv
│   │   ├── AirQualityUCI
│   │   │   ├── AirQualityUCI_boot_0.csv
│   │   │   ├── AirQualityUCI_boot_1.csv
│   │   │   ├── AirQualityUCI_boot_2.csv
│   │   │   ├── AirQualityUCI_boot_3.csv
│   │   │   └── AirQualityUCI_boot_4.csv
│   │   ├── bike-usage
│   │   │   ├── bike-usage_boot_0.csv
│   │   │   ├── bike-usage_boot_1.csv
│   │   │   ├── bike-usage_boot_2.csv
│   │   │   ├── bike-usage_boot_3.csv
│   │   │   └── bike-usage_boot_5.csv
│   │   ├── ETTh1
│   │   │   ├── ETTh1_boot_0.csv
│   │   │   ├── ETTh1_boot_1.csv
│   │   │   ├── ETTh1_boot_2.csv
│   │   │   ├── ETTh1_boot_3.csv
│   │   │   └── ETTh1_boot_4.csv
│   │   ├── ETTm1
│   │   │   ├── ETTm1_boot_0.csv
│   │   │   ├── ETTm1_boot_1.csv
│   │   │   ├── ETTm1_boot_2.csv
│   │   │   ├── ETTm1_boot_3.csv
│   │   │   └── ETTm1_boot_4.csv
│   │   ├── outdoor
│   │   │   └── outdoor_original.csv
│   │   └── WTH
│   │       ├── WTH_boot_0.csv
│   │       ├── WTH_boot_1.csv
│   │       ├── WTH_boot_2.csv
│   │       ├── WTH_boot_3.csv
│   │       └── WTH_boot_4.csv
│   └── results
│       ├── dense_graph
│       │   ├── res_cp_vs_1.p
│       │   └── res_cp_vs_2.p
│       ├── figures
│       │   ├── sparsity_penalty_cp1.png
│       │   └── sparsity_penalty_cp1_short.png
│       ├── oracle_graph
│       │   ├── res_cp_just_1.p
│       │   ├── res_cp_ora_1.p
│       │   └── res_cp_vs_1.p
│       ├── sparsity_penalty
│       │   └── res_cp_vs_2.p
│       └── vs
│           ├── air_quality_mini_auc.json
│           ├── air_quality_mini_mmd.json
│           ├── AirQualityUCI_auc.json
│           ├── AirQualityUCI_mmd.json
│           ├── bike-usage_auc.json
│           ├── bike-usage_mmd.json
│           ├── cp_1_auc.json
│           ├── cp_1_mmd.json
│           ├── finance_auc.json
│           ├── finance_mmd.json
│           ├── fmri_auc.json
│           ├── fmri_mmd.json
│           ├── outdoor_auc.json
│           ├── outdoor_mmd.json
│           ├── WTH_auc.json
│           └── WTH_mmd.json
├── environment.yaml
├── LICENSE
├── README.md
└── requirements.txt
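
In data/cp_style/increasing_edges_cp_1, each time-series file under data/ has a matching graph file under structure/ that shares its prefix (for example (000)_cp_v10_l1_p95_ts.csv and (000)_cp_v10_l1_p95_struct.pt). The sketch below pairs them by that prefix. The function name is hypothetical, and the code stops at file paths: the .pt extension suggests a PyTorch-serialized object, but this README does not document the format or the meaning of the v, l, and p tags.

```python
import re
from pathlib import Path

# Matches names such as "(000)_cp_v10_l1_p95_ts.csv" or "..._struct.pt".
_NAME = re.compile(r"^\((\d+)\)_(.+)_(ts\.csv|struct\.pt)$")


def pair_series_with_graphs(root):
    """Return {index: (ts_path, struct_path)} for complete pairs under root."""
    found = {}
    for sub in ("data", "structure"):
        for path in sorted((Path(root) / sub).iterdir()):
            m = _NAME.match(path.name)
            if m is None:
                continue  # ignore files that do not follow the naming scheme
            idx, tag, kind = int(m.group(1)), m.group(2), m.group(3)
            # Group by index and tag so mismatched tags are never paired.
            found.setdefault((idx, tag), {})[kind] = path
    return {
        idx: (parts["ts.csv"], parts["struct.pt"])
        for (idx, tag), parts in sorted(found.items())
        if len(parts) == 2  # keep only entries with both files present
    }


# Example call:
# pairs = pair_series_with_graphs("data/cp_style/increasing_edges_cp_1")
# ts_path, struct_path = pairs[0]
```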

✨ Pretrained Weights

CP weights (optionally used in Phase 1 of TCS; see simulation_configs.py) are hosted outside GitHub due to size constraints, at the following Google Drive links:

📚 Citation

If the codebase has proven useful, please consider citing the following article:

@misc{gkorgkolis2025temporal,
      title={Temporal Causal-based Simulation for Realistic Time-series Generation}, 
      author={Nikolaos Gkorgkolis and Nikolaos Kougioulis and MingXue Wang and Bora Caglayan and Andrea Tonon and Dario Simionato and Ioannis Tsamardinos},
      year={2025},
      eprint={2506.02084},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2506.02084}, 
}

🥰 Contributing

Contributions are welcome! Feel free to:

  • Open issues for bugs, questions, or feature requests
  • Submit pull requests for improvements or new functionality

We follow standard GitHub practices for contributions; see our CONTRIBUTING file.