This is the official PyTorch implementation of the paper *FlexTSF: A Flexible Forecasting Model for Time Series with Variable Regularities*.
Recently, there has been significant research on universal time series forecasting models, which can be applied directly to various domains after pre-training. Alongside these broader applications, however, a key challenge arises: temporal irregularity, i.e., missing values, uneven time intervals, and variable sequence lengths. We illustrate temporal irregularity in the following figure: (a) shows regularly sampled data; (b) depicts missing data due to events such as holidays; (c) presents blood pressure measurements that become denser as a patient's condition worsens; (d) shows irregular satellite observations influenced by atmospheric conditions such as clouds and fog.
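In code, such irregular series cannot be stored as plain fixed-step arrays. A common representation, which we assume in the illustrative snippets below, pairs each observation with its timestamp and uses a mask to mark missing values:

```python
import numpy as np

# A hypothetical irregular series with two variables: observation times are
# uneven, and the mask records which variables were actually measured.
times = np.array([0.0, 0.4, 0.5, 2.1, 2.2])   # uneven time intervals
values = np.array([[1.2, 0.0],
                   [1.3, 7.1],
                   [0.0, 7.4],
                   [1.9, 0.0],
                   [2.0, 8.0]])
mask = np.array([[1, 0],                       # 1 = observed, 0 = missing
                 [1, 1],
                 [0, 1],
                 [1, 0],
                 [1, 1]])
```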
We propose FlexTSF, the first universal forecasting model built from the perspective of breaking data regularity constraints. FlexTSF not only performs well on data with temporal irregularity but is also broadly applicable across domains with various temporal granularities. As shown in the following figure, FlexTSF employs a decoder-only architecture, where time series input data is organized into patches. Previously observed patches attend to the generation of future patches, which are then transformed into forecasts. Built on this backbone, FlexTSF introduces a novel patching module and a domain self-adaptation mechanism.
Specifically:
- We introduce IVP Patcher, a continuous-time patching module, to handle irregular time series, overcoming limitations of traditional fixed-size patching methods.
- We propose a timestamp normalization scheme and a forefront computing node, enabling domain-aware adaptation and improving cross-domain generalization.
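FlexTSF's exact normalization is defined in the source code; as a rough, hypothetical sketch, one way to make timestamps comparable across domains with different granularities is to rescale each sequence's time axis by its own typical sampling gap:

```python
import numpy as np

def normalize_timestamps(times: np.ndarray) -> np.ndarray:
    """Hypothetical sketch, not FlexTSF's actual scheme: shift times to start
    at 0 and rescale by the median inter-observation gap, so hourly and daily
    series end up on comparable scales."""
    t = times - times[0]                # every sequence starts at time 0
    gaps = np.diff(t)
    scale = np.median(gaps) if gaps.size and np.median(gaps) > 0 else 1.0
    return t / scale
```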
Our evaluation on 16 benchmark datasets demonstrates that FlexTSF achieves the lowest MSE in 22 of 24 irregular-forecasting tasks (8 datasets × 3 horizons), exhibits better robustness across varying missing rates, and significantly outperforms state-of-the-art baselines in the zero-shot setting. Ablations further confirm the contribution of each component and the benefits of using random patch lengths.
FlexTSF has been tested with Python 3.10 using the Conda environment management tool.
To ensure consistent library versions, you can install the required dependencies for this project by running the following command:
```bash
conda env create -f environment.yml
```
Because some libraries are updated frequently, we have included pinned versions of two dependencies (torchdiffeq and stribor) in the "libs" folder to ensure the code runs successfully.
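The bundled torchdiffeq provides initial value problem (IVP) solvers of the kind the IVP Patcher builds on. As a quick, standalone sanity check that the environment works (a minimal example, not part of FlexTSF):

```python
import torch
from torchdiffeq import odeint

def f(t, y):
    # dy/dt = -y has the closed-form solution y(t) = y0 * exp(-t)
    return -y

y0 = torch.tensor([1.0])
t = torch.linspace(0.0, 1.0, 5)
y = odeint(f, y0, t)        # solve the IVP on the time grid t
print(y.squeeze()[-1])      # ≈ exp(-1) ≈ 0.3679
```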
We conduct three stages of experiments using two non-overlapping groups of datasets: the pre-training datasets $\mathcal{D}_{p}$ and the held-out datasets $\mathcal{D}_{h}$.
- Obtain: Our pre-training dataset group $\mathcal{D}_{p}$ consists of datasets from the Monash Time Series Forecasting Archive and the UCR & UEA Time Series Classification Archive.
- Preprocess: The preprocessing programs can be found in the folder "preprocess/pre_monash_tsc".
After processing,
- Obtain: They are from the Long Time Series Forecasting Benchmark.
- Preprocess: The illness dataset was extended (preprocess/pre_ltf) so that it supports the same input/output lengths as the other datasets. No preprocessing is required for the others; the data can be read directly by the function in the file "experiments/data_ltf.py".
- Obtain: This dataset can be downloaded from https://zenodo.org/records/5712933.
- Preprocess: The preprocessing programs can be found in the folder "preprocess/pre_satsm".
- Obtain: This dataset can be downloaded from https://github.com/liyaguang/DCRNN.
- Preprocess: The preprocessing programs can be found in the folder "preprocess/pre_traffic".
- Obtain: They are from the UEA Time Series Classification Archive and have been removed from the pre-training set to ensure no overlap between the pre-training and held-out sets.
- Preprocess: No preprocessing is required; the data can be read directly by the function in the file "experiments/data_ucruea.py".
- Obtain: We used the dataset Localization Data for Person Activity.
- Preprocess: The data downloading and reading programs can be found in the file "experiments/data_harw4imu.py".
- Obtain: We used eICU v2.0, which can be downloaded from https://physionet.org/content/eicu-crd/2.0/.
- Preprocess: The preprocessing programs can be found in the folder "preprocess/pre_eicu". They were adapted from a previous work.
- Obtain: We used PhysioNet 2012 v1.0, which can be downloaded from https://physionet.org/content/challenge-2012/1.0.0/.
- Preprocess: The automatic downloading and preprocessing code is in the file "experiments/data_physionet12.py", which was built upon a previous program.
- Obtain: We used MIMIC-IV v1.0, which can be downloaded from https://physionet.org/content/mimiciv/1.0/.
- Preprocess: The preprocessing programs can be found in the folder "preprocess/pre_mimic4".
In the first stage, we perform classic training-validation-testing experiments to demonstrate the effectiveness of FlexTSF. Next, we pre-train FlexTSF, resulting in a model with 63 million parameters. This model is initially used for zero-shot forecasting to evaluate its potential as a universal model, and is then fine-tuned for time series forecasting to assess its adaptability to new domains in few-shot scenarios.
Each dataset in $\mathcal{D}_{h}$ is split into training, validation, and test sets.
For regular datasets, we use a fixed input length of 96 and a forecasting horizon of 96. For irregular datasets, it is impractical to define fixed input and output lengths across all cases. Instead, we adopt a forecast-to-input ratio and vary it across {0.25, 0.5, 1.0} to evaluate performance under different forecasting horizons.
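Under this setting, a ratio of 0.25 over a series of 100 observations uses the first 80 points as input and the last 20 as the forecast target. A sketch of this split (our reading of the ratio; the repository's actual data pipeline may differ):

```python
import numpy as np

def split_by_ratio(times, values, ratio=0.25):
    """Split one irregular series so that len(target) / len(input) ≈ ratio.
    Hypothetical helper for illustration, not taken from the repository."""
    n = len(times)
    n_fore = int(round(n * ratio / (1.0 + ratio)))   # e.g. 20 of 100 at 0.25
    n_in = n - n_fore
    return (times[:n_in], values[:n_in]), (times[n_in:], values[n_in:])
```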
You can use the following commands to run the program. VS Code users can also check out the file .vscode/launch.json, which may be more convenient for trying out the programs.
Run FlexTSF on a specific dataset:
```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --patch_seg random --data_name eICU
```
Run FlexTSF on all irregular datasets:
```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --patch_seg random --data_group irregular
```
Run FlexTSF on all regular datasets with a 20% missing rate:

```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --patch_seg random --data_group regular --ddr 0.2
```
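Here `--ddr 0.2` corresponds to the 20% missing rate. Conceptually, injecting missingness into regular data amounts to randomly masking a fraction of the observed values (an illustrative sketch, not the repository's exact procedure):

```python
import numpy as np

def apply_missing_rate(values, mask, rate=0.2, seed=0):
    """Illustrative sketch: flip `rate` of the currently observed entries to
    missing. Hypothetical helper, not the actual code behind --ddr."""
    rng = np.random.default_rng(seed)
    drop = (rng.random(mask.shape) < rate) & (mask == 1)
    new_values = np.where(drop, 0.0, values)   # zero out dropped values
    new_mask = np.where(drop, 0, mask)
    return new_values, new_mask
```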
Pre-train FlexTSF:
```bash
python main.py --base_model flextsf --data_name monash --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --ml_task uni_pretrain --value_norm --time_norm --weight_decay 0.1 --epochs_max 20 --dev_mode run
```
Deploy pre-trained FlexTSF in zero-shot settings:
```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --train_setting zero --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --pre_random_seed 1 --zeroshot_epoch 5 --fore_len 0.25
```
- `--zeroshot_epoch 5`: we use the model that has been pre-trained for 6 epochs (checkpoint index 5, counting from 0).
Deploy pre-trained FlexTSF in the few-shot setting with 50 fine-tuning samples:
```bash
python main.py --base_model flextsf --model_type reconstruct --ml_task forecast --value_norm --time_norm --train_setting few --pre_model {patch_ckpt} --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --pre_random_seed 1 --few_shot_config 50 --fore_len 0.25
```
- `{patch_ckpt}`: the path of the pre-trained checkpoint. We used the model that had been trained for 20 epochs.
Pre-train the model without the IVP Patcher (`--patch_module none` with a patch length of 1):

```bash
python main.py --base_model flextsf --data_name monash --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --ml_task uni_pretrain --value_norm --time_norm --patch_module none --patch_len_pretrain 1 --batch_size 16 --weight_decay 0.1 --epochs_max 20
```
Run zero-shot experiments:

```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --train_setting zero --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --patch_module none --patch_seg given --patch_len 1 --pre_random_seed 1 --zeroshot_epoch 5 --fore_len 0.25
```
Pre-train the model without timestamp normalization (omitting `--time_norm`):

```bash
python main.py --base_model flextsf --data_name monash --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --ml_task uni_pretrain --value_norm --weight_decay 0.1 --epochs_max 20
```
Run zero-shot experiments:

```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --train_setting zero --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --pre_random_seed 1 --zeroshot_epoch 5 --fore_len 0.25
```
Pre-train the model with the `--leader_node` variant:

```bash
python main.py --base_model flextsf --data_name monash --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --ml_task uni_pretrain --value_norm --time_norm --leader_node --weight_decay 0.1 --epochs_max 20
```
Run zero-shot experiments:

```bash
python main.py --base_model flextsf --ml_task forecast --value_norm --time_norm --train_setting zero --attn_layers 6 --nhead 12 --dim_attn_internal 768 --dim_patch_ts 768 --dim_ivp_hidden 768 --pre_random_seed 1 --leader_node --zeroshot_epoch 5 --fore_len 0.25
```