TSFM-Bench: A COMPREHENSIVE AND UNIFIED BENCHMARKING OF FOUNDATION MODELS FOR TIME SERIES FORECASTING
Time Series Forecasting (TSF) is a key capability in numerous fields, such as financial investment, weather services, and energy management. Although increasingly capable TSF methods continue to emerge, many of them require domain-specific data collection and model training and do not generalize well when applied in other domains. Time Series Foundation Models (TSFMs) that are pre-trained on massive heterogeneous time series data aim to overcome these limitations, and the prospects for generalizability have spurred the development of a new generation of TSFMs. This study proposes a benchmark, TSFM-Bench, to facilitate comprehensive and unified evaluation of TSFMs. TSFM-Bench covers a wide range of TSFMs, including those based on large language models and those pre-trained on time series data. TSFM-Bench supports multiple forecasting scenarios, including zero-shot, few-shot, and full-shot, enabling assessment across the full range of adaptation strategies. TSFM-Bench also provides standardized experimental protocols for critical evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, facilitating consistency and fairness. We report on an extensive evaluation of TSFMs across a diverse range of datasets spanning multiple domains and exhibiting varied statistical characteristics. Specifically, we identify the pros, cons, and inherent limitations of existing TSFMs, and we propose potential directions for new model designs. TSFM-Bench is available at https://github.com/decisionintelligence/TSFM-Bench.
- Clone repository:

  ```shell
  git clone git@github.com:decisionintelligence/TSFM-Bench.git
  cd TSFM-Bench
  ```
- Create virtual environment:

  ```shell
  conda create -n "TSFM-Bench" python=3.10
  conda activate TSFM-Bench
  pip install -r requirements.txt
  ```
- You can obtain the well pre-processed datasets from Google Drive. Create a separate folder named `./dataset` and place the datasets there.
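  A minimal sketch of the assumed layout (hypothetical; the exact file names depend on which datasets you download, with `ETTh1.csv` being the one used in the example commands below):

  ```shell
  # Create the dataset folder at the repository root and inspect it.
  # Assumed layout after extracting the downloaded archive:
  #   TSFM-Bench/
  #   └── dataset/
  #       ├── ETTh1.csv
  #       └── ...
  mkdir -p ./dataset
  ls ./dataset
  ```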
- We provide checkpoints for the baseline models used in the paper. Please download the checkpoints from Google Drive.
- You can also download the checkpoints from the following links. Please place the `checkpoint_llm` folder under `./ts_benchmark/baselines/LLM/` and rename it to `checkpoints`, and place the `checkpoint_pretrain` folder under `./ts_benchmark/baselines/pre_train/` and rename it to `checkpoints`:

  | Model         | Link         |
  | ------------- | ------------ |
  | Chronos       | Huggingface  |
  | TimesFM       | Huggingface  |
  | Timer         | Google Drive |
  | UniTS         | Github       |
  | TinyTimeMixer | Huggingface  |
  | Moment        | Huggingface  |
  | MOIRAI        | Huggingface  |
  | GPT-2         | Huggingface  |
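  For instance, if both folders were downloaded to `~/Downloads` (a hypothetical location; adjust to wherever you saved them), the placement and renaming above can be done as:

  ```shell
  # Move the downloaded folders into place and rename them to "checkpoints".
  # ~/Downloads is a hypothetical source path; substitute your own.
  mv ~/Downloads/checkpoint_llm ./ts_benchmark/baselines/LLM/checkpoints
  mv ~/Downloads/checkpoint_pretrain ./ts_benchmark/baselines/pre_train/checkpoints
  ```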
- Some models have additional requirements: when you want to test CALF, please refer to the link; when you want to test AutoTimes, please refer to the link.
- We provide the experiment scripts for all models under the folder `./scripts`. For example, you can reproduce an experiment result as follows:

  ```shell
  # Zero-Shot
  sh ./scripts/pre_train_model/zero_shot/ETTh1_scripts/TTM.sh
  ```
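  Scripts for the other scenarios follow the same pattern; assuming the few-shot scripts mirror the zero-shot layout (an assumption, so verify the actual paths under `./scripts`), a few-shot run of the same model might look like:

  ```shell
  # Hypothetical path, assuming a layout parallel to the zero-shot scripts.
  sh ./scripts/pre_train_model/few_shot/ETTh1_scripts/TTM.sh
  ```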
- When you want to write your own script, please pay attention to changing the following values to meet different testing requirements: `is_train`, `sampling_rate`, `sampling_basis`, and `sampling_strategy`.

  ```shell
  # "is_train" = 1, 0
  # "sampling_rate" = 0.05 (0~1)
  # "sampling_basis" = "sample", "data"
  # "sampling_strategy" = "uniform", "random", "begin", "end"
  python ./scripts/run.py --config-path "rolling_forecast_config.json" \
    --data-name-list "ETTh1.csv" \
    --strategy-args '{"horizon":96}' \
    --model-name "pre_train.UniTS" \
    --model-hyper-params '{"horizon": 96, "seq_len": 512, "target_dim": 7, "dataset": "etth1", "is_train": 1, "freq": "h", "sampling_rate": 0.05, "sampling_strategy": "uniform", "sampling_basis": "sample"}' \
    --adapter "PreTrain_adapter" \
    --gpus 0 --num-workers 1 --timeout 60000 --save-path "TEST"
  ```