@inproceedings{murad2025wpmixer,
title={{WPMixer}: Efficient multi-resolution mixing for long-term time series forecasting},
author={Murad, Md Mahmuddun Nabi and Aktukmak, Mehmet and Yilmaz, Yasin},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={18},
pages={19581--19588},
year={2025}
}
- [Nov 2025] 🔥🔥🔥 Added full instructions for Optuna hyper-parameter tuning.
- [May 2025] Added scripts for hyperparameter tuning with Optuna under ./scripts/HyperParameter_Tuning/. These scripts search for optimal hyperparameter settings on the ETT datasets.
Follow these steps to get started with WPMixer:
Install Python 3.10 and the necessary dependencies.
pip install -r requirements.txt
Process-1:
Download the zip file of the datasets from the link.
Place the zip file in the root folder and extract it. You will then have a ./data/ folder containing all the datasets.
Or,
Process-2:
Download the data and place it in the ./data/ folder. All datasets are available from the public GitHub repos of Autoformer or TimeMixer; they are well pre-processed and ready to use. To place and rename the dataset files, follow this folder tree:
data
├── electricity
│   └── electricity.csv
├── ETT
│   ├── ETTh1.csv
│   ├── ETTh2.csv
│   ├── ETTm1.csv
│   └── ETTm2.csv
├── exchange_rate
│   └── exchange_rate.csv
├── illness
│   └── national_illness.csv
├── m4
├── solar
│   └── solar_AL.txt
├── traffic
│   └── traffic.csv
└── weather
    └── weather.csv
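Once the files are in place, a quick sanity check can confirm the layout before training. The sketch below is a minimal helper (not part of the repo); the path list simply mirrors the tree above:

```python
from pathlib import Path

# Expected dataset files, mirroring the folder tree above.
EXPECTED = [
    "electricity/electricity.csv",
    "ETT/ETTh1.csv", "ETT/ETTh2.csv", "ETT/ETTm1.csv", "ETT/ETTm2.csv",
    "exchange_rate/exchange_rate.csv",
    "illness/national_illness.csv",
    "solar/solar_AL.txt",
    "traffic/traffic.csv",
    "weather/weather.csv",
]

def missing_files(root="./data"):
    """Return the expected dataset files that are not present under root."""
    return [p for p in EXPECTED if not (Path(root) / p).exists()]

if __name__ == "__main__":
    missing = missing_files()
    print("All datasets found." if not missing else f"Missing: {missing}")
```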
We provide the experiment scripts for all benchmarks under the folder ./scripts/ to reproduce the results. Running these scripts with the following commands will generate logs in the ./logs/WPMixer/ folder.
bash ./scripts/Full_HyperSearch/ETTh1_full_hyp.sh
bash ./scripts/Full_HyperSearch/ETTh2_full_hyp.sh
bash ./scripts/Full_HyperSearch/ETTm1_full_hyp.sh
bash ./scripts/Full_HyperSearch/ETTm2_full_hyp.sh
bash ./scripts/Full_HyperSearch/Weather_full_hyp.sh
bash ./scripts/Full_HyperSearch/Electricity_full_hyp.sh
bash ./scripts/Full_HyperSearch/Traffic_full_hyp.sh
bash ./scripts/Unified/ETTh1_Unified_setup.sh
bash ./scripts/Unified/ETTh2_Unified_setup.sh
bash ./scripts/Unified/ETTm1_Unified_setup.sh
bash ./scripts/Unified/ETTm2_Unified_setup.sh
bash ./scripts/Unified/Weather_Unified_setup.sh
bash ./scripts/Unified/Electricity_Unified_setup.sh
bash ./scripts/Unified/Traffic_Unified_setup.sh
bash ./scripts/Univariate/ETTh1_univariate.sh
bash ./scripts/Univariate/ETTh2_univariate.sh
bash ./scripts/Univariate/ETTm1_univariate.sh
bash ./scripts/Univariate/ETTm2_univariate.sh
The following explains how to run the hyper-tuning scripts for WPMixer, how logs are organized, how to specify datasets, and how each parameter works.
WPMixer supports automatic hyper-parameter optimization using Optuna. You can run hyper-tuning for one or multiple datasets and automatically search over:
- Learning rates
- Sequence lengths
- Batch size
- Wavelet types
- Decomposition levels
- Patch lengths & strides
- Dropout choices
- Temporal expansion and embedding expansion factors
All results are logged under ./logs/WPMixer/.
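Conceptually, each `--optuna_*` flag defines one search dimension: the learning rate is drawn from a log-uniform interval between the two values given, while list-valued flags become categorical choices. The sketch below illustrates that mapping with plain random sampling; it is not the repo's actual Optuna objective, and the values are copied from the tuning script:

```python
import math
import random

# Search dimensions mirroring the --optuna_* flags of the tuning script.
SPACE = {
    "lr": ("loguniform", 1e-5, 1e-2),
    "batch_size": ("choice", [128]),
    "wavelet": ("choice", ["db2", "db3", "db5", "sym2", "sym3",
                           "sym4", "sym5", "coif4", "coif5"]),
    "seq_len": ("choice", [96, 192, 336]),
    "tfactor": ("choice", [3, 5, 7]),
    "dfactor": ("choice", [3, 5, 7, 8]),
    "dropout": ("choice", [0.0, 0.05, 0.1, 0.2, 0.4]),
    "patch_len": ("choice", [16]),
    "level": ("choice", [1, 2, 3]),
}

def sample_trial(space, rng=None):
    """Draw one hyper-parameter configuration from the search space."""
    rng = rng or random.Random()
    cfg = {}
    for name, (kind, *args) in space.items():
        if kind == "loguniform":
            lo, hi = args
            cfg[name] = math.exp(rng.uniform(math.log(lo), math.log(hi)))
        else:  # "choice": pick one candidate from the list
            cfg[name] = rng.choice(args[0])
    return cfg
```

Optuna additionally prunes bad trials and models the search space; this sketch only shows which flag feeds which dimension.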
The following script can also be found in ./scripts/HyperParameter_Tuning/ETT_optuna_unified.sh:
if [ ! -d "./logs" ]; then
mkdir ./logs
fi
if [ ! -d "./logs/WPMixer" ]; then
mkdir ./logs/WPMixer
fi
export CUDA_VISIBLE_DEVICES=0
# General
model_name=WPMixer
task_name=long_term_forecast
python -u main_run2.py \
--task_name $task_name \
--model $model_name \
--use_hyperParam_optim \
--datasets ETTh1 ETTh2 \
--pred_lens 192 \
--loss smoothL1 \
--use_amp \
--n_jobs 1 \
--optuna_lr 0.00001 0.01 \
--optuna_batch 128 \
--optuna_wavelet db2 db3 db5 sym2 sym3 sym4 sym5 coif4 coif5 \
--optuna_seq_len 96 192 336 \
--optuna_tfactor 3 5 7 \
--optuna_dfactor 3 5 7 8 \
--optuna_epochs 10 \
--optuna_dropout 0.0 0.05 0.1 0.2 0.4 \
--optuna_embedding_dropout 0.0 0.05 0.1 0.2 0.4 \
--optuna_patch_len 16 \
--optuna_stride 8 \
--optuna_lradj type3 \
--optuna_dmodel 128 256 \
--optuna_weight_decay 0.0 \
--optuna_patience 5 \
--optuna_level 1 2 3 \
--optuna_trial_num 200 >logs/WPMixer/ETTh_192_with_decomposition.log

This section explains every parameter used in the hyper-tuning script.
| Parameter | Description |
|---|---|
| `--task_name` | Task type (e.g., `long_term_forecast`). |
| `--model` | Model name (`WPMixer`). |
| `--loss` | Loss function (`smoothL1`, `MSE`). |
| `--use_amp` | Enables automatic mixed precision for speed. |
| `--datasets` | One or multiple datasets to tune together. |
| `--pred_lens` | Prediction horizon (e.g., 96). |
| `--n_jobs` | Number of parallel Optuna workers. |
| Parameter | Description |
|---|---|
| `--optuna_lr lr_min lr_max` | Learning-rate search interval from minimum to maximum LR. |
| `--optuna_batch` | Candidate batch sizes. |
| `--optuna_wavelet` | Wavelet families for multi-resolution decomposition; list several types to search for the best one. |
| `--optuna_seq_len` | Candidate input sequence lengths. |
| `--optuna_tfactor` | Candidate temporal mixing factors. |
| `--optuna_dfactor` | Candidate embedding mixing factors. |
| `--optuna_epochs` | Max epochs for each trial. |
| `--optuna_dropout` | Candidate dropout values. |
| `--optuna_embedding_dropout` | Candidate embedding dropout values. |
| `--optuna_patch_len` | Length of each patch. |
| `--optuna_stride` | Stride between patches. |
| `--optuna_lradj` | Learning-rate scheduler policy. |
| `--optuna_dmodel` | Candidate embedding dimensions. |
| `--optuna_weight_decay` | Weight decay coefficient for AdamW. |
| `--optuna_patience` | Early-stopping patience. |
| `--optuna_level` | Candidate wavelet decomposition levels. |
| `--optuna_trial_num` | Total number of Optuna trials. |
When selecting optuna_patch_len and optuna_level, ensure their compatibility with the effective sequence length at the deepest wavelet decomposition branch.
If you use `--optuna_level m`, then:
- the model uses $m$ detail-resolution branches and $1$ approximation branch;
- the minimum effective sequence length among the detail branches becomes optuna_seq_len / 2^m.

To avoid invalid patching, you must satisfy: (optuna_seq_len / 2^m) > optuna_patch_len
Example:
optuna_seq_len = 96
optuna_level = 3 → effective_seq_len = 96 / 8 = 12
optuna_patch_len = 16 → invalid (12 < 16)
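This compatibility check is easy to automate before launching a long search. A small helper (not part of the repo) that applies the constraint above:

```python
def valid_combo(seq_len, level, patch_len):
    """True when the deepest wavelet branch (length seq_len / 2^level)
    is still longer than one patch, per the constraint above."""
    return seq_len // (2 ** level) > patch_len

# Reproduces the example above: level 3 shrinks 96 down to 12, and 12 < 16.
assert not valid_combo(96, 3, 16)
assert valid_combo(336, 3, 16)  # 336 / 8 = 42 > 16
```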
Time series forecasting is crucial for various applications, such as weather forecasting, power load forecasting, and financial analysis. Recent studies have shown MLP-mixer models to be a promising alternative to transformer-based models for time series forecasting. However, the performance of these models has yet to reach its full potential. In this paper, we propose Wavelet Patch Mixer (WPMixer), a novel MLP-based model for long-term time series forecasting that leverages the benefits of patching, multi-resolution wavelet decomposition, and mixing. Our model is based on three key components: (i) multi-resolution wavelet decomposition, (ii) patching and embedding, and (iii) MLP mixing. Multi-resolution wavelet decomposition efficiently extracts information in both the frequency and time domains. Patching allows the model to capture an extended history with a look-back window and enhances the capture of local information, while MLP mixing incorporates global information. Our model significantly outperforms state-of-the-art MLP-based and transformer-based models for long-term time series forecasting in a computationally efficient way, demonstrating its efficacy and potential for practical applications.



