A backup of the pymfe expansion for time-series data. Currently, this repository contains the meta-feature extraction methods and a modified pymfe core to extract them.
Please note that tspymfe is not intended to be a stand-alone package and will be officially merged (hopefully soon) into the original pymfe package. Until then, this package is available as a beta version.
There are 149 distinct meta-feature extraction methods in this version, distributed in the following groups (a sketch of restricting extraction to a subset of groups is shown after the list):
- General
- Local statistics
- Global statistics
- Statistical tests
- Autocorrelation
- Frequency domain
- Information theory
- Randomize
- Landmarking
- Model based
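If the constructor mirrors the original pymfe API (an assumption here; the exact group identifiers accepted can be checked with TSMFE.valid_groups(), shown near the end of this README), extraction can be restricted to a subset of these groups:
import numpy as np

import tspymfe.tsmfe

# Hypothetical group names; check TSMFE.valid_groups() for the exact
# identifiers accepted by this version.
extractor = tspymfe.tsmfe.TSMFE(groups=["general", "autocorr"])

ts = 0.3 * np.arange(100) + np.random.randn(100)
extractor.fit(ts)
print(extractor.extract())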
Install from pip:
pip install -U tspymfe
or:
python3 -m pip install -U tspymfe
To extract the meta-features, the API behaves much like the original pymfe API:
import tspymfe.tsmfe
import numpy as np
# synthetic time series: linear trend plus Gaussian noise
ts = 0.3 * np.arange(100) + np.random.randn(100)
extractor = tspymfe.tsmfe.TSMFE()
extractor.fit(ts)
res = extractor.extract()
print(res)
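As in the original pymfe, extract() is expected to return a pair with the meta-feature names and their values (an assumption kept here), so the result can also be printed pairwise:
# extract() is assumed to return (names, values), as in the original pymfe.
names, values = res
for name, value in zip(names, values):
    print(f"{name}: {value}")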
If you downloaded directly from GitHub, install the required packages using:
pip install -Ur requirements.txt
You can run some test scripts:
python test_a.py <data_id> <random_seed> <precomp 0/1>
python test_b.py <data_id> <random_seed> <precomp 0/1>
Where the first argument is the id of the test time series (see the data/comp-engine-export-sample.20200503.csv file) and must be between 0 and 19 (both inclusive), the random seed must be an integer, and precomp is a boolean flag ('0' or '1') that enables the precomputation methods, which calculate values shared by several extraction methods and therefore speed up the main computation.
Example:
python test_a.py 0 16 1
python test_b.py 0 16 1
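Assuming the fit() signature mirrors the original pymfe's MFE.fit() (an assumption; the parameter name below is hypothetical for tspymfe), precomputation can also be disabled directly in the API:
import numpy as np

import tspymfe.tsmfe

# precomp_groups is assumed to exist in TSMFE.fit(), mirroring pymfe's
# MFE.fit(); passing None is assumed to skip the shared precomputations.
ts = 0.3 * np.arange(100) + np.random.randn(100)
extractor = tspymfe.tsmfe.TSMFE()
extractor.fit(ts, precomp_groups=None)
print(extractor.extract())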
The code style is checked using flake8, pylint, and mypy. You can use the Makefile to run all verifications yourself:
pip install -Ur requirements-dev.txt
make code-check
Below I present the full list of available meta-features in this package, separated by meta-feature group. Also note that you can use the following methods to list the available meta-features, groups, and summary functions:
import tspymfe.tsmfe
groups = tspymfe.tsmfe.TSMFE.valid_groups()
print(groups)
metafeatures = tspymfe.tsmfe.TSMFE.valid_metafeatures()
print(metafeatures)
summaries = tspymfe.tsmfe.TSMFE.valid_summary()
print(summaries)
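Assuming the summary functions are selected the same way as in the original pymfe (an assumption), a custom set of summaries can be passed to the constructor:
import numpy as np

import tspymfe.tsmfe

# Hypothetical summary names; check TSMFE.valid_summary() (above) for the
# identifiers available in this version.
extractor = tspymfe.tsmfe.TSMFE(summary=["mean", "sd"])

ts = 0.3 * np.arange(100) + np.random.randn(100)
extractor.fit(ts)
print(extractor.extract())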
- model_arima_010_c
- model_arima_011_c
- model_arima_011_nc
- model_arima_021_c
- model_arima_100_c
- model_arima_110_c
- model_arima_112_nc
- model_exp
- model_gaussian
- model_hwes_ada
- model_hwes_adm
- model_linear
- model_linear_acf_first_nonpos
- model_linear_embed
- model_linear_seasonal
- model_loc_mean
- model_loc_median
- model_mean
- model_mean_acf_first_nonpos
- model_naive
- model_naive_drift
- model_naive_seasonal
- model_ses
- model_sine
- bin_mean
- cao_e1
- cao_e2
- diff
- emb_dim_cao
- emb_lag
- embed_in_shell
- fnn_prop
- force_potential
- frac_cp
- fs_len
- length
- moving_threshold
- peak_frac
- period
- pred
- step_changes
- step_changes_trend
- stick_angles
- trough_frac
- turning_points
- turning_points_trend
- walker_cross_frac
- walker_path
- corr_dim
- dfa
- exp_hurst
- exp_max_lyap
- ioe_tdelta_mean
- kurtosis_diff
- kurtosis_residuals
- kurtosis_sdiff
- opt_boxcox_coef
- sd_diff
- sd_residuals
- sd_sdiff
- season_strenght
- skewness_diff
- skewness_residuals
- skewness_sdiff
- spikiness
- t_mean
- trend_strenght
- local_extrema
- local_range
- lumpiness
- moving_acf
- moving_acf_shift
- moving_approx_ent
- moving_avg
- moving_avg_shift
- moving_gmean
- moving_gmean_shift
- moving_kldiv
- moving_kldiv_shift
- moving_kurtosis
- moving_kurtosis_shift
- moving_lilliefors
- moving_sd
- moving_sd_shift
- moving_skewness
- moving_skewness_shift
- moving_var
- moving_var_shift
- stability
- avg_cycle_period
- curvature
- des_level
- des_trend
- ets_level
- ets_season
- ets_trend
- gaussian_r_sqr
- ioe_std_adj_r_sqr
- ioe_std_slope
- linearity
- low_freq_power
- ps_entropy
- ps_freqs
- ps_peaks
- ps_residuals
- test_adf
- test_adf_gls
- test_dw
- test_earch
- test_kpss
- test_lb
- test_lilliefors
- test_pp
- test_za
- acf
- acf_detrended
- acf_diff
- acf_first_nonpos
- acf_first_nonsig
- autocorr_crit_pt
- autocorr_out_dist
- first_acf_locmin
- gen_autocorr
- gresid_autocorr
- gresid_lbtest
- pacf
- pacf_detrended
- pacf_diff
- tc3
- trev
- itrand_acf
- itrand_mean
- itrand_sd
- resample_first_acf_locmin
- resample_first_acf_nonpos
- resample_std
- surr_tc3
- surr_trev
- ami
- ami_curvature
- ami_detrended
- ami_first_critpt
- approx_entropy
- control_entropy
- hist_ent_out_diff
- hist_entropy
- lz_complexity
- sample_entropy
- surprise
- T.S. Talagala, R.J. Hyndman, and G. Athanasopoulos. Meta-learning how to forecast time series (2018).
- Y. Kang, R.J. Hyndman, and K. Smith-Miles. Visualising forecasting algorithm performance using time series instance spaces (Department of Econometrics and Business Statistics Working Paper Series 10/16, 2016).
- C. Lemke and B. Gabrys. Meta-learning for time series forecasting and forecast combination (Neurocomputing, Volume 73, Issues 10-12, June 2010, Pages 2006-2016).
- B.D. Fulcher and N.S. Jones. hctsa: A computational framework for automated time-series phenotyping using massive feature extraction. Cell Systems 5, 527 (2017).
- B.D. Fulcher, M.A. Little, and N.S. Jones. Highly comparative time-series analysis: the empirical structure of time series and their methods. J. Roy. Soc. Interface 10, 83 (2013).
Data sampled from: https://comp-engine.org/