Skip to content

Calculates various features from time series data. Python implementation of the R package tsfeatures.

License

Notifications You must be signed in to change notification settings

Nixtla/tsfeatures

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Build PyPI version fury.io Downloads Python 3.6+ License: MIT

tsfeatures

Calculates various features from time series data. Python implementation of the R package tsfeatures.

Installation

You can install the released version of tsfeatures from the Python package index with:

pip install tsfeatures

Usage

The tsfeatures main function calculates by default the features used by Montero-Manso, Talagala, Hyndman and Athanasopoulos in their implementation of the FFORMA model.

from tsfeatures import tsfeatures

This function receives a panel pandas df with columns unique_id, ds, y and optionally the frequency of the data.

tsfeatures(panel, freq=7)

By default (freq=None) the function will try to infer the frequency of each time series (using infer_freq from pandas on the ds column) and assign a seasonal period according to the built-in dictionary FREQS:

FREQS = {'H': 24, 'D': 1,
         'M': 12, 'Q': 4,
         'W':1, 'Y': 1}

You can use your own dictionary using the dict_freqs argument:

tsfeatures(panel, dict_freqs={'D': 7, 'W': 52})

List of available features

Features
acf_features heterogeneity series_length
arch_stat holt_parameters sparsity
count_entropy hurst stability
crossing_points hw_parameters stl_features
entropy intervals unitroot_kpss
flat_spots lumpiness unitroot_pp
frequency nonlinearity
guerrero pacf_features

See the docs for a description of the features. To use a particular feature included in the package you need to import it:

from tsfeatures import acf_features

tsfeatures(panel, freq=7, features=[acf_features])

You can also define your own function and use it together with the included features:

def number_zeros(x, freq):

    number = (x == 0).sum()
    return {'number_zeros': number}

tsfeatures(panel, freq=7, features=[acf_features, number_zeros])

tsfeatures can handle functions that receives a numpy array x and a frequency freq (this parameter is needed even if you don't use it) and returns a dictionary with the feature name as a key and its value.

R implementation

You can use this package to call tsfeatures from R inside python (you need to have installed R, the packages forecast and tsfeatures; also the python package rpy2):

from tsfeatures.tsfeatures_r import tsfeatures_r

tsfeatures_r(panel, freq=7, features=["acf_features"])

Observe that this function receives a list of strings instead of a list of functions.

Comparison with the R implementation (sum of absolute differences)

Non-seasonal data (100 Daily M4 time series)

feature diff feature diff feature diff feature diff
e_acf10 0 e_acf1 0 diff2_acf1 0 alpha 3.2
seasonal_period 0 spike 0 diff1_acf10 0 arch_acf 3.3
nperiods 0 curvature 0 x_acf1 0 beta 4.04
linearity 0 crossing_points 0 nonlinearity 0 garch_r2 4.74
hw_gamma 0 lumpiness 0 diff2x_pacf5 0 hurst 5.45
hw_beta 0 diff1x_pacf5 0 unitroot_kpss 0 garch_acf 5.53
hw_alpha 0 diff1_acf10 0 x_pacf5 0 entropy 11.65
trend 0 arch_lm 0 x_acf10 0
flat_spots 0 diff1_acf1 0 unitroot_pp 0
series_length 0 stability 0 arch_r2 1.37

To replicate this results use:

python -m tsfeatures.compare_with_r --results_directory /some/path
                                    --dataset_name Daily --num_obs 100

Sesonal data (100 Hourly M4 time series)

feature diff feature diff feature diff feature diff
series_length 0 seas_acf1 0 trend 2.28 hurst 26.02
flat_spots 0 x_acf1 0 arch_r2 2.29 hw_beta 32.39
nperiods 0 unitroot_kpss 0 alpha 2.52 trough 35
crossing_points 0 nonlinearity 0 beta 3.67 peak 69
seasonal_period 0 diff1_acf10 0 linearity 3.97
lumpiness 0 x_acf10 0 curvature 4.8
stability 0 seas_pacf 0 e_acf10 7.05
arch_lm 0 unitroot_pp 0 garch_r2 7.32
diff2_acf1 0 spike 0 hw_gamma 7.32
diff2_acf10 0 seasonal_strength 0.79 hw_alpha 7.47
diff1_acf1 0 e_acf1 1.67 garch_acf 7.53
diff2x_pacf5 0 arch_acf 2.18 entropy 9.45

To replicate this results use:

python -m tsfeatures.compare_with_r --results_directory /some/path \
                                    --dataset_name Hourly --num_obs 100

Authors