Python package for automatically constructing features from multiple time series
Install the latest release using pip:
pip install tsfuse
The example below shows the basic usage of TSFuse.
The input of TSFuse is a dataset where each instance is a window that consists of multiple time series and a label.
Time series are represented using a dictionary where each entry represents a univariate or multivariate time series. As an example, let's create a dictionary with two univariate time series:
from pandas import DataFrame
from tsfuse.data import Collection
X = {
    "x1": Collection(DataFrame({
        "id":   [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
        "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
        "data": [1, 2, 3, 1, 2, 3, 3, 2, 1, 3, 2, 1],
    })),
    "x2": Collection(DataFrame({
        "id":   [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
        "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
        "data": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
    })),
}
The two univariate time series are named x1 and x2, and each series is represented as a Collection object. Each Collection is initialized with a DataFrame that has three columns:
id, which is the identifier of each instance, i.e., each window,
time, which contains the time stamps,
data, which contains the time series data itself.
For multivariate time series data, there can be multiple columns in place of the single data column. For example, the data of a tri-axial accelerometer would have three columns x, y, z instead of data, as it simultaneously measures the x, y, and z acceleration.
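As a minimal sketch of the multivariate layout described above, the frame below holds two windows of a tri-axial accelerometer signal. The sensor values are invented for illustration and only pandas is used; wrapping the frame in a Collection is assumed to work the same way as for the univariate series.

```python
from pandas import DataFrame

# Two windows (id 0 and 1) with three time stamps each.
# The x, y, z columns take the place of the single "data" column.
# All values are made up for illustration.
acc_frame = DataFrame({
    "id":   [0, 0, 0, 1, 1, 1],
    "time": [0, 1, 2, 0, 1, 2],
    "x":    [0.1, 0.2, 0.3, 0.0, 0.1, 0.0],
    "y":    [9.8, 9.7, 9.8, 9.6, 9.8, 9.9],
    "z":    [0.0, 0.1, 0.0, 0.2, 0.1, 0.2],
})

# In this frame, every column other than "id" and "time" is one channel.
channels = [c for c in acc_frame.columns if c not in ("id", "time")]
print(channels)  # ['x', 'y', 'z']
```

Such a frame would then become one entry of the time series dictionary, for instance under a hypothetical key "acc", wrapped in a Collection like x1 and x2.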
There should be one target value for each window, so we create a Series where the index contains all unique id values of the time series data and the data consists of the labels:
from pandas import Series
y = Series(index=[0, 1, 2, 3], data=[0, 0, 1, 1])
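Because the label index has to line up with the window identifiers, a quick consistency check before feature construction can catch mismatches early. The helper below is a hypothetical sketch that uses only pandas and is not part of TSFuse; it compares the unique id values of a raw DataFrame with the index of y.

```python
from pandas import DataFrame, Series

def labels_match_windows(frame: DataFrame, y: Series) -> bool:
    """Return True if y has exactly one label per window id in frame."""
    window_ids = set(frame["id"].tolist())
    # The index must not repeat an id, and must cover every window exactly.
    return bool(y.index.is_unique) and set(y.index.tolist()) == window_ids

# Same raw data as the x1 series above, before wrapping it in a Collection.
frame = DataFrame({
    "id":   [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
    "data": [1, 2, 3, 1, 2, 3, 3, 2, 1, 3, 2, 1],
})
y = Series(index=[0, 1, 2, 3], data=[0, 0, 1, 1])

ok = labels_match_windows(frame, y)               # all four windows labeled
missing = labels_match_windows(frame, y.drop(3))  # label for window 3 removed
```

Here ok is True and missing is False, since dropping the label of window 3 leaves one window without a target value.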
To construct features, TSFuse provides a construct function which takes time series data X and target data y as input, and returns a DataFrame where each column corresponds to a feature. In addition, this function can return a computation graph which contains all transformation steps required to compute the features for new data:
from tsfuse import construct
features, graph = construct(X, y, return_graph=True)
To apply this computation graph to new data, call transform with a time series dictionary X as input:
features = graph.transform(X)
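One common way to obtain new data is to hold out some windows before constructing features. The sketch below splits a raw DataFrame by window id using pandas only. split_by_id is a hypothetical helper, not a TSFuse function, and the commented lines at the end only indicate where the construct and transform calls shown above would fit, assuming each part is wrapped in a Collection as before.

```python
from pandas import DataFrame

def split_by_id(frame: DataFrame, test_ids):
    """Split a long-format frame into (train, test) parts by window id."""
    is_test = frame["id"].isin(test_ids)
    train = frame[~is_test].reset_index(drop=True)
    test = frame[is_test].reset_index(drop=True)
    return train, test

# Same raw data as the x1 series above, before wrapping it in a Collection.
frame = DataFrame({
    "id":   [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "time": [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
    "data": [1, 2, 3, 1, 2, 3, 3, 2, 1, 3, 2, 1],
})

# Hold out windows 1 and 3; windows 0 and 2 remain for feature construction.
train, test = split_by_id(frame, test_ids=[1, 3])

# With TSFuse installed, the two parts would be used roughly as follows,
# where y_train (an assumed name) holds the labels of windows 0 and 2 only:
#   features_train, graph = construct({"x1": Collection(train)}, y_train, return_graph=True)
#   features_test = graph.transform({"x1": Collection(test)})
```

Splitting on id rather than on rows keeps every window intact, so no window contributes time steps to both parts.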
The documentation is available at https://arnedb.github.io/tsfuse/
If you use TSFuse for a scientific publication, please consider citing this paper:
De Brabandere, A., Op De Beéck, T., Hendrickx, K., Meert, W., & Davis, J. TSFuse: automated feature construction for multiple time series data. Machine Learning (2022)
@article{tsfuse,
    author = {De Brabandere, Arne
        and Op De Be{\'e}ck, Tim
        and Hendrickx, Kilian
        and Meert, Wannes
        and Davis, Jesse},
    title = {TSFuse: automated feature construction for multiple time series data},
    journal = {Machine Learning},
    year = {2022},
    doi = {10.1007/s10994-021-06096-2},
    url = {https://doi.org/10.1007/s10994-021-06096-2}
}