Python code for the model and experiments of the paper:
"Embedding the Learning of Multivariate Shapelets in a Multi-Layer Neural Network" by Roberto Medico, Joeri Ruyssinck, Dirk Deschrijver and Tom Dhaene, Ghent University
in submission for KDD'18.
A Neural Network architecture to learn meaningful multivariate shapelets for time-series classification tasks.
Shapelets are discriminative subsequences extracted from time-series data. Classifiers using shapelets have been shown to achieve performance competitive with state-of-the-art methods, while enhancing the model's interpretability. While a lot of research has been done on univariate time-series shapelets, extensions to the multivariate setting have not yet received much attention. To extend shapelet-based classification to a multidimensional setting, we developed a novel architecture for shapelet learning, which embeds the shapelets as trainable weights in a multi-layer Neural Network. We also investigated a novel initialization strategy for the shapelets, based on meaningful multidimensional motif discovery using the Matrix Profile, a recently proposed time-series analysis tool. This paper describes the proposed architecture and presents results on seven publicly available benchmark datasets. Our results show that the proposed approach achieves competitive performance across the datasets and, in contrast with existing discovery-based methods, is applicable to larger-scale datasets. Moreover, the proposed motif-based initialization strategy improves the model's convergence and performance, as well as the interpretability of the learnt shapelets. Finally, the shapelets learnt during training can be extracted from the model and provide meaningful insights into the classifier's decisions and the interactions between different dimensions.
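To make the notion of a multivariate shapelet concrete, here is a minimal NumPy sketch of the usual shapelet-to-series distance: slide the shapelet over the series and keep the smallest distance. It is a generic illustration, not the architecture from the paper. The function name `min_shapelet_distance`, the mean squared error as distance, and the assumption that a shapelet spans all dimensions of the series are choices made for this example only.

```python
import numpy as np


def min_shapelet_distance(series, shapelet):
    """Smallest mean squared distance between `shapelet` and any window of `series`.

    series:   array-like of shape (T, D), T time steps and D dimensions
    shapelet: array-like of shape (L, D), with L <= T
    """
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    if series.ndim != 2 or shapelet.ndim != 2:
        raise ValueError("series and shapelet must both be 2-D (time, dimension)")
    if series.shape[1] != shapelet.shape[1]:
        raise ValueError("series and shapelet must have the same number of dimensions")
    n_steps, length = series.shape[0], shapelet.shape[0]
    if length > n_steps:
        raise ValueError("shapelet is longer than the series")

    best = np.inf
    # Compare the shapelet with every window of the same length.
    for start in range(n_steps - length + 1):
        window = series[start:start + length]
        distance = np.mean((window - shapelet) ** 2)
        best = min(best, distance)
    return float(best)


if __name__ == "__main__":
    toy_series = [[0, 0], [1, 2], [2, 4], [0, 0]]
    toy_shapelet = [[1, 2], [2, 4]]
    # The shapelet occurs exactly in the series, so the distance is zero.
    print(min_shapelet_distance(toy_series, toy_shapelet))
```

A small distance means the shapelet pattern occurs somewhere in the series. A classifier can use one such distance per shapelet as an input feature.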
- Python 2.7
- Keras with TensorFlow backend
- mSTAMP (multidimensional Matrix Profile)
- Scientific computing libraries: NumPy, pandas, Matplotlib, scikit-learn
All the datasets used for evaluation were collected and made available to us in ARFF format by the authors of "A Shapelet Transform for Multivariate Time Series Classification".
Where a dataset came split into dataset_TRAIN.arff and dataset_TEST.arff, we merged the two files. We then converted the full datasets to .csv.
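The merge-and-convert step could look roughly like the sketch below. It is not the script used for the paper. It is written for Python 3 using only the standard library, and the helper names `read_flat_arff` and `merge_arff_to_csv` are hypothetical. It assumes dense, flat ARFF files: no relational attributes, no sparse rows, and no quoted values or attribute names containing commas or spaces. Files that break those assumptions need a full ARFF parser.

```python
import csv


def read_flat_arff(path):
    """Return (attribute_names, rows) from a dense, flat ARFF file."""
    names, rows, in_data = [], [], False
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("%"):
                continue  # skip blank lines and ARFF comments
            if in_data:
                rows.append([value.strip() for value in line.split(",")])
            elif line.lower().startswith("@attribute"):
                names.append(line.split()[1])  # second token is the name
            elif line.lower().startswith("@data"):
                in_data = True
    return names, rows


def merge_arff_to_csv(train_path, test_path, csv_path):
    """Concatenate the train and test rows and write them to one CSV file.

    Returns the total number of data rows written.
    """
    train_names, train_rows = read_flat_arff(train_path)
    test_names, test_rows = read_flat_arff(test_path)
    if train_names != test_names:
        raise ValueError("train and test files declare different attributes")
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(train_names)  # header row
        writer.writerows(train_rows + test_rows)
    return len(train_rows) + len(test_rows)
```

For example, `merge_arff_to_csv("dataset_TRAIN.arff", "dataset_TEST.arff", "dataset.csv")` would write a single CSV file with a header row followed by all train rows, then all test rows.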