Releases: predict-idlab/tsflex

v0.4.1

06 Sep 09:28

What's Changed

  • ♻️ improve QOL with mypy by @jvdd in #120
  • ⬆️ update pycatch22 by @jvdd in #122
  • Update dependencies by @jvdd in #123
  • Update deps ctd. by @jvdd in #124
  • 🐛 check for None before setting n_jobs to nb logical CPUs by @jvdd in #126
  • 🙏 not writeable strided for vectorized by @jvdd in #128
  • ⬆️ update dependencies by @jvdd in #130

Full Changelog: v0.4.0...v0.4.1

v0.4.0

04 Apr 10:31

New features

You can now use the FeatureCollection.calculate method to compute features per group, based on group ids.

Specifically, two arguments were added to the FeatureCollection.calculate method:

  • group_by_all: creates groups that contain all rows corresponding to the group value
    • Note that this is roughly equivalent to passing df.groupby(group_by_all) as data to the .calculate method (which is now also a valid input for the data argument)
  • group_by_consecutive: creates groups that contain consecutive rows with the same group value

Note: Both grouped feature extraction approaches ignore NaNs in the group_by column.

Curious? Have a look at our verbose example notebook: grouping feature extraction
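To make the difference between the two grouping modes concrete, here is a minimal plain-Python sketch. It is a conceptual illustration only and does not use or mirror the tsflex implementation: the toy rows and the helper names group_all and group_consecutive are hypothetical, and dropping rows without a group id before forming runs is an assumption of this sketch.

```python
from itertools import groupby
from statistics import mean

# Hypothetical toy data: (group id, value) rows in index order.
# None stands in for a missing (NaN) group id.
rows = [("a", 1.0), ("a", 3.0), ("b", 10.0), (None, 99.0), ("a", 5.0), ("b", 20.0)]


def group_all(rows):
    """Collect every row that shares a group value, wherever it occurs."""
    groups = {}
    for gid, value in rows:
        if gid is None:  # rows without a group id are ignored
            continue
        groups.setdefault(gid, []).append(value)
    return groups


def group_consecutive(rows):
    """Start a new group each time the group value changes between adjacent rows."""
    # Assumption of this sketch: rows without a group id are dropped first.
    valid = [(gid, value) for gid, value in rows if gid is not None]
    return [
        (gid, [value for _, value in run])
        for gid, run in groupby(valid, key=lambda row: row[0])
    ]


# One feature value (here: the mean) per group.
mean_per_group = {gid: mean(vals) for gid, vals in group_all(rows).items()}
mean_per_run = [(gid, mean(vals)) for gid, vals in group_consecutive(rows)]
print(mean_per_group)  # {'a': 3.0, 'b': 15.0}
print(mean_per_run)    # [('a', 2.0), ('b', 10.0), ('a', 5.0), ('b', 20.0)]
```

Grouping on all rows yields one result per distinct group value, whereas consecutive grouping yields one result per uninterrupted run, so the same group id can appear several times in the output.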

What's Changed

  • 🎍 improving loggers as described in #66 by @jonasvdd in #73
  • 🧹 some necessary maintenance by @jvdd in #80
  • 🪵 log % duration + output_names for FeatureCollection by @jvdd in #83
  • ✨ validate integration with antropy by @jvdd in #88
  • 🎉 validate nolds integration by @jvdd in #94
  • 〰️ remove isort and use ruff instead by @jvdd in #99
  • ⬆️ support Python 3.11 by @jvdd in #87
  • 🎉 validate pyentrp integration by @jvdd in #95
  • 🐛 support functools.partial by @jvdd in #104
  • 👷 build: create codeql.yml by @NielsPraet in #106
  • Build/codspeed setup by @NielsPraet in #107
  • ⬆️ update antropy dependency + disable py 3.7 tests by @jvdd in #108
  • ⬆️ update dependencies by @jvdd in #111
  • ✨ feat: Feature extraction with an identifier by @NielsPraet in #109
  • ⬆️ soften pandas lock by @jvdd in #115
  • 🚀 Python 3.12 support by @jvdd in #116

Full Changelog: v0.3.0...v0.4.0

tsflex v0.3.0

23 Feb 13:56
9787fc0

What's Changed

  • 🥅 update make_robust by @jvdd in #45
  • 🎨 Update examples by @jvdd in #46
  • 💨 series_pipeline insert & append is more compose-like by @jonasvdd in #47
  • 📌 Update dependencies by @jvdd in #48
  • 🙈 minor bug fix in make_robust by @jvdd in #52
  • 🚑 Fix windows bug by @jvdd in #53
  • 🚑 update critical depencies by @jvdd in #55
  • ✨ vectorized feature function support by @jvdd in #58
  • [MRG] Remove deprecated closed argument in pd.daterange by @jeroenboeye in #64
  • 🐛 fix bug with bound_method + ✨ new integrations by @jvdd in #62
  • ♻️ improve output indexing by @jvdd in #68
  • 🖍️ improve the make_robust docs by @jonasvdd in #72
  • ✨ decouple stride + support setpoints by @jvdd in #74
  • ♻️ refactor indexing + ✂️ decouple stride & window + ✨ support segment idxs by @jvdd in #71

Major changes

Updates on the output indexing

✂️ Decoupling of stride & window from FeatureDescriptors

  • Both arguments are now optional.
  • The stride can now also be a list of stride values

✨ Support segment indexes

  • Users can now pass their own start & end segment indexes to the FeatureCollection.calculate method, allowing even more flexible feature extraction 😉
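As a rough illustration of what these two changes enable, the plain-Python sketch below computes a feature over user-supplied segments and builds segment start positions from a list of strides. It is not tsflex code: the helper names are hypothetical, the half-open [start, end) segment convention is an assumption, and reading "a list of stride values" as the union of each stride's grid is one plausible interpretation that these release notes do not confirm.

```python
from statistics import mean


def union_of_strides(start, end, strides):
    """Segment start positions for several strides (assumed: sorted union of each grid)."""
    starts = set()
    for stride in strides:
        starts.update(range(start, end, stride))
    return sorted(starts)


def features_on_segments(index, values, seg_starts, seg_ends, func):
    """Apply func to the values whose index lies in each half-open [start, end) segment."""
    out = []
    for seg_start, seg_end in zip(seg_starts, seg_ends):
        segment = [v for i, v in zip(index, values) if seg_start <= i < seg_end]
        # Empty segments yield None instead of calling func on no data.
        out.append(func(segment) if segment else None)
    return out


index = list(range(10))
values = [float(i) for i in index]

print(union_of_strides(0, 10, [4, 6]))  # [0, 4, 6, 8]
print(features_on_segments(index, values, [0, 3, 7], [3, 7, 10], mean))  # [1.0, 4.5, 8.0]
```

Because the segments are given explicitly, they can have different lengths and need not follow any fixed window or stride.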

Full Changelog: v0.2.3...v0.3


DISCLAIMER: this release was already published on PyPI some months ago (11 Oct 2022).
Our apologies for the late tag + release on GitHub.

tsflex v0.2.3

16 Nov 21:32

❗ See also: tsflex v0.2.2 which is even more 🔥 than this one

New features

💚 Next to the tsfresh integrations, tsflex's feature extraction now fully integrates with seglearn and tsfel ⬇️

from seglearn.feature_functions import base_features
from tsfel.feature_extraction import get_features_by_domain

from tsflex.features import FeatureCollection, MultipleFeatureDescriptors
from tsflex.features.integrations import seglearn_feature_dict_wrapper, tsfel_feature_dict_wrapper
from tsflex.utils.data import load_empatica_data

# Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc = load_empatica_data(['tmp', 'acc'])

# Construct your feature extraction configuration & extract features
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[
            *seglearn_feature_dict_wrapper(base_features()),
            *tsfel_feature_dict_wrapper(get_features_by_domain('statistical')),
        ],
        series_names=["TMP", "ACC_x", "ACC_y"],
        windows=["5min", "15min"],
        strides="5min"
    )
)

fc.calculate(data=[df_tmp, df_acc], return_df=True)

Changes

🎉 The feature-DataFrame output of FeatureCollection.calculate now has a deterministic column order, see #40

tsflex v0.2.2

12 Nov 23:28

New features

  • 🔥 Now also supports feature-extraction on numeric-index data (and thus not only time-based data)
  • 💚 Seamless integration with tsfresh, check out the example below:
from tsfresh.feature_extraction import MinimalFCParameters
import scipy.stats as ss

from tsflex.features import FeatureCollection, MultipleFeatureDescriptors
from tsflex.features.integrations import tsfresh_settings_wrapper
from tsflex.utils.data import load_empatica_data

# Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc = load_empatica_data(['tmp', 'acc'])

# Construct your feature extraction configuration & extract features
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=tsfresh_settings_wrapper(MinimalFCParameters()) + [ss.skew],
        series_names=["TMP", "ACC_x", "ACC_y"],
        windows=["5min", "15min"],
        strides="5min"
    )
)

fc.calculate(data=[df_tmp, df_acc], return_df=True)
  • ⚡ Optimized strided-rolling feature-extraction, see the newly generated benchmark ⬇️

[image: benchmark of the optimized strided-rolling feature extraction]

  • Added FeatureCollection.reduce() which comes in really handy when feature selection is performed in your machine-learning pipeline
  • 🐻 chunk_data() now also supports DataFrame-dicts as input, which is more convenient when your DataFrames have many columns for which you want to specify the sample frequencies.
  • 🌻 SeriesPipeline is now more compose-like as it now accepts SeriesPipeline instances
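To illustrate what strided-rolling feature extraction on numeric-index data means, here is a minimal plain-Python sketch. It is a conceptual illustration, not the optimized tsflex back-end: the function name strided_rolling, the half-open [t, t + window) convention, and the rule of only emitting windows that fit inside the index range are all assumptions of this sketch.

```python
from bisect import bisect_left
from statistics import mean


def strided_rolling(index, values, window, stride, func):
    """Apply func to each half-open window [t, t + window), with t advancing by stride.

    index must be sorted and may hold any numeric positions (not only timestamps).
    Only windows that end at or before the last index value are emitted.
    """
    out = []
    t = index[0]
    while t + window <= index[-1]:
        lo = bisect_left(index, t)           # first position with index >= t
        hi = bisect_left(index, t + window)  # first position with index >= t + window
        if hi > lo:                          # skip windows that contain no samples
            out.append((t, func(values[lo:hi])))
        t += stride
    return out


# Irregularly sampled data on a plain integer index.
index = [0, 1, 2, 4, 5, 7, 8, 9]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

print(strided_rolling(index, values, window=4, stride=2, func=mean))
# [(0, 2.0), (2, 4.0), (4, 5.0)]
```

The windows are defined on the index values rather than on row positions, so irregular sampling simply results in windows with differing numbers of samples.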

Changes

  • 🧵 Changed pathos ➡️ multiprocess as multiprocessing back-end
  • 🔧 Moved the bound_method argument to FeatureCollection.calculate()
  • 📝 Rewrote strided-rolling back-end in a more OO manner (introduced the segmenter module), which complies with our roadmap of providing more segmenting functionality