usually, for GTS and HTS (grouped and hierarchical time series), the time series for all groups live in the same dataframe, so the shift operations have to be performed groupwise.
there might be missing rows for specific days, which makes the naive approach to lagged features (using pd.shift) inconsistent, because a row-based shift then no longer corresponds to a fixed time offset.
you may want to create resampled (downsampled) lagged features while preserving the original data frequency (for instance, a daily prediction that uses the previous week's mean as a feature).
I already have this implemented as a function.
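That function is not included in this thread. As a rough illustration of the first two points only, here is a minimal sketch that lags by calendar offset within each group instead of by row position. The helper name `add_time_lags` and its parameters are hypothetical (they are not part of sklego), and the sketch assumes at most one row per group and date.

```python
import pandas as pd


def add_time_lags(df, group_col, date_col, value_col, lags):
    """Hypothetical helper: lag `value_col` by whole days within each group.

    Assumes each (group, date) pair appears at most once.
    """
    out = df.copy()
    for lag in lags:
        shifted = df[[group_col, date_col, value_col]].copy()
        # Move every observation forward by `lag` days so it lines up
        # with the row that should receive it as a lagged feature.
        shifted[date_col] = shifted[date_col] + pd.Timedelta(days=lag)
        shifted = shifted.rename(columns={value_col: f"{value_col}_lag{lag}"})
        # A left merge keeps the original rows; dates with no observation
        # exactly `lag` days earlier get NaN instead of a stale value.
        out = out.merge(shifted, on=[group_col, date_col], how="left")
    return out
```

With a group observed on Jan 1, 2 and 4, the one-day lag for Jan 4 comes out as NaN, whereas a plain row-based shift would silently reuse the Jan 2 value.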
convince us of the use-case, we're open to many suggestions but we prefer to solve problems with pipelines that are at least somewhat general
add a screenshot if applicable (ML stuff is hard to explain with words, pictures say 1000 words)
make sure that the feature you want is not already supported by sklearn
AlanGanem changed the title from "Datetime Index features for sklego.pandas_utils.add_lags [FEATURE]" to "Datetime Index support for sklego.pandas_utils.add_lags [FEATURE]" on Dec 22, 2020
This sounds like an imputation combined with our GroupedTransformer. I'm not 100% sure if the transformer has any notion of hierarchy. I do know that our grouped predictor does have this feature, see shrinkage param in API.
It does resemble part of the GroupedTransformer functionality, although I'm not sure how that would work with "callable transformers" instead of objects containing fit and transform methods (maybe a simple wrapper would suffice in this case).
Still, the remaining features that I believe are not available in the current implementation of sklego.pandas_utils.add_lags are:
The imputation part is a bit different, since it imputes rows (dates that have no row at all) rather than values (in the sense of filling in NaNs). Resampling to the desired frequency and filling the gaps does the job.
There are also the features at a different (downsampled) time frequency. I'm not sure whether that would be generally useful, but it helped me a lot in time series feature engineering, alongside rolling operations.
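To make these two points concrete, here is a minimal pandas sketch. Both helper names (`fill_missing_dates`, `add_prev_week_mean`) and their parameters are hypothetical and not part of sklego; the first assumes unique dates within each group, and the second assumes calendar weeks starting on Monday.

```python
import pandas as pd


def fill_missing_dates(df, group_col, date_col, freq="D"):
    """Hypothetical helper: add a NaN row for every date absent within a group."""
    pieces = []
    for key, grp in df.groupby(group_col):
        # asfreq reindexes onto a regular grid between the group's
        # first and last date, inserting NaN rows for the gaps.
        full = grp.set_index(date_col).sort_index().asfreq(freq).rename_axis(date_col)
        full[group_col] = key  # the inserted rows have no group label yet
        pieces.append(full)
    return pd.concat(pieces).reset_index()


def add_prev_week_mean(df, group_col, date_col, value_col):
    """Hypothetical helper: attach the previous week's group mean to each row."""
    out = df.copy()
    # Monday that starts the calendar week each row falls in.
    out["_week"] = out[date_col].dt.to_period("W").dt.start_time
    weekly = (
        out.groupby([group_col, "_week"])[value_col]
        .mean()
        .rename(f"{value_col}_prev_week_mean")
        .reset_index()
    )
    # Relabel each weekly mean with the following week, so that the merge
    # gives every row the mean of the week before its own.
    weekly["_week"] = weekly["_week"] + pd.Timedelta(weeks=1)
    out = out.merge(weekly, on=[group_col, "_week"], how="left")
    return out.drop(columns="_week")
```

Once the date rows are filled in, a groupwise `filled.groupby(group_col)[value_col].shift(n)` corresponds to a lag of exactly `n` periods, and the weekly mean is added without changing the daily frequency of the frame.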
About the shrinkage parameter, it looks very interesting! I only mentioned hierarchical time series as a use case for the feature request. The function does not explicitly handle hierarchies; only the hierarchy "leaves" are naturally taken into account by the groupwise transformation.
Please explain clearly what you'd like to see added.
When working with hierarchical and grouped time series, I've stumbled upon some common issues: