
[ML] Improve robustness w.r.t. outliers of detection and initialisation of seasonal components #90

@tveasey tveasey commented May 4, 2018

This makes two principal changes:

  1. Iteratively reweights outliers w.r.t. the seasonal component, both when testing for the component and when initialising it. Outliers are defined as the fraction of values with the highest residuals w.r.t. the component's predictions.
  2. Switches marginal test decisions for decomposition components to use logistic regression on top of the various factors, e.g. variance reduction, autocorrelation and number of periods of data observed.
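
For readers who want a concrete picture of the two changes, here is a minimal Python sketch. It is not the code in this pull request: the function names, the outlier fraction, the down-weighting factor, the iteration count and the logistic coefficients are all hypothetical, chosen only to illustrate the idea. The seasonal component is stood in for by a weighted mean per offset within the period.

```python
import math


def fit_seasonal_means(values, period, weights):
    """Weighted mean of the values at each offset within the period."""
    sums = [0.0] * period
    totals = [0.0] * period
    for i, (x, w) in enumerate(zip(values, weights)):
        sums[i % period] += w * x
        totals[i % period] += w
    return [s / t if t > 0.0 else 0.0 for s, t in zip(sums, totals)]


def reweight_outliers(values, period, outlier_fraction=0.1,
                      outlier_weight=0.1, iterations=3):
    """Iteratively down-weight the values with the largest residuals.

    All parameter defaults are illustrative assumptions.
    """
    weights = [1.0] * len(values)
    n_outliers = int(outlier_fraction * len(values))
    for _ in range(iterations):
        # Fit the component using the current weights.
        means = fit_seasonal_means(values, period, weights)
        # Residual of every value w.r.t. the component's prediction.
        residuals = [abs(x - means[i % period]) for i, x in enumerate(values)]
        ranked = sorted(range(len(values)),
                        key=lambda i: residuals[i], reverse=True)
        # Reset, then down-weight the fraction with the highest residuals.
        weights = [1.0] * len(values)
        for i in ranked[:n_outliers]:
            weights[i] = outlier_weight
    return fit_seasonal_means(values, period, weights), weights


def component_probability(variance_reduction, autocorrelation,
                          periods_observed,
                          coefficients=(-6.0, 4.0, 4.0, 0.5)):
    """Logistic combination of the test factors.

    The coefficients are made up for illustration, not fitted values.
    """
    bias, w_var, w_acf, w_periods = coefficients
    z = (bias + w_var * variance_reduction + w_acf * autocorrelation
         + w_periods * periods_observed)
    return 1.0 / (1.0 + math.exp(-z))
```

In this sketch a component would be accepted when `component_probability` exceeds some threshold such as 0.5, so that no single factor acts as a hard cutoff on its own; the threshold is likewise an assumption.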

I've tested this on a variety of synthetic and real examples where initial periodic patterns are distorted by outliers, and this approach has proved effective; see #87 for more details. Otherwise, it seems to have no detrimental effects.

This will affect results on count and metric analyses when there is seasonality and significant distortion due to outliers.

@tveasey tveasey force-pushed the enhancement/improve-periodicity-test-robustness branch from 005641a to 0792d04 Compare May 4, 2018 16:45

@edsavage edsavage left a comment


Your changes look good to me, Tom.

@hendrikmuhs

Please additionally do

git cherry-pick 18ebdd67f7ba2b7f861607e248d79e1cca8c7891

when backporting this to 6.x

tveasey added a commit that referenced this pull request May 22, 2018
…on of seasonal components (#90)

This makes two principal changes: 1) iteratively reweights outliers w.r.t. the seasonal component
under test and for initialisation. These are defined as a fraction of values with highest residual
w.r.t. the component's predictions. 2) switches marginal test decisions for decomposition
components to use logistic regression on top of the various factors, e.g. variance reduction,
autocorrelation and number of periods of data observed.
@tveasey tveasey deleted the enhancement/improve-periodicity-test-robustness branch March 22, 2019 09:55