
[ML] Improve change point detection for long bucket lengths #95


Merged

Conversation

@tveasey (Contributor) commented on May 16, 2018

This PR does some hardening to change detection based on further testing, particularly for longer bucket lengths. By way of example, the following is the current behaviour on a problematic data set:
30m bucketing: [screenshot]
1h bucketing: [screenshot]
2h bucketing: [screenshot]

After these changes we get:
30m bucketing: [screenshot]
1h bucketing: [screenshot]
2h bucketing: [screenshot]

I have committed these changes incrementally in logical groups, so it may be easier to review them a commit at a time.

The following changes are made in the last commit:

  1. Improve the methodology for deciding whether we'll eventually accept a change. Specifically, we can observe all the values of the decision function over the change detection window and use these to predict whether we'll eventually accept the change (a minimal sketch of this idea follows the list). We use this to:
    • better adjust the weights of values added to the trend during the change period,
    • better decide whether it is worth continuing to test.
  2. Improve the decision to clear the data structures we use for periodicity testing based on the historical data characteristics and the nature of the change.
  3. Require a longer period of unusual values to trigger change detection. I've traded this for a slightly higher threshold on the individual p-values. There are downsides to testing for a change: i) it temporarily increases the model memory size, and ii) we don't have concurrent change detection, so a test that is already running can interfere with detecting a real change point if its window overlaps one. On balance, it is therefore better to wait slightly longer to see whether the event is short-lived before committing to run the full test.
  4. Decrease the time we take to detect a change for long bucket lengths. We can get acceptable test accuracy observing fewer values, and with the current parameterisation we were waiting too long for the common case of a scaling daily periodicity when the job used long bucket lengths.
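
To make the first point concrete, here is a minimal, self-contained sketch of predicting eventual acceptance from the decision-function values observed so far. The helper name willEventuallyAccept, the least-squares extrapolation, and the threshold parameter are illustrative assumptions, not the actual ml-cpp implementation.

```cpp
// Minimal sketch: extrapolate a least-squares trend through the decision
// function values observed so far in the change detection window to predict
// whether the change will eventually be accepted.
#include <cstddef>
#include <vector>

bool willEventuallyAccept(const std::vector<double>& decisionValues,
                          std::size_t windowLength,
                          double acceptThreshold) {
    std::size_t n{decisionValues.size()};
    if (n < 2) {
        return false;
    }
    // Ordinary least-squares fit of decision value against observation index.
    double sx{0.0}, sy{0.0}, sxx{0.0}, sxy{0.0};
    for (std::size_t i = 0; i < n; ++i) {
        double x{static_cast<double>(i)};
        sx += x;
        sy += decisionValues[i];
        sxx += x * x;
        sxy += x * decisionValues[i];
    }
    double denom{static_cast<double>(n) * sxx - sx * sx};
    double slope{denom > 0.0 ? (static_cast<double>(n) * sxy - sx * sy) / denom : 0.0};
    double intercept{(sy - slope * sx) / static_cast<double>(n)};
    // Extrapolate to the end of the change detection window.
    double predicted{intercept + slope * static_cast<double>(windowLength - 1)};
    return predicted >= acceptThreshold;
}
```

The resulting prediction can then drive both the weight given to values added to the trend during the change period and the decision whether to keep testing.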

Making these changes also uncovered a further improvement to periodicity testing, following on from #90. Specifically, we should have been weighting the samples when computing the level for the amplitude test, to better deal with outliers. This is made in the first commit.
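
As an illustration only, the following sketch shows one way such weighting can work: samples far from the median, measured in robust standard deviations (1.4826 times the MAD), contribute less to the level. The function name and the particular weighting scheme are assumptions, not the actual amplitude test code.

```cpp
// Minimal sketch: outlier-robust, weighted level of a set of samples.
#include <algorithm>
#include <cmath>
#include <vector>

double weightedLevel(const std::vector<double>& samples) {
    if (samples.empty()) {
        return 0.0;
    }
    // Median of the samples.
    std::vector<double> sorted(samples);
    std::sort(sorted.begin(), sorted.end());
    double median{sorted[sorted.size() / 2]};
    // Median absolute deviation, scaled by 1.4826 to estimate the standard
    // deviation for normally distributed data.
    std::vector<double> deviations;
    deviations.reserve(samples.size());
    for (double s : samples) {
        deviations.push_back(std::fabs(s - median));
    }
    std::sort(deviations.begin(), deviations.end());
    double sd{1.4826 * deviations[deviations.size() / 2]};
    // Down-weight samples which are many robust standard deviations from the median.
    double numerator{0.0};
    double denominator{0.0};
    for (double s : samples) {
        double z{sd > 0.0 ? std::fabs(s - median) / sd : 0.0};
        double weight{1.0 / (1.0 + z * z)};
        numerator += weight * s;
        denominator += weight;
    }
    return numerator / denominator;
}
```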

This can affect results for data with change points, particularly when using longer bucket lengths. Marking as a non-issue since this is a refinement to an unreleased version.

double p{CTools::logisticFunction(x[0], 0.05, 1.0) *
         CTools::logisticFunction(x[1], 0.1, 1.0) *
         (x[2] < 0.0 ? 1.0 : CTools::logisticFunction(x[2], 0.2, 1.0)) *
         CTools::logisticFunction(x[3], 0.2, 0.5)};
LOG_TRACE("p(" << (*candidates[0].second)->change()->print() << ") = " << p
          << " | x = " << core::CContainerPrinter::print(x));
LOG_TRACE(<< "df(" << (*candidates[0].second)->change()->print() << ") = " << p / 0.03125
Contributor:
It would make it clearer to the reader to use MAXIMUM_DECISION_FUNCTION * p rather than p / 0.03125 even though they evaluate to the same thing.

Contributor (Author):

agreed
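
For illustration, here is a minimal runnable sketch of the suggested rewrite, with std::cout standing in for ml-cpp's LOG_TRACE macro. The constant's value is inferred from the comment above (1.0 / 0.03125 == 32.0, so MAXIMUM_DECISION_FUNCTION * p equals p / 0.03125); its name, value, and placement should be read as assumptions rather than the actual code.

```cpp
// Minimal sketch: replace the magic 0.03125 with a named constant so the
// intent of the scaling is clear at the call site.
#include <iostream>

namespace {
// Inferred so that MAXIMUM_DECISION_FUNCTION * p == p / 0.03125.
const double MAXIMUM_DECISION_FUNCTION{32.0};
}

int main() {
    double p{0.01};
    // Scaling by the named constant reads more clearly than dividing by
    // 0.03125, although the two expressions evaluate to the same value.
    std::cout << "df = " << MAXIMUM_DECISION_FUNCTION * p << std::endl;
    return 0;
}
```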

    m_Windows[test]->initialize(time);

void CTimeSeriesDecompositionDetail::CPeriodicityTest::maybeClear(core_t::TTime time,
                                                                  double shift) {
    for (auto& test : {E_Short, E_Long}) {
Contributor:
Could it just be auto for iterating simple enum values?

Contributor (Author):

This was an oversight from originally iterating over the windows rather than the enums.
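
As an aside, here is a minimal runnable sketch of the reviewer's suggestion, using a simplified stand-in enum rather than the real CPeriodicityTest types: a simple enum is as cheap to copy as to reference, so taking the enumerators by value reads more naturally.

```cpp
// Minimal sketch: range over a brace-enclosed list of enumerators by value.
#include <iostream>

enum ETest { E_Short, E_Long };

int main() {
    for (auto test : {E_Short, E_Long}) {
        std::cout << "testing window " << static_cast<int>(test) << std::endl;
    }
    return 0;
}
```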

            values.push_back(CBasicStatistics::mean(value));
        }
    }
    if (shift > 1.4826 * CBasicStatistics::mad(values)) {
Contributor:
The magic 1.4826 could be a symbolic constant or at least commented.

Contributor (Author):

This corrects for bias when estimating the standard deviation (for normal data). I've pulled out the constant as you suggest.
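
To illustrate the suggestion, here is a minimal sketch with a named constant; the constant and helper names are hypothetical, not the actual ml-cpp identifiers. The value 1.4826 is approximately 1 / Φ⁻¹(3/4), which makes the median absolute deviation a consistent estimator of the standard deviation for normally distributed data.

```cpp
// Minimal sketch: name the MAD-to-standard-deviation bias correction.
namespace {
// Converts the median absolute deviation into an estimate of the standard
// deviation for normally distributed data (1.4826 ~= 1 / Phi^-1(3/4)).
const double MAD_TO_NORMAL_STANDARD_DEVIATION{1.4826};
}

// Hypothetical helper: is the level shift large relative to the data's
// robust standard deviation?
bool significantShift(double shift, double mad) {
    return shift > MAD_TO_NORMAL_STANDARD_DEVIATION * mad;
}
```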

@droberts195 (Contributor) left a comment:
LGTM

@tveasey merged commit 943e3cf into elastic:master on May 25, 2018.
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request on May 29, 2018.
@sophiec20 added and removed the :ml label on Jun 12, 2018.
hendrikmuhs pushed a commit that referenced this pull request on Jul 19, 2018:
do not clear periodicity test after detecting a changepoint: Internal QA test revealed a problem with
deleting the periodicity test after detecting a changepoint, causing a regression in an internal dataset. 
This fix removes the clearance part of changepoint detection (only a small part of cp detection, see 
#95), effectively restoring the old behavior.
hendrikmuhs pushed a commit to hendrikmuhs/ml-cpp that referenced this pull request on Jul 20, 2018 (…stic#159), with the same "do not clear periodicity test after detecting a changepoint" message, referencing elastic#95.
hendrikmuhs pushed a commit that referenced this pull request on Jul 20, 2018 (…#160), with the same message.
@tveasey deleted the enhancement/long-bucket-change-detection branch on May 1, 2019.