[ML] Change point detection and prediction #92

Merged 41 commits on May 11, 2018

Commits
b8fa21c
Better naming of decomposition methods and related variables
tveasey Mar 9, 2018
91c0057
More consistent naming in CTimeSeriesModel and avoid lots of overload…
tveasey Mar 9, 2018
eac04e9
[ML] First pass implementation of support functionality for change de…
tveasey Mar 12, 2018
4052390
Factor out some convenience functionality. Switch to using std versio…
tveasey Mar 12, 2018
10d4912
[ML] Wire in change detection/modelling to our univariate time series…
tveasey Mar 15, 2018
7202a17
Less confusing naming of the anomaly score calculation
tveasey Mar 15, 2018
6b420dc
Fix name change fallout
tveasey Mar 20, 2018
69a96ef
Switch to (more standard) logistic function
tveasey Mar 20, 2018
4bc1e53
Tidy up expectation w.r.t. marginal likelihood
tveasey Mar 21, 2018
1bc86c6
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey Mar 23, 2018
893f0b2
Bad merge
tveasey Mar 23, 2018
447f31a
[ML] Implements an absolute goodness-of-fit test to accept a change …
tveasey Mar 26, 2018
e763bc3
[ML] Linear scaling change detection (#25)
tveasey Apr 4, 2018
96bcb62
Fix windows build issue
tveasey Apr 4, 2018
3ec6deb
Reference to weight styles outlives the object
tveasey Apr 4, 2018
3872822
Merge commit 'd4e4cca70edae4500cc1535a2da582935074a25b' into feature/…
tveasey Apr 13, 2018
57b51c2
Reformat
tveasey Apr 13, 2018
a0d0836
Merge commit '2580b4fc3068699fc8f2c61fbb7b85125f756e0b' into feature/…
tveasey Apr 13, 2018
6debc50
Merge commit '1347a65c807f8e9e7d80c5af8e6d2fd6bb176e1f' into feature/…
tveasey Apr 13, 2018
d616779
Reformat
tveasey Apr 13, 2018
b26e68d
Merge commit '3dadf1c2fe15507fde675ac64e7ff668ef9a014c' into feature/…
tveasey Apr 13, 2018
f09fcb9
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey Apr 13, 2018
77c6fd8
Fixing format merge
tveasey Apr 13, 2018
8665f50
More fixing of the format merge
tveasey Apr 13, 2018
9a6e471
Switch to std shared pointers
tveasey Apr 13, 2018
3fb0911
Fix fallout from merge
tveasey Apr 16, 2018
5a250d1
Merge master
tveasey Apr 23, 2018
8c64117
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey Apr 23, 2018
705a6a4
C++11 style changes for clustering code
tveasey Apr 24, 2018
ff82513
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey Apr 26, 2018
2aa457b
Fix unit test
tveasey Apr 26, 2018
e478892
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey May 3, 2018
bb83c5e
Improve anomaly sign calculation
tveasey May 4, 2018
d0ae51c
Improve anomaly sign calculation
tveasey May 4, 2018
5fb47b3
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey May 9, 2018
c08506a
Merge branch 'feature/forecast-enhancements-part-2' of github.com:ela…
tveasey May 9, 2018
0597495
Merge branch 'master' into feature/forecast-enhancements-part-2
tveasey May 9, 2018
0229047
Standardise on CTools pow2
tveasey May 9, 2018
f3e0372
Fix tests and formatting
tveasey May 9, 2018
b54c408
Add change log entry
tveasey May 11, 2018
1ae79ab
Formatting fix
tveasey May 11, 2018
1 change: 1 addition & 0 deletions docs/CHANGELOG.asciidoc
@@ -29,6 +29,7 @@

 Improve and use periodic boundary condition for seasonal component modeling ({pull}84[#84])
 Improve robustness w.r.t. outliers of detection and initialisation of seasonal components ({pull}90[#90])
+Explicit change point detection and modelling ({pull}92[#92])

 === Bug Fixes

2 changes: 1 addition & 1 deletion include/core/CContainerPrinter.h
@@ -274,7 +274,7 @@ class CORE_EXPORT CContainerPrinter : private CNonInstantiatable {
         return *value;
     }

-    //! Print a boost::shared_pointer.
+    //! Print a std::shared_pointer.
     template<typename T>
     static std::string printElement(const std::shared_ptr<T>& value) {
         if (value == std::shared_ptr<T>()) {
1 change: 0 additions & 1 deletion include/core/CRapidJsonWriterBase.h
@@ -74,7 +74,6 @@ class CRapidJsonWriterBase
     using TValue = rapidjson::Value;
     using TDocumentWeakPtr = std::weak_ptr<TDocument>;
     using TValuePtr = std::shared_ptr<TValue>;
-
     using TPoolAllocatorPtr = std::shared_ptr<CRapidJsonPoolAllocator>;
     using TPoolAllocatorPtrStack = std::stack<TPoolAllocatorPtr>;
     using TStrPoolAllocatorPtrMap = boost::unordered_map<std::string, TPoolAllocatorPtr>;
25 changes: 14 additions & 11 deletions include/core/Constants.h
@@ -16,38 +16,41 @@ namespace ml {
 namespace core {
 namespace constants {

+//! A minute in seconds.
+const core_t::TTime MINUTE{60};
+
 //! An hour in seconds.
-const core_t::TTime HOUR = 3600;
+const core_t::TTime HOUR{3600};

 //! A day in seconds.
-const core_t::TTime DAY = 86400;
+const core_t::TTime DAY{86400};

 //! A (two day) weekend in seconds.
-const core_t::TTime WEEKEND = 172800;
+const core_t::TTime WEEKEND{172800};

 //! Five weekdays in seconds.
-const core_t::TTime WEEKDAYS = 432000;
+const core_t::TTime WEEKDAYS{432000};

 //! A week in seconds.
-const core_t::TTime WEEK = 604800;
+const core_t::TTime WEEK{604800};

 //! A (364 day) year in seconds.
-const core_t::TTime YEAR = 31449600;
+const core_t::TTime YEAR{31449600};

 //! Log of min double.
-const double LOG_MIN_DOUBLE = std::log(std::numeric_limits<double>::min());
+const double LOG_MIN_DOUBLE{std::log(std::numeric_limits<double>::min())};

 //! Log of max double.
-const double LOG_MAX_DOUBLE = std::log(std::numeric_limits<double>::max());
+const double LOG_MAX_DOUBLE{std::log(std::numeric_limits<double>::max())};

 //! Log of double epsilon.
-const double LOG_DOUBLE_EPSILON = std::log(std::numeric_limits<double>::epsilon());
+const double LOG_DOUBLE_EPSILON{std::log(std::numeric_limits<double>::epsilon())};

 //! Log of two.
-const double LOG_TWO = 0.693147180559945;
+const double LOG_TWO{0.693147180559945};

 //! Log of two pi.
-const double LOG_TWO_PI = 1.83787706640935;
+const double LOG_TWO_PI{1.83787706640935};

 #ifdef Windows
 const char PATH_SEPARATOR = '\\';
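
The Constants.h hunk changes only the initialisation syntax, not the values. One practical difference is that brace (list) initialisation rejects narrowing conversions at compile time, whereas `=` with a floating point literal would silently truncate. A minimal sketch of that behaviour follows; `TTime` is a hypothetical stand-in for `core_t::TTime`, assumed here to be a 64-bit signed integer, which this diff does not confirm.

```cpp
#include <cstdint>

// Hypothetical stand-in for core_t::TTime; the real alias is not shown in this diff.
using TTime = std::int64_t;

// Brace-initialised constants in the style adopted by the hunk above.
const TTime MINUTE{60};
const TTime HOUR{60 * MINUTE};
const TTime DAY{24 * HOUR};
const TTime WEEK{7 * DAY};
const TTime YEAR{52 * WEEK}; // A (364 day) year, matching the source comment.

// List-initialisation rejects narrowing conversions at compile time:
//   const TTime BAD{3600.5};     // error: narrowing conversion from double
//   const TTime LOSSY = 3600.5;  // compiles, silently truncates to 3600

static_assert(sizeof(TTime) == 8, "sketch assumes a 64-bit time type");
```

The derived values agree with the literals in the diff, for example `YEAR` evaluates to 31449600.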
8 changes: 8 additions & 0 deletions include/maths/CBasicStatistics.h
@@ -201,6 +201,14 @@ class MATHS_EXPORT CBasicStatistics {
         }
     }

+    //! Update the moments with the collection \p x.
+    template<typename U, std::size_t N>
+    void add(const core::CSmallVector<U, N>& x) {
+        for (const auto& xi : x) {
+            this->add(xi);
+        }
+    }
+
     //! Update the moments with the collection \p x.
     template<typename U>
     void add(const std::vector<SSampleCentralMoments<U, ORDER>>& x) {
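
The new overload forwards each element of a small vector to the existing single-value `add`. The sketch below shows the same pattern on a deliberately simplified accumulator. `CMeanAccumulator` is a hypothetical class that tracks only a count and a mean, and `std::array` stands in for `core::CSmallVector`; neither is the real `SSampleCentralMoments` implementation.

```cpp
#include <array>
#include <cstddef>

// Hypothetical, much-simplified accumulator; not the real SSampleCentralMoments.
class CMeanAccumulator {
public:
    // Update the running mean with a single value (incremental mean update).
    void add(double x) {
        m_Count += 1.0;
        m_Mean += (x - m_Mean) / m_Count;
    }

    // Update with a fixed-size collection by forwarding each element to the
    // scalar overload, mirroring the shape of the overload added in the hunk.
    template<typename U, std::size_t N>
    void add(const std::array<U, N>& x) {
        for (const auto& xi : x) {
            this->add(xi);
        }
    }

    double count() const { return m_Count; }
    double mean() const { return m_Mean; }

private:
    double m_Count = 0.0;
    double m_Mean = 0.0;
};
```

Adding `std::array<double, 3>{{1.0, 2.0, 3.0}}` to a fresh accumulator leaves it with a count of 3 and a mean of 2.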
3 changes: 3 additions & 0 deletions include/maths/CCalendarComponent.h
@@ -82,6 +82,9 @@ class MATHS_EXPORT CCalendarComponent : private CDecompositionComponent {
     //! Clear all data.
     void clear();

+    //! Linearly scale the component's by \p scale.
+    void linearScale(core_t::TTime time, double scale);
+
     //! Adds a value \f$(t, f(t))\f$ to this component.
     //!
     //! \param[in] time The time of the point.
3 changes: 3 additions & 0 deletions include/maths/CCalendarComponentAdaptiveBucketing.h
@@ -67,6 +67,9 @@ class MATHS_EXPORT CCalendarComponentAdaptiveBucketing : private CAdaptiveBucket
     //! allocated memory.
     void clear();

+    //! Linearly scale the bucket values by \p scale.
+    void linearScale(double scale);
+
     //! Add the function value at \p time.
     //!
     //! \param[in] time The time of \p value.
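
The body of `linearScale` is not part of this hunk. As a rough illustration of what "linearly scale the bucket values" can mean, the sketch below multiplies every stored bucket value by a constant factor. `CBucketValues` is a hypothetical class holding one plain `double` per bucket; the real adaptive bucketing keeps more state than this.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an adaptive bucketing: one value per bucket.
class CBucketValues {
public:
    explicit CBucketValues(std::vector<double> values)
        : m_Values(std::move(values)) {}

    // Linearly scale every bucket value by scale.
    void linearScale(double scale) {
        for (auto& value : m_Values) {
            value *= scale;
        }
    }

    double value(std::size_t bucket) const { return m_Values[bucket]; }

private:
    std::vector<double> m_Values;
};
```

Scaling buckets holding 1, 2 and 4 by 0.5 leaves them holding 0.5, 1 and 2.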
28 changes: 21 additions & 7 deletions include/maths/CModel.h
@@ -44,9 +44,11 @@ using TForecastPushDatapointFunc = std::function<void(SErrorBar)>;
 class MATHS_EXPORT CModelParams {
 public:
     CModelParams(core_t::TTime bucketLength,
-                 const double& learnRate,
-                 const double& decayRate,
-                 double minimumSeasonalVarianceScale);
+                 double learnRate,
+                 double decayRate,
+                 double minimumSeasonalVarianceScale,
+                 core_t::TTime minimumTimeToDetectChange,
+                 core_t::TTime maximumTimeToTestForChange);

     //! Get the bucket length.
     core_t::TTime bucketLength() const;
@@ -63,6 +65,15 @@ class MATHS_EXPORT CModelParams {
     //! Get the minimum seasonal variance scale.
     double minimumSeasonalVarianceScale() const;

+    //! Check if we should start testing for a change point in the model.
+    bool testForChange(core_t::TTime changeInterval) const;
+
+    //! Get the minimum time to detect a change point in the model.
+    core_t::TTime minimumTimeToDetectChange(void) const;
+
+    //! Get the maximum time to test for a change point in the model.
+    core_t::TTime maximumTimeToTestForChange(void) const;
+
     //! Set the probability that the bucket will be empty for the model.
     void probabilityBucketEmpty(double probability);

@@ -78,6 +89,10 @@ class MATHS_EXPORT CModelParams {
     double m_DecayRate;
     //! The minimum seasonal variance scale.
     double m_MinimumSeasonalVarianceScale;
+    //! The minimum time permitted to detect a change in the model.
+    core_t::TTime m_MinimumTimeToDetectChange;
+    //! The maximum time permitted to test for a change in the model.
+    core_t::TTime m_MaximumTimeToTestForChange;
     //! The probability that a bucket will be empty for the model.
     double m_ProbabilityBucketEmpty;
 };
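
The CModelParams hunks add two time limits and a `testForChange` predicate, but none of the method bodies appear in this diff. The sketch below shows one way such parameters could gate a change test. Everything in it is an assumption made for illustration: the class `CChangeParams`, the threshold used inside `testForChange`, the enum `EChangeTest`, the helper `changeTestState` and its `timeTesting` argument are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for core_t::TTime (seconds).
using TTime = std::int64_t;

// Hypothetical, reduced parameters object: change detection fields only.
class CChangeParams {
public:
    CChangeParams(TTime bucketLength,
                  TTime minimumTimeToDetectChange,
                  TTime maximumTimeToTestForChange)
        : m_BucketLength{bucketLength},
          m_MinimumTimeToDetectChange{minimumTimeToDetectChange},
          m_MaximumTimeToTestForChange{maximumTimeToTestForChange} {}

    // ASSUMPTION: start testing once the candidate change has lasted at least
    // three buckets or ten minutes, whichever is longer. The real rule is not
    // visible in this diff.
    bool testForChange(TTime changeInterval) const {
        return changeInterval >= std::max(3 * m_BucketLength, TTime{600});
    }

    TTime minimumTimeToDetectChange() const { return m_MinimumTimeToDetectChange; }
    TTime maximumTimeToTestForChange() const { return m_MaximumTimeToTestForChange; }

private:
    TTime m_BucketLength;
    TTime m_MinimumTimeToDetectChange;
    TTime m_MaximumTimeToTestForChange;
};

// Hypothetical states of a change test.
enum class EChangeTest { E_NotTesting, E_Collecting, E_CanDecide, E_Abandon };

// ASSUMPTION: a decision is only allowed after the minimum time has elapsed,
// and the test is abandoned once the maximum time has been exceeded.
EChangeTest changeTestState(const CChangeParams& params,
                            TTime changeInterval,
                            TTime timeTesting) {
    if (!params.testForChange(changeInterval)) {
        return EChangeTest::E_NotTesting;
    }
    if (timeTesting > params.maximumTimeToTestForChange()) {
        return EChangeTest::E_Abandon;
    }
    if (timeTesting >= params.minimumTimeToDetectChange()) {
        return EChangeTest::E_CanDecide;
    }
    return EChangeTest::E_Collecting;
}
```

With a 300 second bucket, a six hour minimum and a one day maximum, this sketch starts testing after 900 seconds, allows a decision from six hours onwards and abandons the test after one day.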
@@ -90,8 +105,6 @@ class MATHS_EXPORT CModelAddSamplesParams {

 public:
     CModelAddSamplesParams();
-    CModelAddSamplesParams(const CModelAddSamplesParams&) = delete;
-    const CModelAddSamplesParams& operator=(const CModelAddSamplesParams&) = delete;

     //! Set whether or not the data are integer valued.
     CModelAddSamplesParams& integer(bool integer);
@@ -145,8 +158,6 @@ class MATHS_EXPORT CModelProbabilityParams {

 public:
     CModelProbabilityParams();
-    CModelProbabilityParams(const CModelAddSamplesParams&) = delete;
-    const CModelProbabilityParams& operator=(const CModelAddSamplesParams&) = delete;

     //! Set the tag for the entity for which to compute the probability.
     CModelProbabilityParams& tag(std::size_t tag);
@@ -254,6 +265,9 @@ class MATHS_EXPORT CModel {
         E_Reset //!< Model reset.
     };

+    //! Combine the results \p lhs and \p rhs.
+    static EUpdateResult combine(EUpdateResult lhs, EUpdateResult rhs);
+
 public:
     CModel(const CModelParams& params);
     virtual ~CModel() = default;
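
Only the declaration of `combine` and the `E_Reset` enumerator are visible in this hunk. The sketch below illustrates one plausible combination rule, in which a failure dominates, then a reset, then success. The enumerators `E_Failure` and `E_Success` and the precedence order are assumptions made here, not something the diff confirms.

```cpp
// Hypothetical mirror of CModel::EUpdateResult; only E_Reset appears in the diff.
enum EUpdateResult {
    E_Failure, //!< Update failed (assumed enumerator).
    E_Success, //!< Update succeeded (assumed enumerator).
    E_Reset    //!< Model reset.
};

// ASSUMPTION: failure takes precedence over reset, which takes precedence
// over success.
EUpdateResult combine(EUpdateResult lhs, EUpdateResult rhs) {
    if (lhs == E_Failure || rhs == E_Failure) {
        return E_Failure;
    }
    if (lhs == E_Reset || rhs == E_Reset) {
        return E_Reset;
    }
    return E_Success;
}
```

A rule of this shape is symmetric in its arguments, so results from several sub-models can be folded together in any order.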