Closed
Description
Testing forecasting on internal infra data sets has exposed some deficiencies in our modelling from a forecasting perspective. Part of the reason is that the requirements for forecasting are somewhat different from those for anomaly detection. This issue covers the first set of enhancements aimed at improving forecasting robustness by handling a broader range of data characteristics. However, I think some of the proposed enhancements will benefit anomaly detection as well, particularly handling time series whose values have discontinuities (see issue #6) and combining the periodicity tests. The first set of proposed enhancements, in rough order of importance, is:
- Create an ensemble of trend models for multiple timescales. We can use our regression models for this with different decay rates. The key advantage is that when it comes to forecasting we have a natural way to adjust the weights of the different models the further out we look. This means that the further ahead we forecast, the more our predictions revert to the trend at the matching timescale.
- A new trend model suitable for forecasting
- Wire in new trend model
- New style forecasting plus fix unit tests and other fallout from previous changes
- Combine our diurnal and arbitrary periodicity tests. The sketch data structure which we use for the arbitrary periodicity test can be repurposed to test for diurnal periodicity, weekend/weekday splits, and so on. Since we can fit and remove a trend from the sketched values before testing, we can decouple periodicity testing from any modelling we do of the trend. This means we can start modelling a trend component immediately. This is important for forecasting, because otherwise it is easy to produce obviously bad-looking forecasts.
- Prepare the way for unified periodicity testing
- Wire in new periodicity testing
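
As a rough illustration of the trend ensemble item above, here is a minimal Python sketch, not the actual implementation. It assumes each member is a linear regression with exponential forgetting and that forecast weights are proportional to `exp(-horizon * decay_rate)`, so fast-decaying (short timescale) models lose influence as the horizon grows. The class names, the weighting rule and the decay rates are all hypothetical, chosen only to show the idea.

```python
import math


class DecayingLinearTrend:
    """Online weighted least-squares line with exponential forgetting.

    Hypothetical stand-in for one regression model in the ensemble.
    Assumes samples are added in non-decreasing time order.
    """

    def __init__(self, decay_rate):
        self.decay_rate = decay_rate  # per unit time; timescale ~ 1 / decay_rate
        self.last_time = None
        # Weighted sums: weight, t, t^2, y and t*y.
        self.sw = self.st = self.stt = self.sy = self.sty = 0.0

    def add(self, t, y):
        if self.last_time is not None:
            # Age the existing statistics before adding the new sample.
            age = math.exp(-self.decay_rate * (t - self.last_time))
            self.sw *= age
            self.st *= age
            self.stt *= age
            self.sy *= age
            self.sty *= age
        self.last_time = t
        self.sw += 1.0
        self.st += t
        self.stt += t * t
        self.sy += y
        self.sty += t * y

    def predict(self, t):
        if self.sw == 0.0:
            return 0.0
        det = self.sw * self.stt - self.st ** 2
        if det <= 1e-9 * self.sw * self.stt:
            # Too little spread in time to fit a slope: use the weighted mean.
            return self.sy / self.sw
        slope = (self.sw * self.sty - self.st * self.sy) / det
        intercept = (self.sy - slope * self.st) / self.sw
        return intercept + slope * t


class TrendEnsemble:
    """Trend models at several timescales, blended by forecast horizon."""

    def __init__(self, decay_rates):
        self.models = [DecayingLinearTrend(rate) for rate in decay_rates]

    def add(self, t, y):
        for model in self.models:
            model.add(t, y)

    def weights(self, horizon):
        # Assumed rule: weight ~ exp(-horizon * decay_rate), normalised.
        # Subtracting the largest exponent avoids underflow to all zeros.
        exponents = [-horizon * model.decay_rate for model in self.models]
        top = max(exponents)
        raw = [math.exp(e - top) for e in exponents]
        total = sum(raw)
        return [r / total for r in raw]

    def forecast(self, t):
        last = self.models[0].last_time
        horizon = 0.0 if last is None else max(0.0, t - last)
        blend = self.weights(horizon)
        return sum(w * m.predict(t) for w, m in zip(blend, self.models))
```

With decay rates such as `[0.5, 0.05, 0.005]`, all three models contribute equally at zero horizon, while a forecast far beyond the last sample is dominated by the slowest-decaying model. Any monotone weighting with the same limiting behaviour would serve the same purpose.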
Before merging these changes to master we need to:
- Review the results comparison (check anomaly results and model size).
- Upgrade state from earlier versions
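
To make the "fit and remove a trend before testing" idea from the periodicity item more concrete, here is a small Python sketch under stated assumptions. A plain list of equally spaced bucket values stands in for the sketch data structure, the trend is a least-squares line, and the periodicity test is a simple autocorrelation threshold. The function names and the 0.7 threshold are hypothetical, and the real test statistic may well differ.

```python
def fit_line(values):
    """Least-squares line through (index, value) pairs; returns (slope, intercept)."""
    n = len(values)
    mean_t = (n - 1) / 2.0
    mean_y = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in range(n))
    sty = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = sty / stt if stt > 0.0 else 0.0
    return slope, mean_y - slope * mean_t


def detrend(values):
    """Subtract the fitted line so that only the residual is tested."""
    slope, intercept = fit_line(values)
    return [y - (intercept + slope * t) for t, y in enumerate(values)]


def autocorrelation(values, lag):
    """Biased sample autocorrelation at the given lag."""
    n = len(values)
    if lag >= n:
        return 0.0
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    if variance == 0.0:
        return 0.0
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def has_period(values, period, threshold=0.7):
    """Assumed test: detrended autocorrelation at `period` exceeds a threshold."""
    return autocorrelation(detrend(values), period) >= threshold
```

For a series with a strong linear trend plus a 24-bucket cycle, the raw autocorrelation is high at almost any short lag because the trend dominates. After detrending, it is high at lag 24 and negative at lag 12. Because the test sees only the residual, the trend component can be modelled separately and from the start.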