Closed
Description
Currently, the time series forecasting framework and API are a standalone entity. We should turn it into an estimator so that it fits seamlessly into the training pipeline. Do the following:
- Make SSA forecasting an estimator API.
- Create a Fit(IDataView) that is used for training the forecasting model.
- Create a Fit(IDataView, SSAModel) that is used for updating the forecasting model from a column in the IDataView.
- Create a Transform(IDataView) that forecasts values up to a horizon that is read from a column-row in the IDataView. Here the forecasted values are represented as a variable-sized vector in the column.
- Create a Transform() that forecasts values as a new IDataView where each forecasted value is its own row.
On 6/21, @eerhardt, @ganik, and I agreed on the design below, which @artidoro also agrees with:
- Make SSA forecasting an estimator API that trains on an input column and produces output columns for forecast and min/max confidence intervals.
- The Transform(IDataView) call forecasts values while reading values from the input column and updating the forecast model. This means "I forecast values after having seen this new value in the input column," which is good for, say, real-time stock prediction. The forecast model is updated in real time, but it cannot be saved to disk for later use with the updated values. This will be useful for a rolling-CV time series evaluator.
- Expose a forecasting prediction engine from time series prediction that can forecast values and also allows changing forecasting model parameters such as the horizon and confidence intervals. It can forecast values without being fed any value, can update the model with new observations, and can save the updated model to disk at any time.
- Enhance TimeSeriesPrediction to handle anomaly detection and forecasting seamlessly and provide an experience very close to that of the regular prediction engine. This can be achieved by creating the following variants of Predict():
- TDst Predict(int? horizon = null, float? confidenceLevel = null): Forecasts based on output columns specified in TDst.
- TDst Predict(TSrc input, int? horizon = null, float? confidenceLevel = null): Updates the model with input and then predicts on TDst, here predict could be anomaly detection or forecasting.
- void Predict(TSrc input, ref TDst output, int? horizon = null, float? confidenceLevel = null): If input is null, it is a forecasting task. If output is null, it is an update only. If both are null, nothing is done. If both are present, the model is updated and a prediction is made on the output columns. This function overrides the void Predict(TSrc input, ref TDst output) that is exposed by the base prediction engine class.
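
The null-handling rules for the three Predict() variants can be sketched roughly as follows. This is only an illustration, not the ML.NET implementation: `ToyForecastEngine`, `Observation`, and `Forecast` are hypothetical names, the "model" is a running mean standing in for SSA, and the confidenceLevel parameter is left out for brevity.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input/output types standing in for TSrc and TDst.
public sealed class Observation { public float Value; }
public sealed class Forecast { public float[] Values = Array.Empty<float>(); }

// Toy engine: the "model" is just the mean of everything seen so far.
// It is NOT SSA; it exists only to show the dispatch rules.
public sealed class ToyForecastEngine
{
    private readonly List<float> _history = new List<float>();
    private readonly int _defaultHorizon;

    public ToyForecastEngine(IEnumerable<float> trainingSeries, int defaultHorizon)
    {
        _history.AddRange(trainingSeries);
        _defaultHorizon = defaultHorizon;
    }

    public int ObservationCount => _history.Count;

    // Variant 1: forecast without feeding in any value.
    public Forecast Predict(int? horizon = null)
    {
        Forecast? output = new Forecast();
        Predict(null, ref output, horizon);
        return output!;
    }

    // Variant 2: update the model with input, then forecast.
    public Forecast Predict(Observation input, int? horizon = null)
    {
        Forecast? output = new Forecast();
        Predict(input, ref output, horizon);
        return output!;
    }

    // Variant 3: null input means forecast only, null output means
    // update only, both null is a no-op, both present does both.
    public void Predict(Observation? input, ref Forecast? output, int? horizon = null)
    {
        if (input != null)
            _history.Add(input.Value);      // update step

        if (output == null)
            return;                         // update-only, or nothing to do

        int h = horizon ?? _defaultHorizon; // per-call override of the horizon
        float mean = _history.Count > 0 ? _history.Average() : 0f;
        output.Values = Enumerable.Repeat(mean, h).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        var engine = new ToyForecastEngine(new[] { 1f, 2f, 3f }, defaultHorizon: 2);

        // Forecast only, using the default horizon.
        Forecast f1 = engine.Predict();
        Console.WriteLine(string.Join(", ", f1.Values));

        // Update with a new observation, then forecast three steps ahead.
        Forecast f2 = engine.Predict(new Observation { Value = 6f }, horizon: 3);
        Console.WriteLine(string.Join(", ", f2.Values));

        // Update only: a null output means no forecast is produced.
        Forecast? none = null;
        engine.Predict(new Observation { Value = 8f }, ref none);
        Console.WriteLine(engine.ObservationCount);
    }
}
```

Routing the two convenience overloads through the `ref` overload keeps the update and forecast logic in one place, which is presumably why the proposal frames the third variant as the override of the base engine's Predict(TSrc input, ref TDst output).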