Easily evaluate your forecasts with (multivariate) Diebold-Mariano or (multivariate) Giacomini-White tests of Equal (Conditional) Predictive Ability, and with the Model Confidence Set (MCS). A procedure that reduces a pool of candidate forecasts to a final best set with equal predictive ability is also provided.
The multivariate Giacomini-White (MGW) test (Borup et al., 2022) is a generalization of the Giacomini-White (GW) test (Giacomini and White, 2006), which is in turn a generalization of the famous Diebold-Mariano (DM) test (Diebold and Mariano, 1995). In short, it allows testing Conditional Predictive Ability instead of just Equal Predictive Ability (as the DM test does). It is an asymptotic $\chi^2$ test of the null hypothesis of equal conditional predictive ability,

$H_0: E[h_t \otimes \Delta L_{t+1}] = 0$,

where $\Delta L_{t+1}$ is the vector of loss differentials between the $k$ compared forecasts and $h_t$ is a matrix of instruments measurable at time $t$ (for instance a constant and lags of the loss differentials). See the references for more information.
A small interpretation note: a p-value smaller than the chosen significance level (usually 0.05) means the null of equal (conditional) predictive ability is rejected.
The neat thing about the MGW test is that it reduces to:
- the univariate GW test (Giacomini and White, 2006) when comparing $k = 2$ methods with potential conditioning;
- the multivariate DM test (Mariano and Preve, 2012) when comparing $k > 2$ methods without conditioning;
- the univariate DM test (Diebold and Mariano, 1995) when comparing $k = 2$ methods without conditioning.

Hence, it can be used in all 3 cases.
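For instance, the conditional $k = 2$ case (the GW test) is obtained by passing two loss series together with instruments. A minimal sketch, assuming mgw accepts an instrument matrix through the same H argument as cmcs shown further below:

import numpy as np
from feval import helpers  # to easily compute losses
from feval import mgw

T = 101  # one extra observation to build a 1-lag instrument
F = np.vstack([np.random.rand(T), np.random.rand(T) + 0.5]).T  # k = 2 forecasts, the 2nd is biased
y = np.zeros(T) + 0.5  # Target
L = helpers.se(y, F)  # Squared loss
D = np.diff(L, axis=1)  # loss differential
D = np.roll(D, 1, axis=0)[:-1]  # 1-lagged differential
H = np.vstack([np.ones(T - 1), D.T]).T  # Instruments: a constant + the lagged differential
S, cval, pval = mgw(L[:-1, :], H=H)  # conditional GW test (the H keyword is an assumption here)
print(pval)  # should be small: the 2nd forecast is systematically worse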
However, the tests give no indication as to which of the compared methods is best. That is the job of the MCS procedure, adapted from Hansen et al. (2011). See the examples below.
Tested with Python >= 3.9.
# Create virtual environment
> python -m venv venv
# Activate it
> source ./venv/bin/activate
# Pip install from git
> pip install git+https://github.com/ogrnz/feval
# If already installed, you can also upgrade it
> pip install --upgrade git+https://github.com/ogrnz/feval
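To quickly check that the installation worked (an optional sanity check; the imports mirror the examples below):

# Check the install
> python -c "from feval import mgw, cmcs"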
These examples illustrate how to use the package.
Unconditional MGW example:
The forecasts are equally good, on average, and hence have the same predictive ability.
import numpy as np
from feval import helpers # to easily compute losses
from feval import mgw
T = 100
F = np.vstack([np.random.rand(T), np.random.rand(T), np.random.rand(T)]).T # random uniform forecasts [0,1)
y = np.zeros(T) + 0.5 # Target
L = helpers.ae(y, F) # Absolute loss
S, cval, pval = mgw(L) # Perform the test with default values
# We should get a large p-value, since all three forecasts are equally bad at predicting y
print(pval) # 0.61 (exact value can change due to randomness)
# As expected, the null of equal predictive ability is not rejected,
# we cannot say that a model is better than another
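Conversely, when one forecast is systematically worse than the other, the null should be rejected. A minimal variation of the example above:

import numpy as np
from feval import helpers  # to easily compute losses
from feval import mgw

T = 100
F = np.vstack([np.random.rand(T), np.random.rand(T) + 1.0]).T  # the 2nd forecast is clearly biased
y = np.zeros(T) + 0.5  # Target
L = helpers.ae(y, F)  # Absolute loss
S, cval, pval = mgw(L)
print(pval)  # should be close to 0: the null of equal predictive ability is rejected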
Unconditional CMCS example: the best forecast is the 3rd one, so it should be the only one in the best set.
import numpy as np
from feval import helpers # to easily compute losses
from feval import cmcs
T = 100
F = np.vstack(
    [np.random.rand(T) + 0.5,
     np.random.rand(T) + 1.0,
     np.random.rand(T),  # Only this forecast is 'good'
     np.random.rand(T) - 0.3]).T
y = np.zeros(T) + 0.5 # Target
L = helpers.se(y, F) # Squared loss
# Perform the cmcs with an HAC estimator, the Bartlett kernel and a significance level of 0.01
mcs, S, cval, pval, removed = cmcs(L, alpha=0.01, covar_style="hac", kernel="Bartlett")
print(mcs) # [0, 0, 1, 0], only the 3rd model is included in the best set
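Since mcs is a 0/1 indicator over the compared models, the indices of the models kept in the best set can be recovered directly (continuing the example above):

best = np.flatnonzero(mcs)  # indices of the models in the best set
print(best)  # [2], i.e. the 3rd model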
Conditional MCS example: the 1st and 3rd forecasts are equally good, while the others are biased. Here, the instruments are not needed, but they serve as an illustration.
import numpy as np
from feval import helpers # to easily compute losses
from feval import cmcs
# Conditional MCS
T = 101 # Set 1 more to allow 1 lag computation as instrument
F = np.vstack(
    [np.random.rand(T),  # This forecast is 'good'
     np.random.rand(T) + 0.5,
     np.random.rand(T),  # This forecast is 'good'
     np.random.rand(T) - 0.5]).T
y = np.zeros(T) + 0.5 # Target
L = helpers.se(y, F) # Squared loss
# Compute instruments as lags of the loss differentials
# (not needed here, but they illustrate the conditional use)
D = np.diff(L, axis=1)
D = np.roll(D, 1, axis=0)[:-1]
H = np.vstack([np.ones(T - 1), D.T]).T # Instruments, a constant + lags of loss differences
# Perform the cmcs with an HAC estimator with Parzen kernel
mcs, S, cval, pval, removed = cmcs(L[:-1, :], H=H, covar_style="hac", kernel="Parzen")
print(mcs) # [1, 0, 1, 0], only the 1st and 3rd models are included in the best set
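The instrument matrix H can hold anything known at forecast time. For instance, extending the example above to two lags of the loss differentials simply means stacking more columns (a sketch continuing the code above; p observations are dropped so that the wrap-around rows introduced by np.roll are discarded):

p = 2  # number of lags to use as instruments
D = np.diff(L, axis=1)  # loss differentials, shape (T, k-1)
lags = [np.roll(D, lag, axis=0)[p:] for lag in range(1, p + 1)]  # 1- and 2-lagged differentials
H = np.hstack([np.ones((T - p, 1))] + lags)  # Instruments: a constant + p lags
mcs, S, cval, pval, removed = cmcs(L[p:, :], H=H, covar_style="hac", kernel="Parzen")
print(mcs)  # should still be [1, 0, 1, 0]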
# Create virtual environment
> python -m venv venv
# Activate it
> source ./venv/bin/activate
# Clone the repo
> git clone https://github.com/ogrnz/feval
# Install it in editable mode for your user
> pip install -U --editable feval
Don't forget to test your code with the scripts in ./tests!
- The statistical tests have been "translated" from their original MATLAB code and unified under a single API;
- To keep the requirements minimal, the tests handle numpy arrays only;
- HAC covariance estimators are computed by the arch package, but can be computed by any Callable.
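For reference, such a callable could be a plain numpy implementation of a Bartlett-kernel long-run covariance estimator (an illustrative sketch only; check the package documentation for the exact signature expected by the tests):

import numpy as np

def bartlett_lrcov(X, n_lags=4):
    # Long-run covariance of the columns of X with Bartlett weights
    X = X - X.mean(axis=0)
    T = X.shape[0]
    S = X.T @ X / T  # lag-0 covariance
    for lag in range(1, n_lags + 1):
        w = 1 - lag / (n_lags + 1)  # Bartlett weight
        G = X[lag:].T @ X[:-lag] / T  # autocovariance at this lag
        S += w * (G + G.T)
    return S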
- Borup, D., Eriksen, J. N., Kjær, M. M., & Thyrsgaard, M. Predicting bond return predictability.
- Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13(3), 253-263.
- Giacomini, R., & White, H. (2006). Tests of conditional predictive ability. Econometrica, 74(6), 1545-1578.
- Hansen, P. R., Lunde, A., & Nason, J. M. (2011). The model confidence set. Econometrica, 79(2), 453-497.
- Mariano, R. S., & Preve, D. (2012). Statistical tests for multiple forecast comparison. Journal of Econometrics, 169(1), 123-130.