---
layout: metrics
mathjax: true
permalink: /metrics/
---
# Metrics
{: .anchor }
The Solar Forecast Arbiter evaluation framework provides a suite of metrics for evaluating deterministic and probabilistic solar forecasts. These metrics are used for different purposes, e.g., comparing the forecast and the measurement, comparing the performance of multiple forecasts, and evaluating an event forecast.

## Metrics for Deterministic Forecasts
{: .anchor }
The following metrics provide measures of the performance of deterministic forecasts. Each metric is computed from a set of $$ n $$ forecasts $$ (F_1, F_2, \dots, F_n) $$ and corresponding observations $$ (O_1, O_2, \dots, O_n) $$.

In the metrics below, we adopt the following nomenclature:

### Mean Absolute Error (MAE)
{: .anchor }
The absolute error is the absolute value of the difference between the forecasted and observed values. The MAE is defined as:

$$ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} | F_i - O_i | $$
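
As a rough illustration (not the Solar Forecast Arbiter implementation), the MAE of paired forecast and observation arrays can be computed with NumPy; the sample values below are made up.

```python
import numpy as np

# Made-up paired samples; any equal-length arrays work.
forecast = np.array([110.0, 180.0, 330.0])   # F_i
observed = np.array([100.0, 200.0, 300.0])   # O_i

# MAE: mean of the absolute errors |F_i - O_i|.
mae = np.mean(np.abs(forecast - observed))
print(mae)  # 20.0
```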

### Forecast Skill ($$ s $$)
{: .anchor }
The forecast skill measures the performance of a forecast relative to a reference forecast. The Solar Forecast Arbiter uses the definition of forecast skill based on RMSE:

$$ s = 1 - \frac{\text{RMSE}_f}{\text{RMSE}_{\text{ref}}} $$

where $$ \text{RMSE}_f $$ is the RMSE of the forecast of interest, and $$ \text{RMSE}_{\text{ref}} $$ is the RMSE of a reference forecast.
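
A minimal sketch of the skill calculation, assuming NumPy arrays for the forecast, a reference forecast (e.g., persistence), and the observations; the array values are illustrative only.

```python
import numpy as np

def rmse(fx, obs):
    """Root mean square error of paired forecast/observation arrays."""
    return np.sqrt(np.mean((fx - obs) ** 2))

obs = np.array([100.0, 200.0, 300.0, 250.0])
fx = np.array([110.0, 190.0, 320.0, 240.0])       # forecast of interest
fx_ref = np.array([150.0, 150.0, 250.0, 300.0])   # hypothetical reference forecast

# s = 1 - RMSE_f / RMSE_ref; s > 0 means the forecast improves on the reference.
skill = 1.0 - rmse(fx, obs) / rmse(fx_ref, obs)
print(round(skill, 3))  # 0.735
```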

### Mean Absolute Percentage Error (MAPE)
{: .anchor }
The absolute percentage error is the absolute value of the difference between the forecasted and observed values, normalized by the observed value. The MAPE is defined as:

$$ \text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{F_i - O_i}{O_i} \right| $$
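
A short sketch of the MAPE calculation; the masking of zero-valued observations (which would otherwise cause division by zero) is a choice made for this example, not something prescribed by the framework.

```python
import numpy as np

obs = np.array([500.0, 400.0, 0.0, 250.0])   # zeros can occur, e.g., overnight irradiance
fx = np.array([450.0, 440.0, 10.0, 200.0])

# Exclude zero observations to avoid division by zero (an example choice;
# how such points are handled is up to the analyst).
mask = obs != 0
mape = 100.0 * np.mean(np.abs((fx[mask] - obs[mask]) / obs[mask]))
print(round(mape, 2))  # 13.33
```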

### Pearson Correlation Coefficient ($$ r $$)
{: .anchor }
Correlation indicates the strength and direction of a linear relationship between two variables. The Pearson correlation coefficient, also known as the sample correlation coefficient, measures the linear dependency between the forecasted and observed values, and is defined as the ratio of the covariance of the variables to the product of their standard deviations:

$$ r = \frac{\text{cov}(F, O)}{\sigma_F \sigma_O} $$

where $$ \sigma_F $$ and $$ \sigma_O $$ are the standard deviations of the forecasted and observed values.

### Coefficient of Determination ($$ R^2 $$)
{: .anchor }
The coefficient of determination measures the extent to which the variability in the forecast errors is explained by variability in the observed values, and is defined as:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (O_i - F_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} $$

By this definition, a perfect forecast has a $$ R^2 $$ value of 1.
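
A sketch of both quantities with NumPy; `np.corrcoef` returns the Pearson correlation, and the coefficient of determination is coded in the 1 - SSE/SST form written above. The arrays are made-up examples.

```python
import numpy as np

obs = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
fx = np.array([110.0, 140.0, 210.0, 240.0, 320.0])

# Pearson correlation coefficient: covariance over the product of the
# standard deviations (np.corrcoef returns the full 2x2 correlation matrix).
r = np.corrcoef(fx, obs)[0, 1]

# Coefficient of determination in the 1 - SSE/SST form used above.
r2 = 1.0 - np.sum((obs - fx) ** 2) / np.sum((obs - np.mean(obs)) ** 2)

print(round(r, 4), round(r2, 4))
```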

### Kolmogorov-Smirnov Test Integral (KSI)
{: .anchor }
The KSI quantifies the level of agreement between the cumulative distribution functions (CDFs) of the forecasted and observed values, and is defined as:

$$ KSI = \int_{p_{\text{min}}}^{p_{\text{max}}} D_n \, dp $$

where $$ D_n $$ is the absolute difference between the two CDFs, evaluated over the range $$ [p_{\text{min}}, p_{\text{max}}] $$ of the forecast quantity.

A normalized KSI can be obtained by dividing by the critical area $$ a_{\text{critical}} = V_c (p_{\text{max}} - p_{\text{min}}) $$, where $$ V_c $$ is a critical value that depends on the number of samples.

### OVER
{: .anchor }
Conceptually, the OVER metric modifies the KSI to quantify the difference between the two CDFs, but only where the CDFs differ by more than a critical limit $$ V_c $$. The OVER is calculated as:

$$ OVER = \int_{p_{\text{min}}}^{p_{\text{max}}} D_n^* dp $$

where $$ D_n^* = \max(D_n - V_c, 0) $$, i.e., only the portion of $$ D_n $$ that exceeds the critical limit $$ V_c $$ contributes. The OVER metric can be normalized using the same approach as for KSI.
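
The sketch below builds empirical CDFs for both series on a shared grid and integrates their absolute difference numerically; it covers both KSI and OVER. The critical value `V_c = 1.63 / sqrt(n)` and the exceedance form `max(D_n - V_c, 0)` follow a common convention from the literature and are assumptions here, since their definitions are not included in this excerpt.

```python
import numpy as np

def ecdf(values, grid):
    """Empirical CDF of `values` evaluated at each point of `grid`."""
    return np.searchsorted(np.sort(values), grid, side="right") / len(values)

def integrate(y, x):
    """Trapezoidal integration of y over x."""
    return float(np.sum((y[:-1] + y[1:]) * np.diff(x) / 2.0))

rng = np.random.default_rng(0)
obs = rng.uniform(0.0, 1000.0, size=200)
fx = obs + rng.normal(0.0, 50.0, size=200)

# Shared grid spanning both samples (p_min to p_max).
grid = np.linspace(min(obs.min(), fx.min()), max(obs.max(), fx.max()), 200)
d_n = np.abs(ecdf(fx, grid) - ecdf(obs, grid))

# KSI: integral of D_n between p_min and p_max.
ksi = integrate(d_n, grid)

# OVER: integrate only the part of D_n exceeding the critical limit V_c.
# V_c = 1.63 / sqrt(n) is an assumed convention, not taken from this page.
v_c = 1.63 / np.sqrt(len(obs))
over = integrate(np.where(d_n > v_c, d_n - v_c, 0.0), grid)

print(round(ksi, 2), round(over, 2))
```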

### Combined Performance Index (CPI)
{: .anchor }
The CPI can be thought of as a combination of KSI, OVER, and RMSE:

## Metrics for Event Forecasts
{: .anchor }
An event forecast is evaluated by comparing forecasted and observed event occurrences: a true positive (TP) is an event that was forecasted and observed, a false positive (FP) is a forecasted event that was not observed, a false negative (FN) is an observed event that was not forecasted, and a true negative (TN) is a correctly forecasted non-event. By counting the number of TP, FP, TN, and FN values, the following metrics can be computed.

### Probability of Detection (POD)
{: .anchor }
The POD is the fraction of observed events correctly forecasted as events:

$$ POD = \frac{TP}{TP + FN} $$

### False Alarm Ratio (FAR)
{: .anchor }
The FAR is the fraction of forecasted events that did not occur:

$$ FAR = \frac{FP}{TP + FP} $$

### Probability of False Detection (POFD)
{: .anchor }
The POFD is the fraction of observed non-events that were forecasted as events:

$$ POFD = \frac{FP}{FP + TN} $$

### Critical Success Index (CSI)
{: .anchor }
The CSI evaluates how well an event forecast predicts observed events, e.g., ramps in irradiance or power. The CSI is the relative frequency of hits, i.e., how well predicted "yes" events correspond to observed "yes" events:

$$ CSI = \frac{TP}{TP + FP + FN} $$

### Event Bias (EBIAS)
{: .anchor }
The EBIAS is the ratio of the counts of forecasted and observed events:

$$ EBIAS = \frac{TP + FP}{TP + FN} $$

### Event Accuracy (EA)
{: .anchor }
The EA is the fraction of events that were forecasted correctly, i.e., forecast = "yes" and observed = "yes" or forecast = "no" and observed = "no":

$$ EA = \frac{TP + TN}{n} $$

where $$ n $$ is the number of samples.
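
A compact sketch that derives the contingency counts (TP, FP, FN, TN) from boolean event series and evaluates the event metrics above; the series are made-up examples.

```python
import numpy as np

# Made-up event series: True where an event (e.g., a ramp) is forecast/observed.
fx_event = np.array([True, True, False, True, False, False, True, False])
obs_event = np.array([True, False, False, True, True, False, True, True])

tp = np.sum(fx_event & obs_event)     # forecast yes, observed yes
fp = np.sum(fx_event & ~obs_event)    # forecast yes, observed no
fn = np.sum(~fx_event & obs_event)    # forecast no, observed yes
tn = np.sum(~fx_event & ~obs_event)   # forecast no, observed no
n = len(obs_event)

pod = tp / (tp + fn)            # probability of detection
far = fp / (tp + fp)            # false alarm ratio
pofd = fp / (fp + tn)           # probability of false detection
csi = tp / (tp + fp + fn)       # critical success index
ebias = (tp + fp) / (tp + fn)   # event bias
ea = (tp + tn) / n              # event accuracy

print(pod, far, pofd, csi, ebias, ea)
```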

## Metrics for Probabilistic Forecasts
{: .anchor }
Probabilistic forecasts represent uncertainty in the forecast quantity by providing a probability distribution or a prediction interval, rather than a single value.

In the metrics below, we adopt the following nomenclature:

### Brier Score (BS)
{: .anchor }
The BS measures the accuracy of forecast probability for one or more events:

$$ BS = \frac{1}{n} \sum_{t=1}^{n} \left( f(t) - o(t) \right)^2 $$

where $$ f(t) $$ is the forecast probability of the event at time $$ t $$, and $$ o(t) $$ is 1 if the event occurred and 0 if it did not.

### Uncertainty (UNC)
{: .anchor }
The UNC is given by:

$$ \text{UNC} = \bar{o} (1 - \bar{o}) $$

Uncertainty is the variance of the event indicator $$ o(t) $$. Low values of UNC indicate that the event occurs either very rarely or very frequently, i.e., $$ \bar{o} $$ is close to 0 or 1.
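
A small sketch of the Brier score and uncertainty terms for made-up forecast probabilities and event outcomes; the arrays are placeholders.

```python
import numpy as np

# Made-up forecast probabilities f(t) and event indicators o(t).
f = np.array([0.9, 0.1, 0.7, 0.3, 0.8])
o = np.array([1.0, 0.0, 1.0, 1.0, 0.0])

# Brier score: mean squared difference between probability and outcome.
bs = np.mean((f - o) ** 2)

# Uncertainty: variance of the event indicator, obar * (1 - obar).
obar = np.mean(o)
unc = obar * (1.0 - obar)

print(round(bs, 3), round(unc, 3))  # 0.248 0.24
```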

### Sharpness (SH)
{: .anchor }
The SH represents the degree of "concentration" of a forecast comprising a prediction interval of the form $$ [ f_l, f_u ] $$ within which the forecast quantity is expected to fall with probability $$ 1 - \beta $$. A good forecast should have a low sharpness value. The prediction interval endpoints are associated with quantiles $$ \alpha_l $$ and $$ \alpha_u $$, where $$ \alpha_u - \alpha_l = 1 - \beta $$. For a single prediction interval, the SH is:

$$ SH = f_u - f_l $$

### Continuous Ranked Probability Score (CRPS)
{: .anchor }
The CRPS is a score designed to measure both the reliability and sharpness of a probabilistic forecast. For a time series of forecasts comprising a CDF at each time point, the CRPS is:

$$ CRPS = \frac{1}{n} \sum_{t=1}^{n} \int \left( F_t(x) - \mathbf{1}\{x \geq O_t\} \right)^2 dx $$

where $$ F_t(x) $$ is the forecast CDF at time $$ t $$ and $$ \mathbf{1}\{x \geq O_t\} $$ is the step-function CDF of the corresponding observation $$ O_t $$.
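
A sketch of sharpness and CRPS for a forecast represented as an ensemble of members at each time step; the ensemble representation, the 25th to 75th percentile interval, and the integration grid are all choices made for this example rather than requirements of the framework.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical probabilistic forecast: an ensemble of members per time step.
members = rng.normal(loc=500.0, scale=40.0, size=(10, 50))  # 10 times x 50 members
obs = rng.normal(loc=500.0, scale=40.0, size=10)

# Sharpness of a single central prediction interval, here the 25th to 75th
# percentile band: SH = f_u - f_l (averaged over time for a summary number).
f_l, f_u = np.percentile(members, [25, 75], axis=1)
sharpness = np.mean(f_u - f_l)

# CRPS: integrate the squared difference between the forecast CDF and the
# step function at the observation, then average over time steps.
grid = np.linspace(300.0, 700.0, 401)   # integration grid (dx = 1.0)
dx = grid[1] - grid[0]
crps_t = []
for m, y in zip(members, obs):
    cdf = np.searchsorted(np.sort(m), grid, side="right") / m.size
    step = (grid >= y).astype(float)
    crps_t.append(np.sum((cdf - step) ** 2) * dx)
crps = np.mean(crps_t)

print(round(sharpness, 1), round(crps, 2))
```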