
Commit a096491

lboeman authored and wholmgren committed

add sidebar to metrics (#76)

* add sidebar to metrics
* add mathjax to the metrics layout

1 parent adf5e4b commit a096491

File tree

3 files changed: +103 −1 lines changed


_includes/metrics_sidebar.html

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+<div class="contents-menu">
+  <h2>Contents</h2>
+  <ul class="contents-list">
+    <li><a href="#metrics">Introduction</a></li>
+    <li><a href="#metrics-for-deterministic-forecasts">Metrics for Deterministic Forecasts</a></li>
+    <ol type="A">
+      <li><a href="#mean-absolute-error-mae">Mean Absolute Error (MAE)</a></li>
+      <li><a href="#mean-bias-error-mbe">Mean Bias Error (MBE)</a></li>
+      <li><a href="#root-mean-square-error-rmse">Root Mean Square Error (RMSE)</a></li>
+      <li><a href="#forecast-skill-s">Forecast Skill</a></li>
+      <li><a href="#mean-absolute-percentage-error-mape">Mean Absolute Percentage Error (MAPE)</a></li>
+      <li><a href="#normalized-root-mean-square-error-nrmse">Normalized Root Mean Square Error (NRMSE)</a></li>
+      <li><a href="#centered-unbiased-root-mean-square-error-crmse">Centered (unbiased) Root Mean Square Error (CRMSE)</a></li>
+      <li><a href="#pearson-correlation-coefficient-r">Pearson Correlation Coefficient</a></li>
+      <li><a href="#coefficient-of-determination-r2">Coefficient of Determination</a></li>
+      <li><a href="#kolmogorov-smirnov-test-integral-ksi">Kolmogorov-Smirnov Test Integral (KSI)</a></li>
+      <li><a href="#over">OVER</a></li>
+      <li><a href="#combined-performance-index-cpi">Combined Performance Index (CPI)</a></li>
+    </ol>
+    <li><a href="#metrics-for-deterministic-event-forecasts">Metrics for Deterministic Event Forecasts</a></li>
+    <ol type="A">
+      <li><a href="#probability-of-detection-pod">Probability of Detection (POD)</a></li>
+      <li><a href="#false-alarm-ratio-far">False Alarm Ratio (FAR)</a></li>
+      <li><a href="#probability-of-false-detection-pofd">Probability of False Detection (POFD)</a></li>
+      <li><a href="#critical-success-index-csi">Critical Success Index (CSI)</a></li>
+      <li><a href="#event-bias-ebias">Event Bias (EBIAS)</a></li>
+      <li><a href="#event-accuracy-ea">Event Accuracy (EA)</a></li>
+    </ol>
+    <li><a href="#metrics-for-probablistic-forecasts">Metrics for Probablistic Forecasts</a></li>
+    <ol type="A">
+      <li><a href="#brier-score-bs">Brier Score (BS)</a></li>
+      <li><a href="#brier-skill-score-bss">Brier Skill Score (BSS)</a></li>
+      <li><a href="#reliability-rel">Reliability (REL)</a></li>
+      <li><a href="#resolution-res">Resolution (RES)</a></li>
+      <li><a href="#uncertainty-unc">Uncertainty (UNC)</a></li>
+      <li><a href="#sharpness-sh">Sharpness (SH)</a></li>
+      <li><a href="#continuous-ranked-probability-score-crps">Continuous Ranked Probability Score (CRPS)</a></li>
+    </ol>
+  </ul>
+</div>

_layouts/metrics.html

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+<!doctype html>
+<html lang="en" class="h-100">
+<head>
+  {% include header.html %}
+  {% if page.mathjax != false %}
+  {% include mathjax.html %}
+  {% endif %}
+</head>
+
+<body class="d-flex flex-column h-100">
+  <header>
+    {% include navbar.html %}
+  </header>
+  <!-- Begin page content -->
+  <main role="main" class="flex-shrink-0">
+    <div class="container">
+      <div class="sidebar">
+        {% include metrics_sidebar.html %}
+      </div>
+      <div class="content-wrapper-sidebar">
+        {{ content }}
+      </div>
+    </div>
+  </main>
+
+  <footer class="footer mt-auto py-3">
+    <div class="container">
+      {% include footer.html %}
+    </div>
+  </footer>
+</body>
+</html>

metrics.md

Lines changed: 30 additions & 1 deletion
@@ -1,14 +1,16 @@
---
-layout: base
+layout: metrics
mathjax: true
permalink: /metrics/
---

# Metrics
+{: .anchor }
The Solar Forecast Arbiter evaluation framework provides a suite of metrics for evaluating deterministic and probabilistic solar forecasts. These metrics are used for different purposes, e.g., comparing the forecast and the measurement, comparing the performance of multiple forecasts, and evaluating an event forecast.


## Metrics for Deterministic Forecasts
+{: .anchor }
The following metrics provide measures of the performance of deterministic forecasts. Each metric is computed from a set of $$ n $$ forecasts $$ (F_1, F_2, \dots, F_n) $$ and corresponding observations $$ (O_1, O_2, \dots, O_n) $$.

In the metrics below, we adopt the following nomenclature:
@@ -20,18 +22,21 @@ In the metrics below, we adopt the following nomenclature:


### Mean Absolute Error (MAE)
+{: .anchor }
The absolute error is the absolute value of the difference between the forecasted and observed values. The MAE is defined as:

$$ \text{MAE} = \frac{1}{n} \sum_{i=1}^n \lvert F_i - O_i \rvert $$


### Mean Bias Error (MBE)
+{: .anchor }
The bias is the difference between the forecasted and observed values. The MBE is defined as:

$$ \text{MBE} = \frac{1}{n} \sum_{i=1}^n (F_i - O_i) $$


### Root Mean Square Error (RMSE)
+{: .anchor }
The RMSE is the square root of the average of the squared differences between the forecasted and observed values, and is defined as:

$$ \text{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^n (F_i - O_i)^2 } $$
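As a concrete illustration, the three error measures above can be written in a few lines of NumPy. This is a sketch with made-up arrays, not code from the Solar Forecast Arbiter itself:

```python
import numpy as np

def mae(forecast, observed):
    """Mean absolute error: average magnitude of the errors."""
    return np.mean(np.abs(forecast - observed))

def mbe(forecast, observed):
    """Mean bias error: positive values indicate over-forecasting."""
    return np.mean(forecast - observed)

def rmse(forecast, observed):
    """Root mean square error: penalizes large errors more heavily than MAE."""
    return np.sqrt(np.mean((forecast - observed) ** 2))

F = np.array([110.0, 90.0, 100.0, 80.0])    # illustrative forecasts, e.g. W/m^2
O = np.array([100.0, 100.0, 100.0, 100.0])  # illustrative observations

print(mae(F, O))   # 10.0
print(mbe(F, O))   # -5.0
print(rmse(F, O))  # sqrt(150) ≈ 12.25
```

Note how the same errors produce different summaries: the bias partially cancels in MBE, while RMSE exceeds MAE because of the single large error.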
@@ -40,6 +45,7 @@ RMSE is a frequently used measure for evaluating forecast accuracy. Since the er


### Forecast Skill ($$ s $$)
+{: .anchor }
The forecast skill measures the performance of a forecast relative to a reference forecast. The Solar Forecast Arbiter uses the definition of forecast skill based on RMSE:

$$ s = 1 - \frac{\text{RMSE}_f}{\text{RMSE}_{\text{ref}}} $$
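A minimal sketch of the skill calculation (the RMSE values here are illustrative, e.g. a forecast against a persistence reference):

```python
def forecast_skill(rmse_forecast, rmse_reference):
    """RMSE-based forecast skill: 1 is a perfect forecast, 0 matches the
    reference, and negative values are worse than the reference."""
    return 1.0 - rmse_forecast / rmse_reference

# a forecast with RMSE 30 W/m^2 against a reference with RMSE 50 W/m^2
print(forecast_skill(30.0, 50.0))  # 0.4
```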
@@ -48,18 +54,21 @@ where $$ \text{RMSE}_f $$ is the RMSE of the forecast of interest, and $$ \text{


### Mean Absolute Percentage Error (MAPE)
+{: .anchor }
The absolute percentage error is the absolute value of the difference between the forecasted and observed values, normalized by the observed value. The MAPE is defined as:

$$ \text{MAPE} = 100\% \cdot \frac{1}{n} \sum_{i=1}^n \left\lvert \frac{F_i - O_i}{O_i} \right\rvert $$


### Normalized Root Mean Square Error (NRMSE)
+{: .anchor }
The NRMSE [%] is the normalized form of the RMSE and is defined as:

$$ \text{NRMSE} = \frac{100\%}{\text{norm}} \cdot \sqrt{ \frac{1}{n} \sum_{i=1}^n (F_i - O_i)^2 } $$


### Centered (unbiased) Root Mean Square Error (CRMSE)
+{: .anchor }
The CRMSE describes the variation in errors around the mean and is defined as:

$$ \text{CRMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^n \left( (F_i - \bar{F}) - (O_i - \bar{O}) \right)^2 } $$
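Because CRMSE removes the mean bias from the error, it satisfies the identity $$ \text{RMSE}^2 = \text{MBE}^2 + \text{CRMSE}^2 $$, which is easy to verify numerically. A sketch with synthetic data (not project code):

```python
import numpy as np

rng = np.random.default_rng(42)
O = rng.uniform(0.0, 1000.0, 100)       # synthetic observations
F = O + rng.normal(20.0, 50.0, 100)     # biased, noisy forecasts

mbe = np.mean(F - O)
rmse = np.sqrt(np.mean((F - O) ** 2))
crmse = np.sqrt(np.mean(((F - F.mean()) - (O - O.mean())) ** 2))

# the bias and the centered error together account for the full RMSE
assert np.isclose(rmse ** 2, mbe ** 2 + crmse ** 2)
```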
@@ -72,13 +81,15 @@ where $$ \sigma_F $$ and $$ \sigma_O $$ are the standard deviations of the forec


### Pearson Correlation Coefficient ($$ r $$)
+{: .anchor }
Correlation indicates the strength and direction of a linear relationship between two variables. The Pearson correlation coefficient, also known as the sample correlation coefficient, measures the linear dependence between the forecasted and observed values, and is defined as the ratio of the covariance of the variables to the product of their standard deviations:

$$ r = \frac{ \sum_{i=1}^n (F_i - \bar{F}) (O_i - \bar{O}) }{
\sqrt{ \sum_{i=1}^n (F_i - \bar{F})^2} \times \sqrt{ \sum_{i=1}^n (O_i - \bar{O})^2 } } $$


### Coefficient of Determination ($$ R^2 $$)
+{: .anchor }
The coefficient of determination measures the extent to which the variability in the forecast errors is explained by variability in the observed values, and is defined as:

$$ R^2 = 1 - \frac{ \sum_{i=1}^n (O_i - F_i)^2 }{ \sum_{i=1}^n (O_i - \bar{O})^2 } $$
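The two definitions above can be checked against each other with NumPy; note that in this form $$ R^2 $$ is not simply $$ r^2 $$, because it also penalizes bias. The arrays here are illustrative:

```python
import numpy as np

F = np.array([95.0, 110.0, 120.0, 140.0, 160.0])   # illustrative forecasts
O = np.array([100.0, 105.0, 125.0, 138.0, 155.0])  # illustrative observations

# Pearson r from the definition above
num = np.sum((F - F.mean()) * (O - O.mean()))
den = np.sqrt(np.sum((F - F.mean()) ** 2)) * np.sqrt(np.sum((O - O.mean()) ** 2))
r = num / den

# R^2 from the definition above
r2 = 1.0 - np.sum((O - F) ** 2) / np.sum((O - O.mean()) ** 2)

# the hand-rolled r matches NumPy's correlation matrix
assert np.isclose(r, np.corrcoef(F, O)[0, 1])
print(r, r2)
```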
@@ -87,6 +98,7 @@ By this definition, a perfect forecast has a $$ R^2 $$ value of 1.


### Kolmogorov-Smirnov Test Integral (KSI)
+{: .anchor }
The KSI quantifies the level of agreement between the cumulative distribution functions (CDFs) of the forecasted and observed values, and is defined as:

$$ \text{KSI} = \int_{p_{\text{min}}}^{p_{\text{max}}} D_n(p) dp $$
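One way to sketch this numerically is to build empirical CDFs on a shared grid and integrate their absolute difference $$ D_n(p) $$ with a trapezoidal sum. The helper names and synthetic data below are illustrative, not the project's implementation:

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Fraction of samples less than or equal to each grid point."""
    return np.searchsorted(np.sort(samples), grid, side="right") / len(samples)

def ksi(forecast, observed, num_points=1000):
    """Integrate |CDF_F(p) - CDF_O(p)| over the pooled value range."""
    lo = min(forecast.min(), observed.min())
    hi = max(forecast.max(), observed.max())
    grid = np.linspace(lo, hi, num_points)
    d_n = np.abs(empirical_cdf(forecast, grid) - empirical_cdf(observed, grid))
    # trapezoidal integration of D_n over the grid
    return np.sum((d_n[1:] + d_n[:-1]) / 2.0 * np.diff(grid))

rng = np.random.default_rng(0)
obs = rng.normal(500.0, 100.0, 1000)
print(ksi(obs, obs))       # 0.0: identical distributions agree everywhere
print(ksi(obs + 50, obs))  # > 0: a shifted distribution does not
```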
@@ -107,6 +119,7 @@ where $$ a_{\text{critical}} = V_c (p_{\text{max}} - p_{\text{min}}) $$ and $$ V


### OVER
+{: .anchor }
Conceptually, the OVER metric modifies the KSI to quantify the difference between the two CDFs, but only where the CDFs differ by more than a critical limit $$ V_c $$. The OVER is calculated as:

$$ \text{OVER} = \int_{p_{\text{min}}}^{p_{\text{max}}} D_n^* dp $$
@@ -122,12 +135,14 @@ The OVER metric can be normalized using the same approach as for KSI.


### Combined Performance Index (CPI)
+{: .anchor }
The CPI can be thought of as a combination of KSI, OVER, and RMSE:

$$ \text{CPI} = \frac{1}{4} ( \text{KSI} + \text{OVER} + 2 \times \text{RMSE} ) $$


## Metrics for Deterministic Event Forecasts
+{: .anchor }
An event is defined by values that exceed or fall below a threshold. A typical event is a ramp in solar power generation, which is determined by:

$$ | P(t + \Delta t) - P(t) | > \text{Ramp Forecasting Threshold} $$
@@ -145,36 +160,42 @@ By then counting the number of TP, FP, TN and FN values, the following metri


### Probability of Detection (POD)
+{: .anchor }
The POD is the fraction of observed events correctly forecasted as events:

$$ POD = \frac{TP}{TP + FN} $$


### False Alarm Ratio (FAR)
+{: .anchor }
The FAR is the fraction of forecasted events that did not occur:

$$ FAR = \frac{FP}{TP + FP} $$


### Probability of False Detection (POFD)
+{: .anchor }
The POFD is the fraction of observed non-events that were forecasted as events:

$$ POFD = \frac{FP}{FP + TN} $$


### Critical Success Index (CSI)
+{: .anchor }
The CSI evaluates how well an event forecast predicts observed events, e.g., ramps in irradiance or power. The CSI is the relative frequency of hits, i.e., how well predicted "yes" events correspond to observed "yes" events:

$$ CSI = \frac{TP}{TP + FP + FN} $$


### Event Bias (EBIAS)
+{: .anchor }
The EBIAS is the ratio of counts of forecast and observed events:

$$ EBIAS = \frac{TP + FP}{TP + FN} $$


### Event Accuracy (EA)
+{: .anchor }
The EA is the fraction of events that were forecasted correctly, i.e., forecast = "yes" and observed = "yes" or forecast = "no" and observed = "no":

$$ EA = \frac{TP + TN}{TP + FP + TN + FN} = \frac{TP + TN}{n} $$
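Counting TP, FP, TN, and FN from boolean event series and forming the six metrics above can be sketched as follows (the event indicators are illustrative):

```python
import numpy as np

# illustrative event indicators (True = event occurred / was forecast)
forecast = np.array([True, True, False, False, True, False, True, False])
observed = np.array([True, False, False, True, True, False, True, False])

tp = np.sum(forecast & observed)     # hits
fp = np.sum(forecast & ~observed)    # false alarms
fn = np.sum(~forecast & observed)    # misses
tn = np.sum(~forecast & ~observed)   # correct rejections

pod = tp / (tp + fn)                 # 0.75
far = fp / (tp + fp)                 # 0.25
pofd = fp / (fp + tn)                # 0.25
csi = tp / (tp + fp + fn)            # 0.6
ebias = (tp + fp) / (tp + fn)        # 1.0: forecast and observed counts match
ea = (tp + tn) / (tp + fp + tn + fn) # 0.75
```

An EBIAS of 1 only means the forecast produced as many events as were observed; as here, it does not imply the individual events were correctly placed.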
@@ -183,6 +204,7 @@ where $$ n $$ is the number of samples.


## Metrics for Probablistic Forecasts
+{: .anchor }
Probabilistic forecasts represent uncertainty in the forecast quantity by providing a probability distribution or a prediction interval, rather than a single value.

In the metrics below, we adopt the following nomenclature:
@@ -199,6 +221,7 @@ In the metrics below, we adopt the following nomenclature:


### Brier Score (BS)
+{: .anchor }
The BS measures the accuracy of forecast probability for one or more events:

$$ \text{BS} = \frac{1}{n} \sum_{i=1}^n (f_i - o_i)^2 $$
@@ -207,6 +230,7 @@ Smaller values of BS indicate better agreement between forecasts and observation


### Brier Skill Score (BSS)
+{: .anchor }
The BSS is based on the BS and measures the performance of a probability forecast relative to a reference forecast:

$$ BSS = 1 - \frac{\text{BS}_f}{\text{BS}_{\text{ref}}} $$
@@ -219,6 +243,7 @@ $$ \text{BS} = \text{REL} - \text{RES} + \text{UNC} $$


### Reliability (REL)
+{: .anchor }
The REL is given by:

$$ \text{REL} = \frac{1}{n} \sum_{i=1}^I N_i (f_i - \bar{o}_i)^2 $$
@@ -227,6 +252,7 @@ Reliability is the weighted average of the squared differences between the fore


### Resolution (RES)
+{: .anchor }
The RES is given by:

$$ \text{RES} = \frac{1}{n} \sum_{i=1}^I N_i (\bar{o}_i - \bar{o})^2 $$
@@ -235,6 +261,7 @@ Resolution is the weighted average of the squared differences between the relea


### Uncertainty (UNC)
+{: .anchor }
The UNC is given by:

$$ \text{UNC} = \bar{o} (1 - \bar{o}) $$
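With forecasts binned into $$ I $$ discrete probability values, the three components recover the Brier score exactly; note that in Murphy's decomposition resolution enters with a negative sign, $$ \text{BS} = \text{REL} - \text{RES} + \text{UNC} $$. A small check with a toy set of binary events (illustrative data, not project code):

```python
import numpy as np

f = np.array([0.2, 0.2, 0.8, 0.8, 0.8, 0.2])  # forecast probabilities (2 bins)
o = np.array([0, 1, 1, 1, 0, 0])               # event occurred (1) or not (0)
n = len(f)

bs = np.mean((f - o) ** 2)

o_bar = o.mean()          # climatological event frequency
rel = res = 0.0
for p in np.unique(f):    # one term per distinct forecast probability
    in_bin = f == p
    n_i, o_bar_i = in_bin.sum(), o[in_bin].mean()
    rel += n_i * (p - o_bar_i) ** 2
    res += n_i * (o_bar_i - o_bar) ** 2
rel /= n
res /= n
unc = o_bar * (1 - o_bar)

# Murphy decomposition: resolution improves (reduces) the score
assert np.isclose(bs, rel - res + unc)
```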
@@ -243,6 +270,7 @@ Uncertainty is the variance of the event indicator $$ o(t) $$. Low values of UNC


### Sharpness (SH)
+{: .anchor }
The SH represents the degree of "concentration" of a forecast comprising a prediction interval of the form $$ [ f_l, f_u ] $$ within which the forecast quantity is expected to fall with probability $$ 1 - \beta $$. A good forecast should have a low sharpness value. The prediction interval endpoints are associated with quantiles $$ \alpha_l $$ and $$ \alpha_u $$, where $$ \alpha_u - \alpha_l = 1 - \beta $$. For a single prediction interval, the SH is:

$$ \text{SH} = f_u - f_l $$
@@ -253,6 +281,7 @@ $$ \text{SH} = \frac{1}{n} \sum_{i=1}^n f_{u,i} - f_{l, i} $$


### Continuous Ranked Probability Score (CRPS)
+{: .anchor }
The CRPS is a score designed to measure both the reliability and sharpness of a probabilistic forecast. For a timeseries of forecasts comprising a CDF at each time point, the CRPS is:

$$ \text{CRPS} = \frac{1}{n} \sum_{i=1}^n \int ( F_i(x) - O_i(x) )^2 dx $$
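Using the standard squared-difference form of the CRPS integral, the score for a single time point can be checked numerically: a forecast whose CDF is uniform on $$[0, 1]$$ evaluated against an observation of 0.5 (a step-function CDF) has an analytic CRPS of $$ 1/12 $$. A discretized sketch (illustrative, not the project's implementation):

```python
import numpy as np

# forecast CDF: F(x) = x on [0, 1]; observed CDF: step at the observation 0.5
x = np.linspace(0.0, 1.0, 100001)
forecast_cdf = x
observed_cdf = (x >= 0.5).astype(float)

# trapezoidal integration of the squared CDF difference
sq_diff = (forecast_cdf - observed_cdf) ** 2
crps = np.sum((sq_diff[1:] + sq_diff[:-1]) / 2.0 * np.diff(x))

# analytic value: 2 * integral of x^2 from 0 to 0.5 = 1/12
assert np.isclose(crps, 1.0 / 12.0, atol=1e-4)
```

For a deterministic forecast (both CDFs are step functions), this integral reduces to the absolute error, so the CRPS of a timeseries of deterministic forecasts reduces to the MAE.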
