@seabbs seabbs commented Jan 12, 2023

This PR adds support for scoring quantile forecasts that have a sample column. It closes #242. Whilst I've added some testing to protect against issues, it may be that the "protected" column assumptions are baked into places I have missed, so this change is still somewhat risky.

See the following example for the new functionality:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(scoringutils)

n_sim <- 1000
epsilon <- rnorm(n_sim)
Y <- exp(epsilon)

forecasts <- expand.grid(
  sigma = 1:20/10, 
  quantile = c(0.01, 0.025, 1:19/20, 0.975, 0.99)
)

forecasts <- forecasts |>
  as_tibble() |>
  mutate(model = 10 * sigma,
         prediction = exp(qnorm(quantile, sd = sigma)),
         true_value = list(Y),
         sample = list(1:length(Y))) |>
  unnest(c(true_value, sample))

check_forecasts(forecasts)
#> Your forecasts seem to be for a target of the following type:
#> $target_type
#> [1] "continuous"
#> 
#> and in the following format:
#> $prediction_type
#> [1] "quantile"
#> 
#> The unit of a single forecast is defined by:
#> $forecast_unit
#> [1] "sigma"  "model"  "sample"
#> 
#> Cleaned data, rows with NA values in prediction or true_value removed:
#> $cleaned_data
#>         sigma quantile model  prediction true_value sample
#>         <num>    <num> <num>       <num>      <num>  <int>
#>      1:   0.1     0.01     1   0.7924429  0.8022296      1
#>      2:   0.1     0.01     1   0.7924429  0.4194004      2
#>      3:   0.1     0.01     1   0.7924429  0.7896071      3
#>      4:   0.1     0.01     1   0.7924429  0.5963944      4
#>      5:   0.1     0.01     1   0.7924429  0.6159169      5
#>     ---                                                   
#> 459996:   2.0     0.99    20 104.8673007  0.6357381    996
#> 459997:   2.0     0.99    20 104.8673007  0.3471260    997
#> 459998:   2.0     0.99    20 104.8673007  0.1470999    998
#> 459999:   2.0     0.99    20 104.8673007  0.3589190    999
#> 460000:   2.0     0.99    20 104.8673007  1.2826372   1000
#> 
#> Number of unique values per column per model:
#> $unique_values
#>     model sigma quantile prediction true_value sample
#>     <num> <int>    <int>      <int>      <int>  <int>
#>  1:     1     1       23         23       1000   1000
#>  2:     2     1       23         23       1000   1000
#>  3:     3     1       23         23       1000   1000
#>  4:     4     1       23         23       1000   1000
#>  5:     5     1       23         23       1000   1000
#>  6:     6     1       23         23       1000   1000
#>  7:     7     1       23         23       1000   1000
#>  8:     8     1       23         23       1000   1000
#>  9:     9     1       23         23       1000   1000
#> 10:    10     1       23         23       1000   1000
#> 11:    11     1       23         23       1000   1000
#> 12:    12     1       23         23       1000   1000
#> 13:    13     1       23         23       1000   1000
#> 14:    14     1       23         23       1000   1000
#> 15:    15     1       23         23       1000   1000
#> 16:    16     1       23         23       1000   1000
#> 17:    17     1       23         23       1000   1000
#> 18:    18     1       23         23       1000   1000
#> 19:    19     1       23         23       1000   1000
#> 20:    20     1       23         23       1000   1000
#>     model sigma quantile prediction true_value sample
scores <- score(forecasts)
summarise_scores(scores, by = "sample")
#>       sample interval_score dispersion underprediction overprediction
#>        <int>          <num>      <num>           <num>          <num>
#>    1:      1      0.3549503  0.3302979      0.00000000     0.02465237
#>    2:      2      0.5136531  0.3302979      0.00000000     0.18335521
#>    3:      3      0.3580064  0.3302979      0.00000000     0.02770844
#>    4:      4      0.4237402  0.3302979      0.00000000     0.09344226
#>    5:      5      0.4155050  0.3302979      0.00000000     0.08520709
#>   ---                                                                
#>  996:    996      0.4075680  0.3302979      0.00000000     0.07727007
#>  997:    997      0.5581558  0.3302979      0.00000000     0.22785787
#>  998:    998      0.7055681  0.3302979      0.00000000     0.37527021
#>  999:    999      0.5505211  0.3302979      0.00000000     0.22022319
#> 1000:   1000      0.3718230  0.3302979      0.04152506     0.00000000
#>       coverage_deviation    bias ae_median
#>                    <num>   <num>     <num>
#>    1:        0.206086957  0.3140 0.1977704
#>    2:       -0.141739130  0.6815 0.5805996
#>    3:        0.188695652  0.3300 0.2103929
#>    4:        0.006086957  0.5275 0.4036056
#>    5:        0.027826087  0.5050 0.3840831
#>   ---                                     
#>  996:        0.053913043  0.4790 0.3642619
#>  997:       -0.211304348  0.7515 0.6528740
#>  998:       -0.385217391  0.9090 0.8529001
#>  999:       -0.198260870  0.7365 0.6410810
#> 1000:        0.184347826 -0.3350 0.2826372

Created on 2023-01-12 with reprex v2.0.2

codecov bot commented Jan 12, 2023

Codecov Report

Merging #261 (e1c2090) into master (43b3394) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #261      +/-   ##
==========================================
+ Coverage   91.36%   91.39%   +0.03%     
==========================================
  Files          21       21              
  Lines        1366     1371       +5     
==========================================
+ Hits         1248     1253       +5     
  Misses        118      118              
Impacted Files Coverage Δ
R/check_forecasts.R 87.50% <100.00%> (+0.11%) ⬆️
R/summarise_scores.R 89.74% <100.00%> (+0.13%) ⬆️
R/utils.R 88.88% <100.00%> (+0.65%) ⬆️


@seabbs seabbs requested a review from nikosbosse January 12, 2023 22:17
@seabbs seabbs added the enhancement New feature or request label Jan 12, 2023
@seabbs seabbs marked this pull request as ready for review January 12, 2023 22:18
@nikosbosse
Collaborator

Nice, thanks a lot! So essentially this internally checks whether the prediction type is quantile, and if it is, then it removes "sample" from the list of protected columns, right?

What do you think about the additional (alternative?) feature that it would give a message / warning when you run check_forecasts() and have a protected column there?
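For concreteness, the suggested warning might look something like this minimal sketch (the function name and wording are hypothetical, not actual scoringutils code):

```r
# Hypothetical sketch of the suggested behaviour: warn the user when a
# protected column name (e.g. "sample") appears alongside quantile forecasts,
# instead of silently treating it as part of the forecast unit.
warn_protected_columns <- function(data, protected = c("sample")) {
  clashes <- intersect(protected, colnames(data))
  if ("quantile" %in% colnames(data) && length(clashes) > 0) {
    warning(
      "Found protected column(s) alongside quantile forecasts: ",
      toString(clashes)
    )
  }
  invisible(data)
}
```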


seabbs commented Jan 13, 2023

So essentially this internally checks whether the prediction type is quantile, and if it is then it removes "sample" from the list of protected columns, right?

Yes, exactly.
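In other words, the change can be sketched roughly like this (the helper name and column list here are illustrative, not the actual scoringutils internals):

```r
# Illustrative sketch (helper name hypothetical): when the forecasts are in
# quantile format, "sample" is no longer treated as a protected column and
# instead just becomes part of the forecast unit.
get_protected_columns <- function(data) {
  protected <- c("prediction", "true_value", "quantile", "sample")
  if ("quantile" %in% colnames(data)) {
    # quantile forecasts: allow a plain "sample" id column
    protected <- setdiff(protected, "sample")
  }
  protected
}
```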

What do you think about the additional (alternative?) feature that it would give a message / warning when you run check_forecasts() and have a protected column there?

I am not sure why you would want to do that? Unless it offers safety elsewhere in your code, it seems overly restrictive.

I'm totally open to either, so we can either merge this in, or close it out and flag the desired implementation in the original issue.

@seabbs seabbs requested a review from nikosbosse January 13, 2023 19:28
@nikosbosse
Collaborator

Merci!

@nikosbosse nikosbosse merged commit 55d184c into master Jan 16, 2023
@nikosbosse nikosbosse deleted the seabbs/issue242 branch January 16, 2023 14:21
Successfully merging this pull request may close these issues.

There should be a warning when there is a column called "sample" with a quantile format