1 change: 1 addition & 0 deletions NEWS.md
@@ -9,6 +9,7 @@ A minor update to the package with some bug fixes and minor changes.
- Removed the on attach message which warned of breaking changes in `1.0.0`.
- Renamed the `metric` argument of `summarise_scores()` to `relative_skill_metric`. This argument is now deprecated and will be removed in a future version of the package. Please use the new argument instead.
- Updated the documentation for `score()` and related functions to make the soft requirement for a `model` column in the input data more explicit.
- Updated the documentation for `score()`, `pairwise_comparison()` and `summarise_scores()` to clarify what constitutes the unit of a single forecast required for computations.
- Simplified the function `plot_pairwise_comparison()` which now only supports plotting mean score ratios or p-values and removed the hybrid option to print both at the same time.

## Bug fixes
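
To make the renamed argument and the simplified plotting function concrete, here is a minimal usage sketch; the metric name and plot type shown are assumptions for illustration rather than values stated in this changelog:

library(scoringutils)

scores <- score(example_quantile)

# the old `metric` argument of summarise_scores() is deprecated;
# `relative_skill_metric` takes its place
summarise_scores(
  scores,
  by = "model",
  relative_skill = TRUE,
  relative_skill_metric = "interval_score" # assumed metric name
)

# plot_pairwise_comparison() now plots either mean score ratios or p-values
pc <- pairwise_comparison(scores)
plot_pairwise_comparison(pc, type = "mean_scores_ratio") # assumed option name
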
1 change: 1 addition & 0 deletions R/check_forecasts.R
@@ -278,6 +278,7 @@ print.scoringutils_check <- function(x, ...) {
#'
#' @param forecast_unit A character vector with the column names that define
#' the unit of a single forecast. If missing, the function tries to infer the
#' unit of a single forecast.
#'
#' @param ... Additional arguments passed to [get_forecast_unit()].
#' @return A data.frame with all rows for which a duplicate forecast was found
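
As a rough sketch of how the documented `forecast_unit` argument might be used; the data set and column names below come from the package's bundled examples and are assumptions for illustration:

library(scoringutils)

# let the function infer the unit of a single forecast from the data
find_duplicates(example_quantile)

# or spell the forecast unit out explicitly; rows that share the same values
# in these columns but appear more than once are reported as duplicates
find_duplicates(
  example_quantile,
  forecast_unit = c("location", "target_end_date", "target_type", "horizon", "model")
)
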
21 changes: 19 additions & 2 deletions R/pairwise-comparisons.R
@@ -2,9 +2,26 @@
#'
#' @description
#'
#' Make pairwise comparisons between models. The code for the pairwise
#' comparisons is inspired by an implementation by Johannes Bracher.
#' Compute relative scores between different models by making pairwise
#' comparisons. Pairwise comparisons are a sort of pairwise tournament in which
#' all combinations of two models are compared against each other based on the
#' set of available forecasts common to both models.
#' Internally, a ratio of the mean scores of both models is computed.
#' The relative score of a model is then the geometric mean of all mean score
#' ratios which involve that model. When a baseline is provided, that
#' baseline is excluded from the relative scores for individual models
#' (which therefore differ slightly from relative scores without a baseline),
#' and all relative scores are scaled by (i.e. divided by) the relative score
#' of the baseline model.
#' Usually, the function input should be unsummarised scores as
#' produced by [score()].
#' Note that the function internally infers the *unit of a single forecast* by
#' determining all columns in the input that do not correspond to metrics
#' computed by [score()]. Adding unrelated columns will change results in an
#' unpredictable way.
#'
#' The code for the pairwise comparisons is inspired by an implementation by
#' Johannes Bracher.
#' The implementation of the permutation test follows the function
#' `permutationTest` from the `surveillance` package by Michael Höhle,
#' Andrea Riebler and Michaela Paul.
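
A minimal sketch of the workflow described above; the baseline model name is an assumption based on the package's bundled example data:

library(scoringutils)

# unsummarised scores as produced by score(), one row per quantile
scores <- score(example_quantile)

# pairwise tournament between all models; relative skill is the geometric
# mean of all mean score ratios involving a model, scaled by the baseline
pc <- pairwise_comparison(
  scores,
  baseline = "EuroCOVIDhub-baseline" # assumed model name from the example data
)
head(pc)
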
15 changes: 14 additions & 1 deletion R/score.R
@@ -13,13 +13,26 @@
#' each format are also provided (see the documentation for `data` below or in
#' [check_forecasts()]).
#'
#' To obtain a quick overview of the currrently supported evaluation metrics,
#' Each format has a set of required columns (see below). Additional columns may
#' be present to indicate a grouping of forecasts. For example, we could have
#' forecasts made by different models in various locations at different time
#' points, each for several weeks into the future. It is important that only
#' those columns are present which are relevant for grouping forecasts.
#' The combination of these columns should uniquely define the
#' *unit of a single forecast*, meaning that a single forecast is identified by
#' the values in those columns. Adding additional unrelated columns may alter
#' results.
#'
#' To obtain a quick overview of the currently supported evaluation metrics,
#' have a look at the [metrics] data included in the package. The column
#' `metrics$Name` gives an overview of all available metric names that can be
#' computed. If interested in an unsupported metric please open a [feature
#' request](https://github.com/epiforecasts/scoringutils/issues) or consider
#' contributing a pull request.
#'
#' For additional help and examples, check out the [Getting Started
#' Vignette](https://epiforecasts.io/scoringutils/articles/getting-started.html).
#'
#' @param data A data.frame or data.table with the predictions and observations.
#' For scoring using [score()], the following columns need to be present:
#' \itemize{
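
A short sketch of what the forecast unit means in practice; the column names refer to the bundled example_quantile data and are assumptions for illustration:

library(scoringutils)

# besides the value columns (true_value, quantile, prediction), the remaining
# columns jointly define the unit of a single forecast, e.g. one forecast per
# model, location, target type, horizon and target end date
colnames(example_quantile)

# optional upfront validation of the input format
check_forecasts(example_quantile)

# scoring is then done per forecast unit; a stray unrelated column would be
# treated as part of the grouping and could change the results
scores <- score(example_quantile)
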
14 changes: 11 additions & 3 deletions R/summarise_scores.R
@@ -6,9 +6,14 @@
#' @inheritParams score
#' @param by character vector with column names to summarise scores by. Default
#' is `NULL`, meaning that the only summary that takes place is summarising
#' over quantiles (in case of quantile-based forecasts), such that there is one
#' score per forecast as defined by the unit of a single forecast (rather than
#' one score for every quantile).
#' over samples or quantiles (in case of quantile-based forecasts), such that
#' there is one score per forecast as defined by the *unit of a single forecast*
#' (rather than one score for every sample or quantile).
#' The *unit of a single forecast* is determined by the columns present in the
#' input data that do not correspond to a metric produced by [score()], which
#' indicate a grouping of forecasts (for example, there may be one
#' forecast per day, location and model). Adding additional, unrelated columns
#' may alter results in an unpredictable way.
#' @param fun a function used for summarising scores. Default is `mean`.
#' @param relative_skill logical, whether or not to compute relative
#' performance between models based on pairwise comparisons.
@@ -33,6 +38,9 @@
#' @examples
#' library(magrittr) # pipe operator
#'
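#' # compute scores for continuous (sample-based) forecasts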
#' scores <- score(example_continuous)
#' summarise_scores(scores)
#'
#' # summarise over samples or quantiles to get one score per forecast
#' scores <- score(example_quantile)
#' summarise_scores(scores)
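
To make the role of `by` and the forecast unit concrete, a small usage sketch; the grouping columns are assumptions based on the bundled example data:

library(scoringutils)

scores <- score(example_quantile)

# default: collapse over quantiles (or samples) only, leaving one score per
# forecast as defined by the unit of a single forecast
summarise_scores(scores)

# summarise further, e.g. to one score per model and target type
summarise_scores(scores, by = c("model", "target_type"))
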
11 changes: 8 additions & 3 deletions man/check_summary_params.Rd


3 changes: 2 additions & 1 deletion man/find_duplicates.Rd


21 changes: 19 additions & 2 deletions man/pairwise_comparison.Rd


14 changes: 13 additions & 1 deletion man/score.Rd


16 changes: 12 additions & 4 deletions man/summarise_scores.Rd
