Add functions and plot #59

Merged
merged 18 commits into from
Dec 10, 2024
4 changes: 4 additions & 0 deletions R/cont_add_data.R
@@ -63,6 +63,10 @@
#' to build the column names is taken from the metadata available in the
#' `elic_cont` object.
#'
#' `var_conf`, given in percent, can be any number between 50 and 100. Any
#' value under 50 would imply that the accuracy of the estimates is only due to
#' chance.
#'
#' @section Data cleaning:
#'
#' When data are added to the `elic_cont` object, first names are standardised
17 changes: 17 additions & 0 deletions R/cont_plot.R
@@ -22,6 +22,23 @@
#' theme.
#' @inheritParams elic_cont_add_data
#'
#' @section scale_conf:
#'
#' If the variable plotted is the result of a four-point elicitation where
#' expert confidence is provided, the minimum and maximum values provided by
#' each expert are rescaled using their provided confidence levels. Users can
#' choose how they want to rescale minimum and maximum values by providing a
#' value for the `scale_conf` argument. If no value is provided, the default
#' `scale_conf` of 100 is used.
#'
#' The scaled minimum and maximum values are obtained with:
#'
#' \eqn{minimum = best\ guess - (best\ guess - minimum)\frac{scale\_conf}
#' {confidence}}
#'
#' \eqn{maximum = best\ guess + (maximum - best\ guess) \frac{scale\_conf}
#' {confidence}}
#'
#' @details
#' The `truth` argument is useful when the elicitation process is part of a
#' workshop and is used for demonstration. In this case the true value is known
96 changes: 95 additions & 1 deletion README.Rmd
@@ -40,8 +40,102 @@ You can install the development version of elicitr from GitHub with:
pak::pak("CREWdecisions/elicitr")
```

### Getting started

```{r}
library(elicitr)
```

All the functions in the elicitr package start with the prefix `elic_`. A second prefix then identifies the type of variable: `elic_cont` or `elic_cat`. This design choice is intended to make functions easier to discover.\
`elic_cont` functions are used for the elicitation of continuous variables, while `elic_cat` functions are used for the elicitation of categorical variables.

#### How elicitr works

Just as you would create a form to collect estimates in an elicitation process, the core of elicitr is the creation of an object that stores the metadata. This makes it possible to check whether experts have given their answers in the expected way.\
Any analysis starts by creating this object with the `start` function. Then, data can be added and retrieved using the `add_data` and `get_data` functions, respectively. Finally, data can be plotted using the `plot` function. Details about the implementation and example usage of these functions are given below.

### Elicitation of continuous variables

#### Simulated datasets

Two simulated datasets are included in elicitr. These datasets are intended to demonstrate the functionality of the package and do not represent an actual elicitation process (names are also randomly generated).

```{r}
round_1
```
###
```{r}
round_2
```

#### Functions

Any analysis of continuous variables starts by creating the `elic_cont` object with the function `elic_cont_start()` to store the metadata of the elicitation. To build this `elic_cont` object, four parameters must be specified:

* `var` the names of the variables (i.e. the topics in your elicitation)
* `var_types` the type of each of these variables (many options are available, ranging from real numbers to probabilities)
* `elic_types` the type of elicitation for each of these variables (three options are available: one-point, three-point, and four-point elicitations)
* `experts` the number of experts that replied to the elicitation

```{r}
my_elicitation <- elic_cont_start(var = c("var1", "var2", "var3"),
                                  var_types = "ZNp",
                                  elic_types = "134",
                                  experts = 6,
                                  title = "Elicitation example")
```

```{r}
my_elicitation
```

Once the metadata has been added to the `elic_cont` object, the data of the first round of elicitation can be added with the function `elic_cont_add_data()`:

```{r}
my_elicitation <- elic_cont_add_data(my_elicitation,
                                     data_source = round_1,
                                     round = 1)
```

The information message confirms that the data for the first round has been added to the `elic_cont` object from a `data.frame`. Besides data frames, elicitr also allows users to add data from `.csv` or `.xlsx` files, and from Google Sheets.

If you conducted a second round of elicitation, it can be added to the `elic_cont` object after the first round has been added:

```{r}
my_elicitation <- elic_cont_add_data(my_elicitation,
                                     data_source = round_2,
                                     round = 2)
```

To preserve the anonymity of experts, their names are converted to short SHA-1 hashes and saved in the `id` column. These are then used to match each expert’s answers across the two rounds.

The function `elic_cont_get_data()` retrieves data from an `elic_cont` object. It is possible to get the whole dataset of a given round, or extract only the data for a given variable, variable type, or elicitation type:

```{r}
elic_cont_get_data(my_elicitation,
                   round = 1,
                   var = "all")
```

Finally, data can be plotted using the function `elic_cont_plot()`. This function plots data belonging to a given round and for a given variable.

```{r cont_plot}
elic_cont_plot(my_elicitation,
               round = 2,
               group = TRUE,
               var = "var3",
               xlab = "Variable 3")
```

Variable 3 (the plotted variable) is the result of a four-point elicitation, where minimum and maximum estimates, best guess, and expert confidence are provided. In the plot, the best guess is represented with a dot, and the range between minimum and maximum estimates is represented with a line. Expert estimates are represented in purple, while the group's mean is represented in orange.

The message printed when the function is run informs users that the minimum and maximum values given by experts have been rescaled using their provided confidence level.

### Elicitation of categorical variables

#### In development

### Similar packages

* {SHELF} : Oakley, J. (2024). Package “SHELF” Tools to Support the Sheffield Elicitation Framework. https://doi.org/10.32614/CRAN.package.SHELF
* {prefeR} : Lepird, J. (2022). Package “prefeR” R Package for Pairwise Preference Elicitation. https://doi.org/10.32614/CRAN.package.prefeR

181 changes: 181 additions & 0 deletions README.md
@@ -40,6 +40,187 @@ You can install the development version of elicitr from GitHub with:
pak::pak("CREWdecisions/elicitr")
```

### Getting started

``` r
library(elicitr)
```

All the functions in the elicitr package start with the prefix `elic_`.
A second prefix then identifies the type of variable: `elic_cont` or
`elic_cat`. This design choice is intended to make functions easier to
discover.
`elic_cont` functions are used for the elicitation of continuous
variables, while `elic_cat` functions are used for the elicitation of
categorical variables.

#### How elicitr works

Just as you would create a form to collect estimates in an elicitation
process, the core of elicitr is the creation of an object that stores
the metadata. This makes it possible to check whether experts have given
their answers in the expected way.
Any analysis starts by creating this object with the `start` function.
Then, data can be added and retrieved using the `add_data` and
`get_data` functions, respectively. Finally, data can be plotted using
the `plot` function. Details about the implementation and example usage
of these functions are given below.
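
The naming scheme described above can be spelled out in a couple of lines of base R. This is only an illustration of how the prefix and the four workflow steps combine into the function names used later in this README; it does not call elicitr itself.

``` r
# Combine the continuous-variable prefix with the four workflow steps
steps <- c("start", "add_data", "get_data", "plot")
cont_functions <- paste0("elic_cont_", steps, "()")
cont_functions
```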

### Elicitation of continuous variables

#### Simulated datasets

Two simulated datasets are included in elicitr. These datasets are
intended to demonstrate the functionality of the package and do not
represent an actual elicitation process (names are also randomly
generated).

``` r
round_1
#> # A tibble: 6 × 9
#> name var1_best var2_min var2_max var2_best var3_min var3_max var3_best
#> <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 Derek Macle… 1 20 24 22 0.43 0.83 0.73
#> 2 Christopher… 0 7 10 9 0.67 0.87 0.77
#> 3 Mar'Quasa B… 0 10 15 12 0.65 0.95 0.85
#> 4 Mastoora al… -7 4 12 9 0.44 0.84 0.64
#> 5 Eriberto Mu… -5 13 18 16 0.38 0.88 0.68
#> 6 Paul Bol 3 20 26 25 0.35 0.85 0.65
#> # ℹ 1 more variable: var3_conf <int>
```

###

``` r
round_2
#> # A tibble: 6 × 9
#> name var1_best var2_min var2_max var2_best var3_min var3_max var3_best
#> <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 Mar'Quasa B… -2 15 21 18 0.62 0.82 0.72
#> 2 Mastoora al… -4 11 15 12 0.52 0.82 0.72
#> 3 Eriberto Mu… 1 15 20 17 0.58 0.78 0.68
#> 4 Derek Macle… 0 11 18 15 0.52 0.82 0.72
#> 5 Christopher… -2 14 18 15 0.55 0.85 0.75
#> 6 Paul Bol 1 18 23 20 0.66 0.86 0.76
#> # ℹ 1 more variable: var3_conf <int>
```

#### Functions

Any analysis of continuous variables starts by creating the `elic_cont`
object with the function `elic_cont_start()` to store the metadata of
the elicitation. To build this `elic_cont` object, four parameters must
be specified:

- `var` the names of the variables (i.e. the topics in your
  elicitation)
- `var_types` the type of each of these variables (many options are
  available, ranging from real numbers to probabilities)
- `elic_types` the type of elicitation for each of these variables
  (three options are available: one-point, three-point, and four-point
  elicitations)
- `experts` the number of experts that replied to the elicitation

``` r
my_elicitation <- elic_cont_start(var = c("var1", "var2", "var3"),
                                  var_types = "ZNp",
                                  elic_types = "134",
                                  experts = 6,
                                  title = "Elicitation example")
#> ✔ <elic_cont> object for "Elicitation example" correctly initialised
```

``` r
my_elicitation
#>
#> ── Elicitation example ──
#>
#> • Variables: "var1", "var2", and "var3"
#> • Variable types: "Z", "N", and "p"
#> • Elicitation types: "1p", "3p", and "4p"
#> • Number of experts: 6
#> • Number of rounds: 0
```

Once the metadata has been added to the `elic_cont` object, the data of
the first round of elicitation can be added with the function
`elic_cont_add_data()`:

``` r
my_elicitation <- elic_cont_add_data(my_elicitation,
                                     data_source = round_1,
                                     round = 1)
#> ✔ Data added to "Round 1" from "data.frame"
```

The information message confirms that the data for the first round has
been added to the `elic_cont` object from a `data.frame`. Besides data
frames, elicitr also allows users to add data from `.csv` or `.xlsx`
files, and from Google Sheets.
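
If your answers live in a `.csv` file, one option that relies only on the `data.frame` input shown above is to read the file yourself first. The sketch below writes a tiny made-up table to a temporary file and reads it back with base R. The file, the expert names, and the values are hypothetical, and the final call is left commented out because it needs elicitr and an `elic_cont` object whose metadata match the table.

``` r
# Hypothetical answers for a single one-point variable
answers <- data.frame(name = c("Expert A", "Expert B"),
                      var1_best = c(1L, -3L))

# Write the table to a temporary CSV file and read it back
csv_file <- tempfile(fileext = ".csv")
write.csv(answers, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)

# The resulting data frame could then be passed as `data_source`:
# my_elicitation <- elic_cont_add_data(my_elicitation,
#                                      data_source = from_csv,
#                                      round = 1)
```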

If you conducted a second round of elicitation, it can be added to the
`elic_cont` object after the first round has been added:

``` r
my_elicitation <- elic_cont_add_data(my_elicitation,
                                     data_source = round_2,
                                     round = 2)
#> ✔ Data added to "Round 2" from "data.frame"
```

To preserve the anonymity of experts, their names are converted to short
SHA-1 hashes and saved in the `id` column. These are then used to match
each expert’s answers across the two rounds.
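
A rough sketch of how such ids could be derived and used for matching is shown below. It assumes the `digest` package is installed, and the helper `short_id()`, the way names are standardised, and the hash length of 7 are assumptions made for illustration; elicitr's internal implementation may differ, so the ids produced here need not match the ones in the output of this README.

``` r
# Hypothetical helper, not part of elicitr
short_id <- function(name, n = 7) {
  # Standardise the name so that small spelling differences
  # (case, spaces, punctuation) map to the same id
  clean <- tolower(gsub("[^[:alnum:]]", "", name))
  # digest() hashes one object at a time, hence the vapply() loop
  hashes <- vapply(clean, digest::digest, character(1),
                   algo = "sha1", serialize = FALSE, USE.NAMES = FALSE)
  # Keep only the first n characters of each hash
  substr(hashes, 1, n)
}

ids_round_1 <- short_id(c("Paul Bol", "Derek Example"))
ids_round_2 <- short_id(c(" derek example", "paul bol"))

# Position of each round 2 expert in round 1
match(ids_round_2, ids_round_1)
```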

The function `elic_cont_get_data()` retrieves data from an `elic_cont`
object. It is possible to get the whole dataset of a given round, or
extract only the data for a given variable, variable type, or
elicitation type:

``` r
elic_cont_get_data(my_elicitation,
                   round = 1,
                   var = "all")
#> # A tibble: 6 × 9
#> id var1_best var2_min var2_max var2_best var3_min var3_max var3_best
#> <chr> <int> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 5ac97e0 1 20 24 22 0.43 0.83 0.73
#> 2 e51202e 0 7 10 9 0.67 0.87 0.77
#> 3 e78cbf4 0 10 15 12 0.65 0.95 0.85
#> 4 9fafbee -7 4 12 9 0.44 0.84 0.64
#> 5 3cc9c29 -5 13 18 16 0.38 0.88 0.68
#> 6 3d32ab9 3 20 26 25 0.35 0.85 0.65
#> # ℹ 1 more variable: var3_conf <int>
```

Finally, data can be plotted using the function `elic_cont_plot()`. This
function plots data belonging to a given round and for a given variable.

``` r
elic_cont_plot(my_elicitation,
               round = 2,
               group = TRUE,
               var = "var3",
               xlab = "Variable 3")
#> ✔ Rescaled min and max
```

<img src="man/figures/README-cont_plot-1.png" width="100%" />

Variable 3 (the plotted variable) is the result of a four-point
elicitation, where minimum and maximum estimates, best guess, and expert
confidence are provided. In the plot, the best guess is represented with
a dot, and the range between minimum and maximum estimates is
represented with a line. Expert estimates are represented in purple,
while the group’s mean is represented in orange.
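
To give a feel for what a group mean looks like, the sketch below averages the `var3` columns of `round_2` as printed earlier. It assumes the group summary is a plain arithmetic mean per column and uses the raw values, before any rescaling by confidence, so the numbers need not coincide with the orange point and line in the plot.

``` r
# var3 estimates from round_2, copied from the output above
var3 <- data.frame(min  = c(0.62, 0.52, 0.58, 0.52, 0.55, 0.66),
                   max  = c(0.82, 0.82, 0.78, 0.82, 0.85, 0.86),
                   best = c(0.72, 0.72, 0.68, 0.72, 0.75, 0.76))

# Arithmetic mean across the six experts for each column
group_mean <- colMeans(var3)
group_mean
```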

The message printed when the function is run informs users that the
minimum and maximum values given by experts have been rescaled using
their provided confidence level.
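
The rescaling follows the formulas documented in the `scale_conf` section of `elic_cont_plot()` added in this pull request. A minimal sketch applying them to one expert is given below. The helper name `rescale_bounds()` and the confidence of 80 are hypothetical, since the `var3_conf` column is not visible in the output above, and any further adjustment elicitr may apply (for example for probabilities) is not covered.

``` r
# Hypothetical helper implementing the documented formulas:
#   minimum = best - (best - minimum) * scale_conf / confidence
#   maximum = best + (maximum - best) * scale_conf / confidence
rescale_bounds <- function(min, max, best, conf, scale_conf = 100) {
  ratio <- scale_conf / conf
  data.frame(min = best - (best - min) * ratio,
             max = best + (max - best) * ratio)
}

# One expert with estimates 0.62 / 0.72 / 0.82 and an assumed confidence of 80
rescaled <- rescale_bounds(min = 0.62, max = 0.82, best = 0.72, conf = 80)
rescaled
```

With a confidence below `scale_conf`, the interval widens around the best guess; with a confidence equal to `scale_conf`, it is left unchanged.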

### Elicitation of categorical variables

#### In development

### Similar packages

- {SHELF} : Oakley, J. (2024). Package “SHELF” Tools to Support the
4 changes: 4 additions & 0 deletions man/elic_cont_add_data.Rd


19 changes: 19 additions & 0 deletions man/elic_cont_plot.Rd


Binary file added man/figures/README-cont_plot-1.png