This repository was archived by the owner on Feb 27, 2020. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathpkg-functions.Rmd
359 lines (304 loc) · 11.7 KB
/
pkg-functions.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
---
title: "Create your own functions for package development in R"
params:
datetime: "2018-11-23"
level: "Intermediate" # or intermediate or advanced
instructor: "Luke Johnston"
---
```{r setup, include=FALSE}
source("R/utils.R")
```
## Session details
- **Date of session:** `r format_date(params$datetime)`
- **Instructor:** `r params$instructor`
- **Session level:** `r params$level`
- **Part of the ["Package Development Series"](index.html)**
- **Required packages to install**:
```{r, eval=FALSE}
install.packages(c("devtools", "usethis", "roxygen2"))
```
`r add_video(params$video_id)`
### Objectives
1. To learn how functions are structured and created.
1. To learn and apply the typical workflow for creating functions in packages
(and in general).
1. To properly document and name functions so you don't forget what they do!
`r emo::ji("winking_face_with_tongue")`
**At the end of this session you will be able:**
- Create a function with several arguments.
- Add and fill out roxygen documentation to the function.
- Using the "Code"->"Insert Roxygen Skeleton" (`Ctrl-Shift-Alt-R`)
- Then creating the documentation using `devtools::
- Cycle between testing the function out in the `@example` section and developing
the function.
- Using `devtools::load_all()` (`Ctrl-Shift-L`).
- Run the code in the `@example` by typing `Ctrl-Enter`.
- Repeat until the function does what you want!
**Ultimately**, I hope you will try to create a function from your own code by
the end of this session!
## Resources for learning and help
- Generally, the [R Packages](http://r-pkgs.had.co.nz/) book is great for
anything related to making packages.
- [`R/` chapter in R Packages](http://r-pkgs.had.co.nz/r.html) book (online and
free).
- [Documenting functions chapter](http://r-pkgs.had.co.nz/man.html) in R
Packages book.
- [Functions chapter in R for Data Science](https://r4ds.had.co.nz/functions.html)
book, also online and free.
- [Intro to functions](https://swcarpentry.github.io/r-novice-inflammation/02-func-R/)
from [Software Carpentry](https://software-carpentry.org/).
- [Functions in R](https://www.datacamp.com/courses/writing-functions-in-r)
DataCamp course (need subscription).
- [STAT545](http://stat545.com/block011_write-your-own-function-01.html)
on writing functions from University of British Columbia, Canada.
- Very advanced (for those interested) details about functions from [Advanced
R](http://adv-r.had.co.nz/Functions.html) book (again, online and free).
## Initial setup
First we need to create a package layout and create an R script. See the
[Package Creation](pkg-creation.html) session for more details on this.
1. Create a package using `usethis::create_package("<pkgname>")`
1. Add an R script using `usethis::use_r("<scriptname>.R")`.
## All actions in R are functions
The `+` is a function, `mean()` is a function, `[]` is a function... everything
that does something is called a function in R. So to add 1 with 1:
```{r}
1 + 1
```
... is a function that takes 1 and adds 1 to it. It is actually a short form for:
```{r}
`+`(1, 1)
```
When creating a function, there is always the basic structure of:
1. Name of the function (e.g. `mean`).
2. The function call using `function()` being assigned `<-` to the name. This
tells R that the name is a function object... that it does some action.
3. The arguments within the function call, `function(arg1, arg2, arg3)`. These
are the options given to the function (e.g. `sum(arg1, arg2)`).
4. The body of the function, that takes the arguments, if any, does some action,
and finishes by outputting some result or object at the end (using `return()`).
There is no minimum or maximum number of arguments you can provide for a function.
E.g. you can have zero arguments or you can have 100s. To keep things sane, try
to keep the number of arguments low, like not more than 4 or 5.
So, the structure is:
```{r, eval=FALSE}
name <- function(arg1, arg2) {
# body of function
return(output)
}
```
... and an example:
```{r}
add_nums <- function(num1, num2) {
added <- num1 + num2
return(added)
}
```
You can use the new function by running the above code and writing out your new
function, with arguments to give it.
```{r}
add_nums(1, 2)
```
The function name is fairly good... `add_nums` can be read as "add numbers". But
we need to also add some formal documentation to the function. Using the "Insert
Roxygen Skeleton" in the "Code" menu list (or by typing `Ctrl-Shift-Alt-R`) you
can add template documentation right above the function. It looks like:
```{r}
#' Title
#'
#' @param num1
#' @param num2
#'
#' @return
#' @export
#'
#' @examples
add_nums <- function(num1, num2) {
added <- num1 + num2
return(added)
}
```
In the `Title` area, this is where you type out a brief sentence or several words
that describe the function. Creating a new paragraph below this line allows you
to add a more detailed description. The other items are:
- `@param num` lines are to describe what each argument is for.
- `@return` describes what output the function provides. Is it a data.frame? A plot?
What else does the output give?
- `@export` this is to tell R that this function should be accessible to the user
of your package. Keep it in for now.
- `@examples` lines below this are used to show examples of how to use the function.
This is also the area where you write and test that the function does what you want.
You can run code here as if it was regular R code and not a commented out code by
using `Ctrl-Enter`.
```{r}
#' Add two numbers together.
#'
#' This is just an example function to show how to create one.
#'
#' @param num1 A number here.
#' @param num2 A number here.
#'
#' @return Returns the sum of the two numbers.
#' @export
#'
#' @examples
#'
#' add_nums(1, 1)
add_nums <- function(num1, num2) {
added <- num1 + num2
return(added)
}
```
Now, when we run `devtools::document()` (or `Ctrl-Shift-D`), a file will be added
to the `man/` folder. Now, when you type out `?add_nums` in the console, the help
documentation will pop up on the "Help" tab.
Ok, let's get to something a bit more interesting. A common thing that people do
(at least I do) is to create a similar plot on different variables and datasets.
So this is a great example of using a function to simplify your code. We can also
cover... package/function dependencies! Since we will use ggplot2 to make the plot,
we need some way to tell R that the functions come from ggplot2... It's **very bad
practice** to use `library()` in your function and in your package. A function from
usethis comes to the rescue! Use `usethis::use_package("ggplot2")` in the
console. Some text will appear saying that "ggplot2 has been added to DESCRIPTION"
and that "please refer to functions using ggplot2::fun". So let's do that!
```{r}
scatter_plot <- function(data, xvar, yvar) {
graph <- ggplot2::ggplot(data = data, ggplot2::aes(x = xvar, y = yvar)) +
ggplot2::geom_point()
return(graph)
}
```
Ok, we have the base for making a scatter plot. But! There are few things to talk
about here first. First, ggplot2 will be confused by the `x = xvar` since it will
think[^nse] you are asking for the `xvar` column in the dataset. So, we need to change
`aes` to `aes_string` to force ggplot2 to read `xvar` as a character string that
is the name of the column you want to plot. Next, it is useful in many cases to
put a dot before your function arguments to differentiate your function arguments
from other R objects. Let's also add the Roxygen documentation. So:
[^nse]: The reason it won't work is because ggplot2 is using what is called
non-standard evaluation (NSE; check out
[here](https://cran.r-project.org/web/packages/lazyeval/vignettes/lazyeval.html)
or [here](https://dplyr.tidyverse.org/articles/programming.html) for more
indepth look at what non-standard evaluation is). Because ggplot2 uses NSE, you
will have to do things slightly differently, hence using `aes_string`.
```{r}
#' Create a scatter plot of two variables.
#'
#' @param .data The dataset.
#' @param .xvar The x-axis variable.
#' @param .yvar The y-axis variable.
#'
#' @return A scatter plot.
#' @export
#'
#' @examples
#'
#' scatter_plot(swiss, "Education", "Agriculture")
scatter_plot <- function(.data, .xvar, .yvar) {
graph <- ggplot2::ggplot(data = .data, ggplot2::aes_string(x = .xvar, y = .yvar)) +
ggplot2::geom_point()
return(graph)
}
```
```{r, echo=FALSE}
scatter_plot(swiss, "Education", "Agriculture")
```
Now we do `devtools::load_all()` (`Ctrl-Shift-L`) and run the code in the
example (`Ctrl-Enter`). A quick note: Running code in the `@examples` section *only*
works if you are in an R package project (the RStudio `.Rproj` file). If you are
not in an R package, you can instead include the code to run the function below
your new function, like so:
```{r, eval=FALSE}
#' Create a scatter plot of two variables.
#'
#' @param .data The dataset.
#' @param .xvar The x-axis variable.
#' @param .yvar The y-axis variable.
#'
#' @return A scatter plot.
#' @export
#'
#' @examples
#'
scatter_plot <- function(.data, .xvar, .yvar) {
graph <- ggplot2::ggplot(data = .data, ggplot2::aes_string(x = .xvar, y = .yvar)) +
ggplot2::geom_point()
return(graph)
}
scatter_plot(swiss, "Education", "Agriculture")
```
Now, if we want to add some theme items, all graphs created from this function
will get the new theme and appearance!
```{r}
#' Create a scatter plot of two variables.
#'
#' @param .data The dataset.
#' @param .xvar The x-axis variable.
#' @param .yvar The y-axis variable.
#'
#' @return A scatter plot.
#' @export
#'
#' @examples
#'
#' scatter_plot(swiss, "Education", "Agriculture")
scatter_plot <- function(.data, .xvar, .yvar) {
graph <- ggplot2::ggplot(data = .data, ggplot2::aes_string(x = .xvar, y = .yvar)) +
ggplot2::geom_point() +
ggplot2::theme_minimal()
return(graph)
}
```
```{r, echo=FALSE}
scatter_plot(swiss, "Education", "Agriculture")
```
If you want to make sure that who ever uses your function will not use a wrong
argument, you can use "defensive programming" via the `stopifnot()` function.
This forces the code to only work if `xvar` and `yvar` are character (e.g. `"this"`)
argument.
```{r}
#' Create a scatter plot of two variables.
#'
#' @param .data The dataset.
#' @param .xvar The x-axis variable.
#' @param .yvar The y-axis variable.
#'
#' @return A scatter plot.
#' @export
#'
#' @examples
#'
#' scatter_plot(swiss, "Education", "Agriculture")
scatter_plot <- function(.data, .xvar, .yvar) {
stopifnot(is.character(.xvar), is.character(.yvar))
graph <- ggplot2::ggplot(data = .data, ggplot2::aes_string(x = .xvar, y = .yvar)) +
ggplot2::geom_point() +
ggplot2::theme_minimal()
return(graph)
}
```
## Exercise: Make your own function!
Using this workflow, try to create, document, and test your own function! If you
have some code that you already use repeatedly by copy and pasting, try to convert
that code into a function. If you don't have your own code, try converting these
pieces of code into their own function:
```{r, eval=FALSE}
# Code chunk 1:
one_species <- iris[iris['Species'] == "setosa", ]
sepal_one_species <- one_species[c("Sepal.Length", "Sepal.Width")]
means_sepal_one_species <- sapply(sepal_one_species, mean)
means_sepal_one_species
```
## Other notes
There are a few other things to consider. In R there are different "methods" of
functions. This is way above what is necessary for this session, but if you are
curious [this website](http://adv-r.had.co.nz/S3.html) has a great explanation
of the different methods (e.g. S3 methods). Be warned, the website is fairly
advanced!
You can always look at the contents of all functions in R. So an example of an
S3 function:
```{r}
# Generic S3
print
# Printing for data.frames
print.data.frame
```