Skip to content

Commit fdde458

Browse files
committed
Add ref to @AmeliaMN paper
1 parent 714063c commit fdde458

File tree

3 files changed

+5
-6
lines changed

3 files changed

+5
-6
lines changed

README.Rmd

+1-1
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@ knitr::opts_chunk$set(
2020

2121
## Overview
2222

23-
R uses __factors__ to handle categorical variables, variables that have a fixed and known set of possible values. Historically, factors were much easier to work with than character vectors, so many base R functions automatically convert character vectors to factors. (For more historical context, I recommend [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley.) These days, making factors automatically is no longer so helpful, so packages in the [tidyverse](http://tidyverse.org) never create them automatically.
23+
R uses __factors__ to handle categorical variables, variables that have a fixed and known set of possible values. Historically, factors were much easier to work with than character vectors, so many base R functions automatically convert character vectors to factors. (For historical context, I recommend [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley. If you want to learn more about other approaches to working with factors and categorical data, I recommend [_Wrangling categorical data in R_](https://peerj.com/preprints/3163/), by Amelia McNamara and Nicholas Horton.) These days, making factors automatically is no longer so helpful, so packages in the [tidyverse](http://tidyverse.org) never create them automatically.
2424

2525
However, factors are still useful when you have true categorical data, and when you want to override the ordering of character vectors to improve display. The goal of the __forcats__ package is to provide a suite of useful tools that solve common problems with factors. If you're not familiar with strings, the best place to start is the [chapter on factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
2626

README.md

+3-4
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ forcats <img src="logo.png" align="right" />
88
Overview
99
--------
1010

11-
R uses **factors** to handle categorical variables, variables that have a fixed and known set of possible values. Historically, factors were much easier to work with than character vectors, so many base R functions automatically convert character vectors to factors. (For more historical context, I recommend [*stringsAsFactors: An unauthorized biography*](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [*stringsAsFactors = &lt;sigh&gt;*](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley.) These days, making factors automatically is no longer so helpful, so packages in the [tidyverse](http://tidyverse.org) never create them automatically.
11+
R uses **factors** to handle categorical variables, variables that have a fixed and known set of possible values. Historically, factors were much easier to work with than character vectors, so many base R functions automatically convert character vectors to factors. (For historical context, I recommend [*stringsAsFactors: An unauthorized biography*](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [*stringsAsFactors = &lt;sigh&gt;*](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley. If you want to learn more about other approaches to working with factors and categorical data, I recommend [*Wrangling categorical data in R*](https://peerj.com/preprints/3163/), by Amelia McNamara and Nicholas Horton.) These days, making factors automatically is no longer so helpful, so packages in the [tidyverse](http://tidyverse.org) never create them automatically.
1212

1313
However, factors are still useful when you have true categorical data, and when you want to override the ordering of character vectors to improve display. The goal of the **forcats** package is to provide a suite of useful tools that solve common problems with factors. If you're not familiar with strings, the best place to start is the [chapter on factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
1414

@@ -66,11 +66,10 @@ factor(x2, month_levels)
6666

6767
parse_factor(x2, month_levels)
6868
#> Warning: 1 parsing failure.
69-
#> row col expected actual
70-
#> 3 -- value in level set Jam
69+
#> row # A tibble: 1 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 3 NA value in level set Jam
7170
#> [1] Dec Apr <NA> Mar
7271
#> attr(,"problems")
73-
#> # A tibble: 1 × 4
72+
#> # A tibble: 1 x 4
7473
#> row col expected actual
7574
#> <int> <int> <chr> <chr>
7675
#> 1 3 NA value in level set Jam

docs/index.html

+1-1
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)