tidyverse
diff --git a/.github/CODE_OF_CONDUCT.md
+25 b/.github/CODE_OF_CONDUCT.md
diff --git a/README.Rmd
+39-19 b/README.Rmd
diff --git a/README.md
+87-84 b/README.md
diff --git a/man/figures/README-ordered-plot-1.png
27.3 KB b/man/figures/README-ordered-plot-1.png
diff --git a/man/figures/README-unordered-plot-1.png
27.3 KB b/man/figures/README-unordered-plot-1.png
@@ -0,0 +1,25 @@
+# Contributor Code of Conduct
+
+As contributors and maintainers of this project, we pledge to respect all people who 
+contribute through reporting issues, posting feature requests, updating documentation,
+submitting pull requests or patches, and other activities.
+
+We are committed to making participation in this project a harassment-free experience for
+everyone, regardless of level of experience, gender, gender identity and expression,
+sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
+
+Examples of unacceptable behavior by participants include the use of sexual language or
+imagery, derogatory comments or personal attacks, trolling, public or private harassment,
+insults, or other unprofessional conduct.
+
+Project maintainers have the right and responsibility to remove, edit, or reject comments,
+commits, code, wiki edits, issues, and other contributions that are not aligned to this 
+Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed 
+from the project team.
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by 
+opening an issue or contacting one or more of the project maintainers.
+
+This Code of Conduct is adapted from the Contributor Covenant 
+(http://contributor-covenant.org), version 1.0.0, available at 
+http://contributor-covenant.org/version/1/0/0/
@@ -8,7 +8,7 @@ output: github_document
 knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>",
-  fig.path = "README-"
+  fig.path = "man/figures/README-"
 )
 ```
 
@@ -22,13 +22,18 @@ knitr::opts_chunk$set(
 
 ## Overview
 
-R uses __factors__ to handle categorical variables, variables that have a fixed and known set of possible values. Historically, factors were much easier to work with than character vectors, so many base R functions automatically convert character vectors to factors. (For historical context, I recommend [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng, and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley.  If you want to learn more about other approaches to working with factors and categorical data, I recommend [_Wrangling categorical data in R_](https://peerj.com/preprints/3163/), by Amelia McNamara and Nicholas Horton.) These days, making factors automatically is no longer so helpful, so packages in the [tidyverse](http://tidyverse.org) never create them automatically.
+R uses __factors__ to handle categorical variables, variables that have a fixed and known set of possible values. Factors are also helpful for reordering character vectors to improve display. The goal of the __forcats__ package is to provide a suite of tools that solve common problems with factors, including changing the order of levels or the values. Some examples include: 
 
-However, factors are still useful when you have true categorical data, and when you want to override the ordering of character vectors to improve display. The goal of the __forcats__ package is to provide a suite of useful tools that solve common problems with factors. If you're not familiar with strings, the best place to start is the [chapter on factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
+* `fct_reorder()`: Reordering a factor by another variable.
+* `fct_infreq()`: Reordering a factor by the frequency of values.
+* `fct_relevel()`: Changing the order of a factor by hand.
+* `fct_lump()`: Collapsing the least/most frequent values of a factor into "other".
+
+You can learn more about each of these in `vignette("forcats")`. If you're new to factors, the best place to start is the [chapter on factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
 
 ## Installation
 
-```R
+```
 # The easiest way to get forcats is to install the whole tidyverse:
 install.packages("tidyverse")
 
@@ -46,30 +51,45 @@ forcats is part of the core tidyverse, so you can load it with `library(tidyvers
 
 ```{r setup, message = FALSE}
 library(forcats)
+library(dplyr)
+library(ggplot2)
 ```
 
-Factors are used to describe categorical variables with a fixed and known set of __levels__. You can create factors with the base `factor()` or [`readr::parse_factor()`](http://readr.tidyverse.org/reference/parse_factor.html):
+```{r}
+starwars %>% 
+  filter(!is.na(species)) %>%
+  count(species, sort = TRUE)
+```
 
 ```{r}
-x1 <- c("Dec", "Apr", "Jan", "Mar")
-month_levels <- c(
-  "Jan", "Feb", "Mar", "Apr", "May", "Jun", 
-  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
-)
+starwars %>%
+  filter(!is.na(species)) %>%
+  mutate(species = fct_lump(species, n = 3)) %>%
+  count(species)
+```
 
-factor(x1, month_levels)
+```{r unordered-plot}
+ggplot(starwars, aes(x = eye_color)) + 
+  geom_bar() + 
+  coord_flip()
+```
 
-readr::parse_factor(x1, month_levels)
+```{r ordered-plot}
+starwars %>%
+  mutate(eye_color = fct_infreq(eye_color)) %>%
+  ggplot(aes(x = eye_color)) + 
+  geom_bar() + 
+  coord_flip()
 ```
 
-The advantage of `parse_factor()` is that it will generate a warning if values of `x` are not valid levels:
+## More resources 
 
-```{r}
-x2 <- c("Dec", "Apr", "Jam", "Mar")
+For a history of factors, I recommend [_stringsAsFactors: An unauthorized biography_](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng and [_stringsAsFactors = \<sigh\>_](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley. If you want to learn more about other approaches to working with factors and categorical data, I recommend [_Wrangling categorical data in R_](https://peerj.com/preprints/3163/), by Amelia McNamara and Nicholas Horton. 
 
-factor(x2, month_levels)
+## Getting help
 
-readr::parse_factor(x2, month_levels)
-```
+If you encounter a clear bug, please file a minimal reproducible example on [GitHub](https://github.com/tidyverse/forcats/issues). For questions and other discussion, please use [community.rstudio.com](https://community.rstudio.com/).
+
+## Code of Conduct
 
-Once you have the factor, forcats provides helpers for solving common problems. 
+Please note that the 'forcats' project is released with a [Contributor Code of Conduct](.github/CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.
@@ -1,107 +1,110 @@
 
 <!-- README.md is generated from README.Rmd. Please edit that file -->
-
-# forcats <img src='man/figures/logo.png' align="right" height="139" />
+forcats <img src='man/figures/logo.png' align="right" height="139" />
+=====================================================================
 
 <!-- badges: start -->
+[![CRAN status](https://www.r-pkg.org/badges/version/forcats)](https://cran.r-project.org/package=forcats) [![Travis build status](https://travis-ci.org/tidyverse/forcats.svg?branch=master)](https://travis-ci.org/tidyverse/forcats) [![Codecov test coverage](https://codecov.io/gh/tidyverse/forcats/branch/master/graph/badge.svg)](https://codecov.io/gh/tidyverse/forcats?branch=master) <!-- badges: end -->
 
-[![CRAN
-status](https://www.r-pkg.org/badges/version/forcats)](https://cran.r-project.org/package=forcats)
-[![Travis build
-status](https://travis-ci.org/tidyverse/forcats.svg?branch=master)](https://travis-ci.org/tidyverse/forcats)
-[![Codecov test
-coverage](https://codecov.io/gh/tidyverse/forcats/branch/master/graph/badge.svg)](https://codecov.io/gh/tidyverse/forcats?branch=master)
-<!-- badges: end -->
-
-## Overview
-
-R uses **factors** to handle categorical variables, variables that have
-a fixed and known set of possible values. Historically, factors were
-much easier to work with than character vectors, so many base R
-functions automatically convert character vectors to factors. (For
-historical context, I recommend [*stringsAsFactors: An unauthorized
-biography*](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/)
-by Roger Peng, and [*stringsAsFactors =
-\<sigh\>*](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh)
-by Thomas Lumley. If you want to learn more about other approaches to
-working with factors and categorical data, I recommend [*Wrangling
-categorical data in R*](https://peerj.com/preprints/3163/), by Amelia
-McNamara and Nicholas Horton.) These days, making factors automatically
-is no longer so helpful, so packages in the
-[tidyverse](http://tidyverse.org) never create them automatically.
-
-However, factors are still useful when you have true categorical data,
-and when you want to override the ordering of character vectors to
-improve display. The goal of the **forcats** package is to provide a
-suite of useful tools that solve common problems with factors. If you’re
-not familiar with strings, the best place to start is the [chapter on
-factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
-
-## Installation
+Overview
+--------
 
-``` r
-# The easiest way to get forcats is to install the whole tidyverse:
-install.packages("tidyverse")
+R uses **factors** to handle categorical variables, variables that have a fixed and known set of possible values. Factors are also helpful for reordering character vectors to improve display. The goal of the **forcats** package is to provide a suite of tools that solve common problems with factors, including changing the order of levels or the values. Some examples include:
 
-# Alternatively, install just forcats:
-install.packages("forcats")
+-   `fct_reorder()`: Reordering a factor by another variable.
+-   `fct_infreq()`: Reordering a factor by the frequency of values.
+-   `fct_relevel()`: Changing the order of a factor by hand.
+-   `fct_lump()`: Collapsing the least/most frequent values of a factor into "other".
 
-# Or the the development version from GitHub:
-# install.packages("devtools")
-devtools::install_github("tidyverse/forcats")
-```
+You can learn more about each of these in `vignette("forcats")`. If you're new to factors, the best place to start is the [chapter on factors](http://r4ds.had.co.nz/factors.html) in R for Data Science.
+
+Installation
+------------
+
+    # The easiest way to get forcats is to install the whole tidyverse:
+    install.packages("tidyverse")
 
-## Getting started
+    # Alternatively, install just forcats:
+    install.packages("forcats")
 
-forcats is part of the core tidyverse, so you can load it with
-`library(tidyverse)` or `library(forcats)`.
+    # Or the development version from GitHub:
+    # install.packages("devtools")
+    devtools::install_github("tidyverse/forcats")
+
+Getting started
+---------------
+
+forcats is part of the core tidyverse, so you can load it with `library(tidyverse)` or `library(forcats)`.
 
 ``` r
 library(forcats)
+library(dplyr)
+library(ggplot2)
+```
+
+``` r
+starwars %>% 
+  filter(!is.na(species)) %>%
+  count(species, sort = TRUE)
+#> # A tibble: 37 x 2
+#>    species      n
+#>    <chr>    <int>
+#>  1 Human       35
+#>  2 Droid        5
+#>  3 Gungan       3
+#>  4 Kaminoan     2
+#>  5 Mirialan     2
+#>  6 Twi'lek      2
+#>  7 Wookiee      2
+#>  8 Zabrak       2
+#>  9 Aleena       1
+#> 10 Besalisk     1
+#> # … with 27 more rows
 ```
 
-Factors are used to describe categorical variables with a fixed and
-known set of **levels**. You can create factors with the base `factor()`
-or
-[`readr::parse_factor()`](http://readr.tidyverse.org/reference/parse_factor.html):
+``` r
+starwars %>%
+  filter(!is.na(species)) %>%
+  mutate(species = fct_lump(species, n = 3)) %>%
+  count(species)
+#> # A tibble: 4 x 2
+#>   species     n
+#>   <fct>   <int>
+#> 1 Droid       5
+#> 2 Gungan      3
+#> 3 Human      35
+#> 4 Other      39
+```
 
 ``` r
-x1 <- c("Dec", "Apr", "Jan", "Mar")
-month_levels <- c(
-  "Jan", "Feb", "Mar", "Apr", "May", "Jun", 
-  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
-)
-
-factor(x1, month_levels)
-#> [1] Dec Apr Jan Mar
-#> Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
-
-readr::parse_factor(x1, month_levels)
-#> [1] Dec Apr Jan Mar
-#> Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
+ggplot(starwars, aes(x = eye_color)) + 
+  geom_bar() + 
+  coord_flip()
 ```
 
-The advantage of `parse_factor()` is that it will generate a warning if
-values of `x` are not valid levels:
+![](man/figures/README-unordered-plot-1.png)
 
 ``` r
-x2 <- c("Dec", "Apr", "Jam", "Mar")
-
-factor(x2, month_levels)
-#> [1] Dec  Apr  <NA> Mar 
-#> Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
-
-readr::parse_factor(x2, month_levels)
-#> Warning: 1 parsing failure.
-#> row # A tibble: 1 x 4 col     row   col expected           actual expected   <int> <int> <chr>              <chr>  actual 1     3    NA value in level set Jam
-#> [1] Dec  Apr  <NA> Mar 
-#> attr(,"problems")
-#> # A tibble: 1 x 4
-#>     row   col expected           actual
-#>   <int> <int> <chr>              <chr> 
-#> 1     3    NA value in level set Jam   
-#> Levels: Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
+starwars %>%
+  mutate(eye_color = fct_infreq(eye_color)) %>%
+  ggplot(aes(x = eye_color)) + 
+  geom_bar() + 
+  coord_flip()
 ```
 
-Once you have the factor, forcats provides helpers for solving common
-problems.
+![](man/figures/README-ordered-plot-1.png)
+
+More resources
+--------------
+
+For a history of factors, I recommend [*stringsAsFactors: An unauthorized biography*](http://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography/) by Roger Peng and [*stringsAsFactors = &lt;sigh&gt;*](http://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh) by Thomas Lumley. If you want to learn more about other approaches to working with factors and categorical data, I recommend [*Wrangling categorical data in R*](https://peerj.com/preprints/3163/), by Amelia McNamara and Nicholas Horton.
+
+Getting help
+------------
+
+If you encounter a clear bug, please file a minimal reproducible example on [GitHub](https://github.com/tidyverse/forcats/issues). For questions and other discussion, please use [community.rstudio.com](https://community.rstudio.com/).
+
+Code of Conduct
+---------------
+
+Please note that the 'forcats' project is released with a [Contributor Code of Conduct](.github/CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.
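
The Overview in this change lists `fct_reorder()` and `fct_relevel()`, but the README examples only exercise `fct_lump()` and `fct_infreq()`. As a minimal sketch of the other two, assuming forcats is installed, the following uses a small invented data set. The names `size` and `wait` and their values are hypothetical and are not part of the package or of this change.

```r
library(forcats)

# Hypothetical data, invented purely for illustration.
size <- factor(c("small", "large", "medium", "small", "large", "small"))
wait <- c(7, 2, 1, 8, 3, 9)

# By default, factor() sorts levels alphabetically.
levels(size)
#> [1] "large"  "medium" "small"

# fct_relevel(): put the levels in an order chosen by hand.
by_hand <- fct_relevel(size, "small", "medium", "large")
levels(by_hand)
#> [1] "small"  "medium" "large"

# fct_reorder(): order the levels by a summary of another variable.
# The default summary is the median, so the per-level medians here are
# medium = 1, large = 2.5, small = 8.
by_value <- fct_reorder(size, wait)
levels(by_value)
#> [1] "medium" "large"  "small"
```

Neither call changes the values stored in the factor, only the order of its levels, which is what determines the order of bars or axis entries in a plot.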