Update documentation to midwest variables #4274

Merged
39 changes: 22 additions & 17 deletions R/data.R
@@ -48,16 +48,21 @@

#' Midwest demographics
#'
#' Demographic information of midwest counties
#' Demographic information of midwest counties from 2000 US census
#'
#' Note: this dataset is included for illustrative purposes. The original
#' descriptions were not documented and the current descriptions here are based
#' on speculation. For more accurate and up-to-date US census data, see the
#' [`acs` package](https://cran.r-project.org/package=acs).
#'
#' @format A data frame with 437 rows and 28 variables:
#' \describe{
#' \item{PID}{}
#' \item{county}{}
#' \item{state}{}
#' \item{area}{}
#' \item{poptotal}{Total population}
#' \item{popdensity}{Population density}
#' \item{PID}{Unique county identifier.}
#' \item{county}{County name.}
#' \item{state}{State to which county belongs to.}
#' \item{area}{Area of county (units unknown).}
@erictleung (Contributor, Author), Nov 27, 2020:

For area, I tried looking up the land area units, but nothing reasonable turned up, so I've marked the units as unknown.

For the other variables here (PID, county, and state), I'm just spelling out what they are, to be thorough.
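
One way to probe the units further, sketched here in base R, is to compare area with the area implied by poptotal / popdensity. The names implied_area and area_ratio are introduced only for this illustration, and the sketch makes no claim about what the ratio turns out to be.

```r
library(ggplot2)

data("midwest")

# Area implied by the population columns, in whatever unit popdensity uses.
# `implied_area` is a hypothetical helper name, not a column of the data set.
implied_area <- midwest$poptotal / midwest$popdensity

# Ratio of implied to recorded area; a near-constant ratio would hint at a
# simple unit conversion between the two columns.
area_ratio <- implied_area / midwest$area
summary(area_ratio)
```

If the ratio is roughly constant across counties, area and popdensity share a unit up to a scale factor, which narrows down the candidates without identifying the unit itself.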

#' \item{poptotal}{Total population.}
#' \item{popdensity}{Population density (person/unit area).}
#' \item{popwhite}{Number of whites.}
#' \item{popblack}{Number of blacks.}
#' \item{popamerindian}{Number of American Indians.}
@@ -69,17 +74,17 @@
#' \item{percasian}{Percent Asian.}
#' \item{percother}{Percent other races.}
#' \item{popadults}{Number of adults.}
#' \item{perchsd}{}
#' \item{perchsd}{Percent with high school diploma.}
PR author (Contributor):

Not very reliable, but I found someone's homework that describes perchsd as "the percentage of people with a high school diploma in each county".

I couldn't verify this, but given the surrounding variables on education status (percollege and percprof), I figured this refers to a high school diploma.

#' \item{percollege}{Percent college educated.}
#' \item{percprof}{Percent profession.}
#' \item{poppovertyknown}{}
#' \item{percpovertyknown}{}
#' \item{percbelowpoverty}{}
#' \item{percchildbelowpovert}{}
#' \item{percadultpoverty}{}
#' \item{percelderlypoverty}{}
#' \item{inmetro}{In a metro area.}
#' \item{category}{}
#' \item{percprof}{Percent with professional degree.}
@erictleung (Contributor, Author), Nov 27, 2020:

For percprof, I got the same hint from that homework assignment. Given the context of the other variables (perchsd and percollege), it makes more sense for this to be a professional degree rather than merely "profession". The percentages of professional degrees in the data set are much lower than the percollege values, which is a good sanity check.
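
The sanity check described above could look roughly like this in base R. It is a sketch rather than the check that was actually run; edu_means is a name chosen here for illustration.

```r
library(ggplot2)

data("midwest")

# Mean of each education variable across counties, in percent
edu_means <- colMeans(midwest[, c("perchsd", "percollege", "percprof")])
edu_means

# Number of counties where percprof is not below percollege
sum(midwest$percprof >= midwest$percollege)
```

An ordering of perchsd above percollege above percprof is what one would expect if the three columns describe nested levels of education.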

#' \item{poppovertyknown}{Population with known poverty status.}
#' \item{percpovertyknown}{Percent of population with known poverty status.}
#' \item{percbelowpoverty}{Percent of people below poverty line.}
#' \item{percchildbelowpovert}{Percent of children below poverty line.}
#' \item{percadultpoverty}{Percent of adults below poverty line.}
#' \item{percelderlypoverty}{Percent of elderly below poverty line.}
Comment on lines +80 to +85
@erictleung (Contributor, Author), Nov 27, 2020:

poppovertyknown is always less than the total population, poptotal, so I inferred that it counts the individuals whose poverty status is actually known.

I validated this by recomputing the percentage by hand (poppovertyknown / poptotal * 100) and checking that it equals percpovertyknown in the data set.

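library(ggplot2)
library(dplyr)

data("midwest")

# Count the counties where the manual calculation disagrees with the data set
midwest %>%
  select(poptotal, poppovertyknown, percpovertyknown) %>%
  mutate(manual_percpoverty = poppovertyknown / poptotal * 100) %>%
  # near() compares element-wise within a tolerance, whereas all.equal()
  # returns a single value for the whole column and cannot be negated per row
  mutate(not_equal_per = !near(percpovertyknown, manual_percpoverty, tol = 1e-3)) %>%
  pull(not_equal_per) %>%
  sum()
#> [1] 0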

Created on 2020-11-27 by the reprex package (v0.3.0)
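
The premise that poppovertyknown never exceeds poptotal can also be checked directly. The following is a minimal base R sketch; n_violations is a name introduced here for illustration.

```r
library(ggplot2)

data("midwest")

# Counties where the known-status count exceeds the total population;
# zero such rows would be consistent with the inference above.
n_violations <- sum(midwest$poppovertyknown > midwest$poptotal)
n_violations

# Range of the share of residents with known poverty status, in percent
range(midwest$percpovertyknown)
```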

The descriptions of the remaining variables (percbelowpoverty, percchildbelowpovert, percadultpoverty, and percelderlypoverty) are my own inferences from the variable names. The age groups for poverty (child, adult, elderly) also appear in poverty reports (PDF) as "Under 18 years", "18 to 64 years", and "65 years and over".

#' \item{inmetro}{County considered in a metro area.}
@erictleung (Contributor, Author), Nov 27, 2020:

For inmetro, this merely adds detail on what is in a metro area, namely the county.

#' \item{category}{Miscellaneous.}
@erictleung (Contributor, Author), Nov 27, 2020:

For category, I could not find out what this was, but I didn't want to leave it blank, so I just gave it this description. Happy to remove it if it is pointless.
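
A quick look at the values might help whoever picks this up later. The sketch below only tabulates the codes and cross-tabulates them against inmetro; it does not assume any particular meaning for them, and category_counts is a name chosen here for illustration.

```r
library(ggplot2)

data("midwest")

# Frequency of each category code
category_counts <- table(midwest$category)
category_counts

# Cross-tabulate the codes against inmetro to look for a pattern
table(midwest$category, midwest$inmetro)
```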

#' }
#'
"midwest"
40 changes: 23 additions & 17 deletions man/midwest.Rd

Some generated files are not rendered by default.