usdatasets package #760

Open
2 of 15 tasks
jonthegeek opened this issue Oct 15, 2024 · 0 comments
@jonthegeek
Collaborator

I haven't fully investigated this, but there's a new package on CRAN with a bunch of datasets: https://cran.r-project.org/package=usdatasets

We'll likely want to split this into a ticket per dataset (or maybe a collection of related datasets).

  • This dataset has not already been used in TidyTuesday.

  • The dataset will (probably) be less than 20MB when saved as a tidy CSV.

  • I can imagine a data visualization related to this dataset.

  • title: A short name for the dataset (to be used in the social media post, "This week we're exploring DATASET NAME"). Examples: "Canadian NHL Player Birth Dates", "R Consortium ISC Grants", or "Leap Day".

  • article: An example of the dataset being used, such as a blog post or a README about the dataset.

    • title: The title of the article.
    • url: The link to the article.
  • data_source: A source where the dataset can be downloaded.

    • title: The title of the source.
    • url: The link to the source.
  • images: One or more images related to the dataset. For each image, provide:

    • file: A url to download the image, or an attached file.
    • alt: Text that can serve as a replacement for the image for those who cannot see the image (whether through visual impairment or because the image does not load).
  • cleaning_script: A script to fetch and clean the data, resulting in one or more data.frames (or equivalent structures) that can be saved as CSVs.

# ADD SCRIPT HERE
  • data_dictionary: A description of each column in the dataset, including the column name, the data type of the column, and a description of the column.
| variable | class | description |
|----------|-------|-------------|
| VARIABLE | CLASS | DESCRIPTION |
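
As a starting point for splitting this into one ticket per dataset, something like the following could list what a package ships and estimate whether each data.frame stays under the 20MB CSV limit from the checklist. This is only a rough sketch: `csv_sizes()` and its arguments are hypothetical names made up for illustration, it assumes the package is already installed, and it has not been run against usdatasets.

```r
# Hypothetical helper (not part of usdatasets or TidyTuesday): estimate the
# CSV size of every data.frame that an installed package ships.
csv_sizes <- function(pkg, limit_mb = 20) {
  listing <- utils::data(package = pkg)$results
  item <- listing[, "Item"]

  # Items look like "name" or "name (file)"; data() loads by file name.
  obj_name <- sub(" .*$", "", item)
  file_name <- ifelse(
    grepl("(", item, fixed = TRUE),
    sub("^.*\\((.*)\\)$", "\\1", item),
    obj_name
  )

  sizes <- vapply(seq_along(item), function(i) {
    env <- new.env()
    utils::data(list = file_name[i], package = pkg, envir = env)
    if (!exists(obj_name[i], envir = env, inherits = FALSE)) return(NA_real_)
    obj <- get(obj_name[i], envir = env, inherits = FALSE)
    # Only data.frames are measured; other objects (ts, matrix, ...) get NA.
    if (!is.data.frame(obj)) return(NA_real_)

    path <- tempfile(fileext = ".csv")
    on.exit(unlink(path), add = TRUE)
    # Anything that cannot be written as a flat CSV is reported as NA.
    tryCatch({
      utils::write.csv(obj, path, row.names = FALSE)
      file.size(path) / 1024^2
    }, error = function(e) NA_real_)
  }, numeric(1))

  out <- data.frame(
    dataset = obj_name,
    size_mb = sizes,
    stringsAsFactors = FALSE
  )
  out$under_limit <- !is.na(out$size_mb) & out$size_mb < limit_mb
  # Largest first; entries that could not be measured sort to the end.
  out[order(out$size_mb, decreasing = TRUE), ]
}
```

Calling `csv_sizes("usdatasets")` after `install.packages("usdatasets")` should then return one row per dataset, which could be used to decide which ones deserve their own ticket. The sizes come from `utils::write.csv()`, so they are estimates and may differ slightly from whatever the final cleaning script writes.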