Switch from stations to station() to allow updating of stations list (fixes #10 finally!)
steffilazerte committed Apr 16, 2021
1 parent f2b95ef commit 159d42d
Showing 25 changed files with 429 additions and 188 deletions.
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -1,7 +1,9 @@
# Generated by roxygen2: do not edit by hand

export(normals_dl)
export(stations)
export(stations_dl)
export(stations_meta)
export(stations_search)
export(weather_dl)
export(weather_interp)
8 changes: 6 additions & 2 deletions NEWS.md
@@ -1,14 +1,18 @@
# weathercan (development version)
# weathercan 0.6.0

## Big changes
- Move from data frame `stations` to function `stations()`. Returns the same
data, but it can be updated with `stations_dl()`, and you can check the ECCC
modification and download dates with `stations_meta()`
- Download climate normals from climate.weather.gc.ca
- More stations available
- More stations available (more than 2x as many!)
- More year ranges available (1981-2010 and 1971-2000;
Note that while climate normals from 1961-1990 are available, they
don't have climate ids making it tricky to download reliably)

## Small changes
- Remove old deprecated function arguments
- Add more test coverage

## Bug fixes
- Download stations data frame from google drive rather than FTP site
30 changes: 0 additions & 30 deletions R/data.R
@@ -1,33 +1,3 @@
#' Station data downloaded from Environment and Climate Change Canada
#'
#' A dataset containing station information downloaded from Environment and
#' Climate Change Canada. Note that a station may have several station IDs,
#' depending on how the data collection has changed over the years. Station
#' information can be updated by running \code{stations_new <- stations_dl()}
#' and then by specifying stn = stations_new in most functions.
#'
#' @format A data frame with 26211 rows and 12 variables:
#' \describe{
#' \item{prov}{Province}
#' \item{station_name}{Station name}
#' \item{station_id}{Environment Canada's station ID number. Required for
#' downloading station data.}
#' \item{climate_id}{Climate ID number}
#' \item{WMO_id}{Climate ID number}
#' \item{TC_id}{Climate ID number}
#' \item{lat}{Latitude of station location in degree decimal format}
#' \item{lon}{Longitude of station location in degree decimal format}
#' \item{elev}{Elevation of station location in metres}
#' \item{tz}{Local timezone excluding any Daylight Savings}
#' \item{interval}{Interval of the data measurements ('hour', 'day', 'month')}
#' \item{start}{Starting year of data record}
#' \item{end}{Ending year of data record}
#' \item{normals}{Whether current climate normals are available for that station}
#' \item{normals_1981_2010}{Whether 1981-2010 climate normals are available for that station}
#' \item{normals_1971_2000}{Whether 1981-2010 climate normals are available for that station}
#' }
#' @source \url{https://climate.weather.gc.ca/index_e.html}
"stations"

#' Hourly weather data for Kamloops
#'
9 changes: 8 additions & 1 deletion R/normals.R
@@ -81,9 +81,16 @@
#' @export

normals_dl <- function(climate_ids, normals_years = "1981-2010",
format = TRUE, stn = weathercan::stations,
format = TRUE, stn = NULL,
verbose = FALSE, quiet = FALSE) {

if(!is.null(stn)){
stop("`stn` is defunct, to use an updated stations data frame ",
"use `stations_dl()` to update the internal data, and ",
"`stations_meta()` to check when it was last updated", call. = FALSE)
}
stn <- stations()

check_ids(climate_ids, stn, type = "climate_id")
check_normals(normals_years)
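
The same `stn` guard appears in `normals_dl()`, `stations_search()`, and `weather_dl()` in this commit. As a reviewer's sketch only (not part of the commit), the pattern could be factored into one helper. `check_defunct_stn()` and `lookup_demo()` are hypothetical names used for illustration, and only base R is assumed.

```r
# Hypothetical helper: fail fast when the retired `stn` argument is supplied.
check_defunct_stn <- function(stn = NULL) {
  if (!is.null(stn)) {
    stop("`stn` is defunct, use `stations_dl()` to update the internal data ",
         "and `stations_meta()` to check when it was last updated",
         call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical stand-in for an exported function that keeps `stn` in its
# signature so that old calls fail with a helpful message.
lookup_demo <- function(ids, stn = NULL) {
  check_defunct_stn(stn)
  paste("Looking up", length(ids), "station(s)")
}

# A normal call works; a call that still passes `stn` raises the error.
lookup_demo(c(5401, 5940))
res <- tryCatch(lookup_demo(5401, stn = data.frame()),
                error = function(e) conditionMessage(e))
```

Keeping `stn = NULL` in the signature means existing scripts that still pass `stn` stop with an explicit explanation rather than an "unused argument" error.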

144 changes: 126 additions & 18 deletions R/stations.R
@@ -1,3 +1,81 @@

#' Access Station data downloaded from Environment and Climate Change Canada
#'
#' This function accesses the built-in stations data frame. You can update this
#' data frame with `stations_dl()` which will update the locally stored data.
#'
#' You can check when this was last updated with `stations_meta()`.
#'
#' @details
#'
#' A dataset containing station information downloaded from Environment and
#' Climate Change Canada. Note that a station may have several station IDs,
#' depending on how the data collection has changed over the years. Station
#' information can be updated by running `stations_dl()`.
#'
#' @format A data frame:
#' \describe{
#' \item{prov}{Province}
#' \item{station_name}{Station name}
#' \item{station_id}{Environment Canada's station ID number. Required for
#' downloading station data.}
#' \item{climate_id}{Climate ID number}
#' \item{WMO_id}{World Meteorological Organization (WMO) ID number}
#' \item{TC_id}{Transport Canada (TC) ID number}
#' \item{lat}{Latitude of station location in degree decimal format}
#' \item{lon}{Longitude of station location in degree decimal format}
#' \item{elev}{Elevation of station location in metres}
#' \item{tz}{Local timezone excluding any Daylight Savings}
#' \item{interval}{Interval of the data measurements ('hour', 'day', 'month')}
#' \item{start}{Starting year of data record}
#' \item{end}{Ending year of data record}
#' \item{normals}{Whether current climate normals are available for that station}
#' \item{normals_1981_2010}{Whether 1981-2010 climate normals are available for that station}
#' \item{normals_1971_2000}{Whether 1971-2000 climate normals are available for that station}
#' }
#' @source \url{https://climate.weather.gc.ca/index_e.html}
#'
#' @return A data frame of stations and their details
#' @export
#'
#' @examples
#'
#' stations()
#' stations_meta()
#'
#' library(dplyr)
#' filter(stations(), interval == "hour", normals == TRUE, prov == "MB")
#'

#'
stations <- function() {

if(abs(difftime(stations_meta()$weathercan_modified,
Sys.Date(), units = "days")) > 28) {
message("The stations data frame hasn't been updated in over 4 weeks. ",
"Consider running `stations_dl()` to update it so you have the ",
"most recent stations list available")
}

stations_read()$stn
}

#' Show stations list meta data
#'
#' Date of ECCC update and date downloaded via weathercan.
#'
#' @export
#'
#' @examples
#' stations_meta()
stations_meta <- function() {
stations_read()$meta
}

stations_read <- function() {
readr::read_rds(system.file("extdata", "stations.rds", package = "weathercan"))
}
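
A minimal base R sketch of the accessor pattern introduced in this hunk: return the stored data frame and nudge the user when the local copy is more than 28 days old. `demo_store`, `is_stale()`, and `stations_demo()` are hypothetical stand-ins and the dates are made up; the package itself reads `stations.rds` with `readr::read_rds()`.

```r
# Hypothetical in-memory stand-in for the object stored in stations.rds
demo_store <- list(
  stn = data.frame(station_id = c(1, 2), prov = c("MB", "BC")),
  meta = list(ECCC_modified = as.POSIXct("2021-04-15 23:30:00", tz = "UTC"),
              weathercan_modified = as.Date("2021-04-16"))
)

# TRUE when the local copy is more than `max_days` old relative to `today`
is_stale <- function(modified, today = Sys.Date(), max_days = 28) {
  age <- as.numeric(difftime(today, modified, units = "days"))
  abs(age) > max_days
}

# Return the stations data frame, with a reminder if the copy is stale
stations_demo <- function(store = demo_store, today = Sys.Date()) {
  if (is_stale(store$meta$weathercan_modified, today)) {
    message("The stations data frame hasn't been updated in over 4 weeks.")
  }
  store$stn
}

stations_demo(today = as.Date("2021-04-20"))
```

Passing `today` as an argument is a choice made for this sketch so the staleness check can be exercised with fixed dates; the function in the commit calls `Sys.Date()` directly.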

#' Get available stations
#'
#' This function can be used to download a Station Inventory CSV file from
@@ -49,8 +127,16 @@
#' @export

stations_dl <- function(skip = NULL, verbose = FALSE, quiet = FALSE) {
stations_dl_internal(skip = skip, verbose = verbose, quiet = quiet,
loc = system.file("extdata", package = "weathercan"))
}

stations_dl_internal <- function(skip = NULL, verbose = FALSE, quiet = FALSE,
loc = NULL) {

if(getRversion() <= "3.3.3") {
# If called internally use inst
if(is.null(loc)) loc <- system.file("inst", "extdata", package = "weathercan")
if(getRversion() < "3.3.4") {
message("Need R version 3.3.4 or greater to update the stations data")
return()
}
@@ -88,7 +174,13 @@ stations_dl <- function(skip = NULL, verbose = FALSE, quiet = FALSE) {
}

if(!quiet) message("According to Environment Canada, ",
grep("Modified Date", headings, value = TRUE))
stringr::str_subset(headings, "Modified Date") %>%
stringr::str_remove_all("[^\001-\177]"))

eccc_meta <- stringr::str_subset(headings, "Modified Date") %>%
stringr::str_remove(stringr::regex("Modified Date:", ignore_case = TRUE)) %>%
lubridate::ymd_hms(truncated = 3)

if(!quiet) {
disclaimer <- paste0(grep("Disclaimer", headings, value = TRUE),
collapse = "\n")
@@ -154,12 +246,25 @@ stations_dl <- function(skip = NULL, verbose = FALSE, quiet = FALSE) {
dplyr::arrange(.data$prov, .data$station_id, .data$interval) %>%
dplyr::as_tibble()

s %>%
s <- s %>%
dplyr::left_join(normals, by = c("station_name", "climate_id")) %>%
dplyr::mutate(dplyr::across(dplyr::contains("normals"),
~tidyr::replace_na(., FALSE)),
normals = .data$normals_1981_2010) %>%
dplyr::relocate(dplyr::contains("normals_"), .after = dplyr::last_col())


stn <- list(stn = s,
meta = list(ECCC_modified = eccc_meta,
weathercan_modified = Sys.Date()))

f <- file.path(loc, "stations.rds")
if(verbose) message("Saving stations data to ", f)
readr::write_rds(x = stn, file = f, compress = "gz")

if(!quiet) message("Stations data saved...\n",
"Use `stations()` to access most recent version and ",
"`stations_meta()` to see when this was last updated")
}
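
The end of `stations_dl_internal()` bundles the stations table and its metadata into one list and writes it to a single `.rds` file. The sketch below shows the same idea with base R's `saveRDS()` and `readRDS()` in place of `readr`, writing to a temporary directory. `save_stations_demo()`, the file name, and the example values are assumptions made for illustration.

```r
# Hypothetical helper: store the data and its provenance together in one file
save_stations_demo <- function(stn, eccc_modified, loc = tempdir()) {
  out <- list(stn = stn,
              meta = list(ECCC_modified = eccc_modified,
                          weathercan_modified = Sys.Date()))
  f <- file.path(loc, "stations_demo.rds")
  saveRDS(out, file = f, compress = "gzip")
  invisible(f)
}

# Made-up example data; write it, then read it back
f <- save_stations_demo(
  stn = data.frame(station_id = c(1, 2), prov = c("MB", "BC")),
  eccc_modified = as.POSIXct("2021-04-15 23:30:00", tz = "UTC")
)
stored <- readRDS(f)
names(stored)
```

Storing both pieces in one file keeps the table and its dates from drifting apart. Note that the commit writes into the installed package's `extdata` directory, which presumably requires that directory to be writable by the user.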

#' Search for stations by name or location
@@ -186,15 +291,15 @@ stations_dl <- function(skip = NULL, verbose = FALSE, quiet = FALSE) {
#' recent normals year range. Default `NULL` does not filter by climate
#' normals. Specific year ranges return stations with normals in that period.
#' See Details for more specifics.
#' @param stn Data frame. The \code{stations} data frame to use. Will use the
#' one included in the package unless otherwise specified.
#' @param starts_latest Numeric. Restrict results to stations with data collection
#' beginning in or before the specified year.
#' @param ends_earliest Numeric. Restrict results to stations with data collection
#' ending in or after the specified year.
#' @param starts_latest Numeric. Restrict results to stations with data
#' collection beginning in or before the specified year.
#' @param ends_earliest Numeric. Restrict results to stations with data
#' collection ending in or after the specified year.
#' @param verbose Logical. Include progress messages
#' @param quiet Logical. Suppress all messages (including messages regarding
#' missing data, etc.)
#' @param stn DEFUNCT. Now use `stations_dl()` to update internal data and
#' `stations_meta()` to check the date it was last updated.
#'
#' @details To search by coordinates, users must make sure they have the
#' [sp](https://cran.r-project.org/package=sp) package installed.
@@ -229,32 +334,34 @@ stations_search <- function(name = NULL,
interval = c("hour", "day", "month"),
normals_years = NULL,
normals_only = NULL,
stn = weathercan::stations,
stn = NULL,
starts_latest = NULL,
ends_earliest = NULL,
verbose = FALSE,
quiet = FALSE) {

if(!is.null(normals_only)) {
warning("`normals_only` is deprecated, use ",
"`normals_years` instead",
.call = FALSE)
warning("`normals_only` is deprecated, switching to ",
"`normals_years = 'current'`", call. = FALSE)
normals_years <- "current"
}
if(!is.null(normals_years) &&
!normals_years %in% c("current", "1981-2010", "1971-2000")) {
stop("`normals_years` must either be `NULL` (don't filter by normals), ",
"'1981-2010' or '1971-2000'", call. = FALSE)
"'current', '1981-2010' or '1971-2000'", call. = FALSE)
}

if(all(is.null(name), is.null(coords)) |
all(!is.null(name), !is.null(coords))) {
stop("Need a search name OR search coordinate")
}

if(is.null(stn)) {
stn <- weathercan::stations
message("No valid stn data frame supplied, using built-in")
if(!is.null(stn)){
stop("`stn` is defunct, to use an updated stations data frame ",
"use `stations_dl()` to update the internal data, and ",
"`stations_meta()` to check when it was last updated", call. = FALSE)
}
stn <- stations()

if(!is.null(coords)) {
suppressWarnings({
@@ -274,7 +381,8 @@ stations_search <- function(name = NULL,

check_int(interval)

stn <- dplyr::filter(stn, .data$interval %in% !! interval, !is.na(.data$start))
stn <- dplyr::filter(stations(),
.data$interval %in% !! interval, !is.na(.data$start))

if(!is.null(normals_years)) {
yr <- "normals"
14 changes: 11 additions & 3 deletions R/weather.R
@@ -62,14 +62,15 @@
#' @param string_as Character. What value to replace character strings in a
#' numeric measurement with. See Details.
#' @param time_disp Character. Either "none" (default) or "UTC". See details.
#' @param stn Data frame. The \code{stations} data frame to use. Will use the
#' one included in the package unless otherwise specified.
#' @param encoding Character. Text encoding for download.
#' @param list_col Logical. Return data as nested data set? Defaults to FALSE.
#' Only applies if `format = TRUE`
#' @param verbose Logical. Include progress messages
#' @param quiet Logical. Suppress all messages (including messages regarding
#' missing data, etc.)
#' @param stn DEFUNCT. Now use `stations_dl()` to update internal data and
#' `stations_meta()` to check the date it was last updated.

#'
#' @return A tibble with station ID, name and weather data.
#'
@@ -104,7 +105,7 @@ weather_dl <- function(station_ids,
format = TRUE,
string_as = NA,
time_disp = "none",
stn = weathercan::stations,
stn = NULL,
encoding = "UTF-8",
list_col = FALSE,
verbose = FALSE,
@@ -123,6 +124,13 @@ weather_dl <- function(station_ids,
stop("'interval' must be either 'hour', 'day', OR 'month'")
}

if(!is.null(stn)){
stop("`stn` is defunct, to use an updated stations data frame ",
"use `stations_dl()` to update the internal data, and ",
"`stations_meta()` to check when it was last updated", call. = FALSE)
}
stn <- stations()

check_int(interval)

w_all <- data.frame()
8 changes: 8 additions & 0 deletions R/weathercan-pkg.R
@@ -1,3 +1,11 @@
.onAttach <- function(libname, pkgname) {
packageStartupMessage(
"weathercan v", utils::packageVersion("weathercan"), "\n",
"The included data `stations` has been ",
"deprecated in favour of the function `stations()`.\n",
"See ?stations for more details.")
}

#' Easy downloading of weather data from Environment and Climate Change Canada
#'
#' \code{weathercan} is an R package for simplifying the downloading of
23 changes: 19 additions & 4 deletions README.Rmd
@@ -63,11 +63,11 @@ To download data, you first need to know the `station_id` associated with the st

### Stations

`weathercan` includes a data frame called `stations` which includes a list of stations and their details (including `station_id`.
`weathercan` includes the function `stations()` which returns a list of stations and their details (including `station_id`).

```{r}
head(stations)
glimpse(stations)
head(stations())
glimpse(stations())
```

You can look through this data frame directly, or you can use the `stations_search` function:
@@ -84,6 +84,21 @@ You can also search by proximity:
stations_search(coords = c(50.667492, -120.329049), dist = 20, interval = "hour")
```

You can update this list of stations with

```{r}
stations_dl()
```

And check when it was last updated with
```{r}
stations_meta()
```

**Note:** For reproducibility, if you are using the stations list to gather your
data, it can be a good idea to take note of the ECCC date of modification and
include it in your reports/manuscripts.
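
One possible way to capture that note is sketched below. `stations_provenance()` is a hypothetical helper, and `demo_meta` is made-up metadata that only mirrors the two element names (`ECCC_modified`, `weathercan_modified`) that `stations_dl()` stores.

```r
# Hypothetical helper: build a one-line provenance note from the metadata list
stations_provenance <- function(meta) {
  paste0("Stations list modified by ECCC on ",
         format(meta$ECCC_modified, "%Y-%m-%d"),
         "; downloaded via weathercan on ",
         format(meta$weathercan_modified, "%Y-%m-%d"))
}

# Made-up metadata for illustration only
demo_meta <- list(ECCC_modified = as.POSIXct("2021-04-15 23:30:00", tz = "UTC"),
                  weathercan_modified = as.Date("2021-04-16"))

note <- stations_provenance(demo_meta)
note
```

With the package installed, `stations_provenance(stations_meta())` should produce the equivalent line for your local copy of the stations list, assuming the metadata keeps this structure.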

### Weather

Once you have your `station_id`(s) you can download weather data:
@@ -135,7 +150,7 @@ The data and the code in this repository are licensed under multiple licences. A

**[`CHCN`](https://cran.r-project.org/package=CHCN)**

`CHCN` is an older package last updated in 2012. Unfortunately, ECCC updated their services within the last couple of years which caused a great many of the previous web scrapers to fail. `CHCN` relies on a decommisioned [older web-scraper](https://quickcode.io/) and so is currently broken.
`CHCN` is an older package last updated in 2012. Unfortunately, ECCC updated their services within the last couple of years which caused a great many of the previous web scrapers to fail. `CHCN` relies on a decommissioned [older web-scraper](https://quickcode.io/) and so is currently broken.

## Contributions
