This repository has been archived by the owner on Feb 4, 2022. It is now read-only.

Merge pull request #223 from jennybc/formulas
expose formulas and (un)formatted numbers
Jennifer (Jenny) Bryan committed Mar 15, 2016
2 parents 464d33a + 3917c5f commit fe81485
Showing 46 changed files with 1,833 additions and 614 deletions.
8 changes: 8 additions & 0 deletions NAMESPACE
@@ -15,6 +15,10 @@ export(gs_copy)
export(gs_delete)
export(gs_download)
export(gs_edit_cells)
export(gs_ff)
export(gs_ff_key)
export(gs_ff_url)
export(gs_ff_ws_feed)
export(gs_gap)
export(gs_gap_key)
export(gs_gap_url)
@@ -24,6 +28,10 @@ export(gs_gs)
export(gs_inspect)
export(gs_key)
export(gs_ls)
export(gs_mini_gap)
export(gs_mini_gap_key)
export(gs_mini_gap_url)
export(gs_mini_gap_ws_feed)
export(gs_new)
export(gs_read)
export(gs_read_cellfeed)
10 changes: 5 additions & 5 deletions NEWS.md
@@ -10,11 +10,11 @@
* `gs_add_row()` now works for two-dimensional `input`, by calling itself once per row of `input` (#188, @jimhester).
* Updated the scope for the Drive API. It is possible that new/updated Drive functions will require a token obtained with the new scope. This could mean that tokens stored and loaded from file in a non-interactive environment will need to be remade.
* `gs_read_listfeed()` now supports parameters to manipulate data in the API call itself: `reverse` inverts row order, `orderby` selects a column to sort on, `sq` accepts a structured query to filter rows. (#17)
* We explicitly try to match the behavior and interface of `readr::read_csv()` for all data ingest. The read functions `gs_read()`, `gs_read_csv()`, and `gs_read_listfeed()` and the reshaper `gs_reshape_cellfeed()` should all return the same data frame when operating on the same worksheet. This should also match what `readr::read_csv()` would return on a `.csv` file exported from that worksheet.
- If you're not happy with the defaults, take control of `readr`-style data ingest via the `...` arguments of `gs_read*` or reshape functions. You can now specify `col_types`, `col_names`, `locale`, `na`, `trim_ws`, etc. here. The type conversion arguments for `gs_simplify_cellfeed()` have also changed accordingly.
- Column types: read the [`readr` vignette on column types](https://cran.r-project.org/web/packages/readr/vignettes/column-types.html) to better understand the automatic variable conversion behaviour and how to use the `col_types` argument to override it.
- Column names: Use the `col_names` argument instead of `header`. If character, it should provide actual names. If logical, `TRUE` implies that column names should be taken from the first row and `FALSE` requests column names like `X1`, `X2`, etc.
- `gs_read_listfeed()` doesn't use API-transformed column names anymore. They should now be the same as those from the other read functions.
* `gs_read_listfeed()` doesn't return API-mangled column names anymore. They should now be the same as those from the other read functions and what you see in the browser.
* `readr`-style data ingest: We explicitly try to match the interface of `readr::read_csv()`. The read functions `gs_read()`, `gs_read_csv()`, and `gs_read_listfeed()` and the reshaper `gs_reshape_cellfeed()` should all return the same data frame when operating on the same worksheet. And this should match what `readr::read_csv()` would return on a `.csv` file exported from that worksheet. The type conversion arguments for `gs_simplify_cellfeed()` have also changed accordingly.
- The `header` argument is no longer accepted.
- If you're not happy with the defaults, take control via the `...` arguments of `gs_read*` or reshape functions. You can now specify `col_types`, `col_names`, `locale`, `na`, `trim_ws`, etc. here.
- See the sections "Controlling data ingest, theory and practice" in [the basic usage vignette](https://github.com/jennybc/googlesheets/blob/master/vignettes/basic-usage.md) for details and examples.
- `readr` exception #1: variables that consist entirely of missing values will be `NA` of the logical type, not `NA_character_`.
- `readr` exception #2: `googlesheets` will never return a data frame with `NA` as a variable name. Instead, it will create a dummy variable name, like `X5`.
- `readr` exception #3: All read/reshape functions accept `check.names`, in the spirit of `utils::read.table()`, which defaults to `FALSE`. If `TRUE`, variable names will be run through `make.names(..., unique = TRUE)`. (#208)
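
As a rough illustration of `readr` exception #3, the sketch below shows what running names through `make.names(..., unique = TRUE)` does in base R. The header values are invented for the example; only the `make.names()` call is taken from the note above, and this is not the package's own code.

```r
# Hypothetical raw header row, as it might appear in a worksheet
raw_names <- c("country name", "country name", "2007", "pop")

# What check.names = TRUE is described as doing: make names syntactically
# valid and unique via make.names()
clean_names <- make.names(raw_names, unique = TRUE)
print(clean_names)
```

With the default `check.names = FALSE`, names such as `country name` would be left as they are, which is closer to what `readr::read_csv()` returns.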
3 changes: 2 additions & 1 deletion R/gs_edit_cells.R
@@ -120,7 +120,8 @@ gs_edit_cells <- function(ss, ws = 1, input = '', anchor = 'A1',
"inputValue" = update_value)))
}
update_entries <- cells_df %>%
dplyr::select_(quote(-cell_alt), quote(-cell_text)) %>%
dplyr::select_(quote(-cell_alt), quote(-value),
quote(-input_value), quote(-numeric_value)) %>%
purrr::pmap(f)

update_feed <-
108 changes: 92 additions & 16 deletions R/gs_example-sheet-setup.R
@@ -8,7 +8,24 @@
assign("gap_purl",
"https://w3id.org/people/jennybc/googlesheets_gap_url",
envir = .gs_exsheets)
assign("gap_fallback_key", "1BzfL0kZUz1TsI5zxJF1WNF01IxvC67FbOJUiiGMZ_mQ",
assign("gap_fallback_key",
"1BzfL0kZUz1TsI5zxJF1WNF01IxvC67FbOJUiiGMZ_mQ",
envir = .gs_exsheets)

## persistent browser URL for mini gapminder example sheet
## (owned by rpackagetest) PLUS fall back key
assign("mini_gap_purl",
"https://w3id.org/people/jennybc/googlesheets_mini_gap_url",
envir = .gs_exsheets)
assign("mini_gap_fallback_key", "1BMtx1V2pk2KG2HGANvvBOaZM4Jx1DUdRrFdEx-OJIGY",
envir = .gs_exsheets)

## persistent browser URL for formula and formatting example sheet
## (owned by rpackagetest) PLUS fall back key
assign("ff_purl",
"https://w3id.org/people/jennybc/googlesheets_ff_url",
envir = .gs_exsheets)
assign("ff_fallback_key", "132Ij_8ggTKVLnLqCOM3ima6mV9F8rmY7HEcR-5hjWoQ",
envir = .gs_exsheets)

#' Examples of Google Sheets
@@ -22,12 +39,13 @@ assign("gap_fallback_key", "1BzfL0kZUz1TsI5zxJF1WNF01IxvC67FbOJUiiGMZ_mQ",
#'
#' \item \href{https://w3id.org/people/jennybc/googlesheets_gap_url}{Gapminder
#' sheet}
#' \item \href{https://w3id.org/people/jennybc/googlesheets_mini_gap_url}{mini
#' Gapminder sheet}
#' \item \href{https://w3id.org/people/jennybc/googlesheets_ff_url}{Sheet with
#' numeric formatting and formulas}
#'
#' }
#'
#' @param visibility either "public" (the default) or "private"; used when
#' producing a worksheets feed
#'
#' @return the key, browser URL, worksheets feed or \code{\link{googlesheet}}
#' object corresponding to one of the example sheets
#'
@@ -38,38 +56,95 @@ assign("gap_fallback_key", "1BzfL0kZUz1TsI5zxJF1WNF01IxvC67FbOJUiiGMZ_mQ",
#' browseURL(gs_gap_url())
#' gs_gap_ws_feed() # not so interesting to a user!
#' gs_gap()
#'
#' gs_ff_key()
#' gs_ff_url()
#' gs_ff()
#' gs_browse(gs_ff())
#' }
#'
#' @name example-sheets
NULL

#' @rdname example-sheets
#' @describeIn example-sheets Gapminder sheet key
#' @export
gs_gap_key <- function() {

if(is.null(get0("gap_key", .gs_exsheets))) gs_example_resolve("gap")
if (is.null(get0("gap_key", .gs_exsheets))) gs_example_resolve("gap")
get("gap_key", envir = .gs_exsheets)

}

#' @rdname example-sheets
#' @describeIn example-sheets Gapminder sheet URL
#' @export
gs_gap_url <- function() gs_gap_key() %>% construct_url_from_key()

#' @rdname example-sheets
#' @describeIn example-sheets Gapminder sheet worksheets feed
#' @export
gs_gap_ws_feed <- function(visibility = "public") {
gs_gap_ws_feed <- function() {
gs_gap_key() %>%
construct_ws_feed_from_key(visibility)
construct_ws_feed_from_key(visibility = "public")
}

#' @rdname example-sheets
#' @describeIn example-sheets Gapminder sheet as registered googlesheet
#' @export
gs_gap <- function() {
gs_gap_key() %>%
gs_key(lookup = FALSE, verbose = FALSE)
}

#' @describeIn example-sheets mini Gapminder sheet key
#' @export
gs_mini_gap_key <- function() {
if (is.null(get0("mini_gap_key", .gs_exsheets))) gs_example_resolve("mini_gap")
get("mini_gap_key", envir = .gs_exsheets)
}

#' @describeIn example-sheets mini Gapminder sheet URL
#' @export
gs_mini_gap_url <- function() gs_mini_gap_key() %>% construct_url_from_key()

#' @describeIn example-sheets mini Gapminder sheet worksheets feed
#' @export
gs_mini_gap_ws_feed <- function() {
gs_mini_gap_key() %>%
construct_ws_feed_from_key(visibility = "public")
}

#' @describeIn example-sheets mini Gapminder sheet as registered googlesheet
#' @export
gs_mini_gap <- function() {
gs_mini_gap_key() %>%
gs_key(lookup = FALSE, verbose = FALSE)
}

#' @describeIn example-sheets Key to a sheet with numeric formatting and
#' formulas
#' @export
gs_ff_key <- function() {
if (is.null(get0("ff_key", .gs_exsheets))) gs_example_resolve("ff")
get("ff_key", envir = .gs_exsheets)
}

#' @describeIn example-sheets URL for a sheet with numeric formatting and
#' formulas
#' @export
gs_ff_url <- function() gs_ff_key() %>% construct_url_from_key()

#' @describeIn example-sheets Worksheets feed for a sheet with numeric
#' formatting and formulas
#' @export
gs_ff_ws_feed <- function() {
gs_ff_key() %>%
construct_ws_feed_from_key(visibility = "public")
}

#' @describeIn example-sheets Registered googlesheet for a sheet with numeric
#' formatting and formulas
#' @export
gs_ff <- function() {
gs_ff_key() %>%
gs_key(lookup = FALSE, verbose = FALSE)
}

## not exported
## attempt to resolve the persistent URL of an example sheet
gs_example_resolve <- function(ex) {
@@ -78,13 +153,14 @@ gs_example_resolve <- function(ex) {
ex_key <- paste(ex, "key", sep = "_")
ex_fallback_key <- paste(ex, "fallback_key", sep = "_")
req <- try(httr::GET(get(ex_purl, envir = .gs_exsheets)), silent = TRUE)
if(inherits(req, "response") && httr::status_code(req) == 200) {
if (inherits(req, "response") && httr::status_code(req) == 200) {
assign(ex_key, extract_key_from_url(req$url), envir = .gs_exsheets)
return(invisible(TRUE))
} else {
mpf(paste("googlesheets: can't resolve persistent URL for example sheet",
"\"%s\" online; falling back to static default.", ex))
assign(ex_key, ex_fallback_key, envir = .gs_exsheets)
"\"%s\" online; falling back to static default."), ex)
assign(ex_key, get(ex_fallback_key, envir = .gs_exsheets),
envir = .gs_exsheets)
return(invisible(FALSE))
}
}
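
The fallback change in `gs_example_resolve()` can be sketched without network access. In the minimal example below, `exsheets` and `resolve_sketch()` are hypothetical stand-ins for `.gs_exsheets` and the real function, and `resolved_key` is an assumed parameter that mimics the outcome of the HTTP lookup. The point is that the fallback branch must store the value held under `<ex>_fallback_key`, not the name of that variable.

```r
# Hypothetical stand-in for the package-internal environment .gs_exsheets
exsheets <- new.env(parent = emptyenv())
assign("ff_fallback_key", "132Ij_8ggTKVLnLqCOM3ima6mV9F8rmY7HEcR-5hjWoQ",
       envir = exsheets)

# resolved_key mimics the result of the online lookup: a key on success,
# NULL on failure (assumption: this sketch makes no network call)
resolve_sketch <- function(ex, resolved_key = NULL, env = exsheets) {
  ex_key <- paste(ex, "key", sep = "_")
  ex_fallback_key <- paste(ex, "fallback_key", sep = "_")
  if (!is.null(resolved_key)) {
    assign(ex_key, resolved_key, envir = env)
    return(invisible(TRUE))
  }
  # Store the fallback key's value, not the string "ff_fallback_key"
  assign(ex_key, get(ex_fallback_key, envir = env), envir = env)
  invisible(FALSE)
}

ok <- resolve_sketch("ff")                 # simulate a failed lookup
ff_key <- get("ff_key", envir = exsheets)  # now holds the static key
```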
32 changes: 20 additions & 12 deletions R/gs_read.R
@@ -5,18 +5,20 @@
#' and transformation, but you can always call them directly for finer
#' control.
#'
#' If the \code{range} argument is not specified, all data will be read via
#' \code{\link{gs_read_csv}}. Don't worry -- no intermediate \code{*.csv} files
#' are written! We just request the data from the Sheets API via the
#' \code{exportcsv} link.
#' If the \code{range} argument is not specified and \code{literal = TRUE}, all
#' data will be read via \code{\link{gs_read_csv}}. Don't worry -- no
#' intermediate \code{*.csv} files are written! We just request the data from
#' the Sheets API via the \code{exportcsv} link.
#'
#' If the \code{range} argument is specified, data will be read for the
#' targeted cells via \code{\link{gs_read_cellfeed}}, then reshaped with
#' \code{\link{gs_reshape_cellfeed}}.
#' If the \code{range} argument is specified or if \code{literal = FALSE}, data
#' will be read for the targeted cells via \code{\link{gs_read_cellfeed}}, then
#' reshaped and type converted with \code{\link{gs_reshape_cellfeed}}. See
#' \code{\link{gs_reshape_cellfeed}} for details.
#'
#' @template ss
#' @template ws
#' @template range
#' @template literal
#' @template read-ddd
#' @template verbose
#'
@@ -34,29 +36,35 @@
#' str(oceania_csv)
#' oceania_csv
#'
#' gs_read(gap_ss, ws = "Europe", n_max = 4, col_types = c("cccccc"))
#'
#' gs_read(gap_ss, ws = "Oceania", range = "A1:C4")
#' gs_read(gap_ss, ws = "Oceania", range = "R1C1:R4C3")
#' gs_read(gap_ss, ws = "Oceania", range = "R2C1:R4C3", col_names = FALSE)
#' gs_read(gap_ss, ws = "Oceania", range = "R2C5:R4C6",
#' col_names = c("thing_one", "thing_two"))
#' gs_read(gap_ss, ws = "Oceania", range = cell_limits(c(1, 4), c(1, 3)))
#' gs_read(gap_ss, ws = "Oceania", range = cell_limits(c(1, 3), c(1, 4)),
#' col_names = FALSE)
#' gs_read(gap_ss, ws = "Oceania", range = cell_rows(1:5))
#' gs_read(gap_ss, ws = "Oceania", range = cell_cols(4:6))
#' gs_read(gap_ss, ws = "Oceania", range = cell_cols("A:D"))
#' gs_read(gap_ss, ws = "Oceania", range = cell_rows(1), col_names = FALSE)
#'
#' ff_ss <- gs_ff() # register example sheet with formulas and formatted nums
#' gs_read(ff_ss) # almost all vars are character
#' gs_read(ff_ss, literal = FALSE) # more vars are properly numeric
#' }
#'
#' @export
gs_read <- function(
ss, ws = 1,
range = NULL,
range = NULL, literal = TRUE,
..., verbose = TRUE) {

if(is.null(range)) {
if (is.null(range) && literal) {
gs_read_csv(ss, ws = ws, ..., verbose = verbose)
} else {
gs_read_cellfeed(ss, ws = ws, range = range, ..., verbose = verbose) %>%
gs_reshape_cellfeed(..., verbose = verbose)
gs_reshape_cellfeed(literal = literal, ..., verbose = verbose)
}

}
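
The new dispatch rule in `gs_read()` reduces to a single condition on `range` and `literal`. The helper below, `read_path()`, is hypothetical and only reports which reader would be chosen; it does not call the Sheets API.

```r
# Hypothetical helper: names the reader gs_read() would dispatch to
read_path <- function(range = NULL, literal = TRUE) {
  if (is.null(range) && literal) "csv" else "cellfeed"
}

paths <- c(
  default     = read_path(),                 # no range, literal = TRUE
  with_range  = read_path(range = "A1:C4"),  # any range forces the cell feed
  not_literal = read_path(literal = FALSE)   # so does literal = FALSE
)
print(paths)
```

Only the default combination takes the `exportcsv` route; every other combination goes through the cell feed and the reshaper.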
41 changes: 20 additions & 21 deletions R/gs_read_cellfeed.R
@@ -1,8 +1,8 @@
#' Read data from cells
#'
#' This function consumes data via the "cell feed", which, as the name suggests,
#' retrieves data cell by cell. Note that the output is a \code{tbl_df} or
#' \code{data.frame} with \strong{one row per cell}. Consult the Google Sheets API documentation for
#' retrieves data cell by cell. Note that the output is a data frame with
#' \strong{one row per cell}. Consult the Google Sheets API documentation for
#' more details about
#' \href{https://developers.google.com/google-apps/spreadsheets/data#work_with_cell-based_feeds}{the
#' cell feed}.
@@ -20,9 +20,7 @@
#' Empty cells, even if "embedded" in a rectangular region of populated cells,
#' are not normally returned by the cell feed. This function won't return them
#' either when \code{return_empty = FALSE} (default), but will if you set
#' \code{return_empty = TRUE}. If you don't specify any limits AND you set
#' \code{return_empty = TRUE}, you could be in for a bit of a wait, as the feed
#' will return all cells, which defaults to 1000 rows and 26 columns.
#' \code{return_empty = TRUE}.
#'
#' @template ss
#' @template ws
Expand All @@ -43,12 +41,12 @@
#' @examples
#' \dontrun{
#' gap_ss <- gs_gap() # register the Gapminder example sheet
#' first_4_rows <-
#' col_4_and_above <-
#' gs_read_cellfeed(gap_ss, ws = "Asia", range = cell_limits(c(NA, 4)))
#' first_4_rows
#' gs_reshape_cellfeed(first_4_rows)
#' gs_reshape_cellfeed(gs_read_cellfeed(gap_ss, "Asia",
#' range = cell_limits(c(NA, 4), c(3, NA))))
#' col_4_and_above
#' gs_reshape_cellfeed(col_4_and_above)
#'
#' gs_read_cellfeed(gap_ss, range = "A2:F3")
#' }
#' @family data consumption functions
#'
@@ -100,7 +98,9 @@ gs_read_cellfeed <- function(
cell_alt = character(),
row = integer(),
col = integer(),
cell_text = character(),
value = character(),
input_value = character(),
numeric_value = character(),
edit_link = character(),
cell_id = character())
} else {
@@ -126,20 +126,19 @@ gs_read_cellfeed <- function(
col = ~xml2::xml_find_all(x, ".//gs:cell", ns) %>%
xml2::xml_attr("col") %>%
as.integer(),
cell_text = ~xml2::xml_find_all(x, ".//gs:cell", ns) %>%
xml2::xml_text()
value = ~xml2::xml_find_all(x, ".//gs:cell", ns) %>%
xml2::xml_text(),
input_value = ~xml2::xml_find_all(x, ".//gs:cell", ns) %>%
xml2::xml_attr("inputValue"),
numeric_value = ~xml2::xml_find_all(x, ".//gs:cell", ns) %>%
xml2::xml_attr("numericValue")
))
# see issue #19 about all the places cell data is (mostly redundantly)
# stored in the XML, such as: content_text = x$content$text,
# cell_inputValue = x$cell$.attrs["inputValue"], cell_numericValue =
# x$cell$.attrs["numericValue"], when/if we think about formulas
# explicitly, we will want to come back and distinguish between inputValue
# and numericValue
}

x <- x %>%
dplyr::select_(~ cell, ~ cell_alt, ~ row, ~ col, ~ cell_text,
~ edit_link, ~ cell_id)
dplyr::select_(~cell, ~cell_alt, ~row, ~col,
~value, ~input_value, ~numeric_value,
~edit_link, ~cell_id)

attr(x, "ws_title") <- this_ws$ws_title
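
To make the three new columns concrete, here is a toy data frame shaped like the cell feed output. The rows are invented, and the selection rule is only one plausible way a downstream step might prefer unformatted numbers; it is an assumption, not the logic of `gs_reshape_cellfeed()`.

```r
# Invented rows mimicking the new cell feed columns (all character)
cells <- data.frame(
  cell          = c("A2", "B2", "C2"),
  value         = c("Asia", "$1,234.50", "2469"),     # text as displayed
  input_value   = c("Asia", "1234.5", "=R[0]C[-1]*2"), # what was typed
  numeric_value = c(NA, "1234.5", "2469.0"),           # unformatted number
  stringsAsFactors = FALSE
)

# One possible rule (an assumption): use the unformatted number when the
# API supplies one, otherwise fall back to the displayed text
cells$unformatted <- ifelse(is.na(cells$numeric_value),
                            cells$value, cells$numeric_value)

# Formula cells are those whose input starts with "="
cells$is_formula <- grepl("^=", cells$input_value)
print(cells)
```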
