Skip to content

.data pronoun speedup #5731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 29, 2024
Merged

.data pronoun speedup #5731

merged 2 commits into from
Feb 29, 2024

Conversation

teunbrand
Copy link
Collaborator

@teunbrand teunbrand commented Feb 29, 2024

This PR aims to fix #5730.

Briefly, using the .data pronoun caused the plot to slow down by a non-trivial amount of time. This PR rectifies this by considering the pronoun as 'non-data'. To explain a bit further, here follows the diagnosis.

unrowname() throws errors with pronouns:

x <- rlang::eval_tidy(quote(.data), mtcars)
print(x)
#> <pronoun>

ggplot2:::unrowname(x)
#> Error in `ggplot2:::unrowname()`:
#> ! Can only remove rownames from <data.frame> and <matrix> objects.

Created on 2024-02-29 with reprex v2.1.0

And unrowname() is used to compare to see if a part of an expression evaluates to the plot data here:

ggplot2/R/aes.R

Lines 437 to 440 in bc3e401

tryCatch({
data_eval <- eval_tidy(x[[2]], data, env)
identical(unrowname(data_eval), unrowname(data))
}, error = function(err) FALSE)

As this is in a tryCatch() block, the errors that unrowname() throws for pronouns are silenced. However, formatting the error message is still relatively expensive.

In this PR, we circumvent this problem by recognising that the .data pronoun is not the same as the regular plot data, thereby bypassing the check in the lines above.

As a benchmark, we see that typical evaluation and with pronoun are in the same range with this PR:

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

df <- mtcars
df$name <- rownames(mtcars)

p <- ggplot(df) +
  geom_point() +
  geom_text() +
  facet_grid(gear ~ cyl)

standard <- p + aes(x = mpg, y = disp, label = name, colour = factor(cyl))
pronoun  <- p + aes(
  x = .data[["mpg"]], 
  y = .data[["disp"]], 
  label = .data[["name"]],
  colour = factor(.data[["cyl"]])
)

bench::mark(
  ggplot_build(standard),
  ggplot_build(pronoun),
  check = FALSE, min_iterations = 10
)
#> # A tibble: 2 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(standard)   67.4ms   68.4ms      14.7    6.17MB    14.7 
#> 2 ggplot_build(pronoun)      68ms   69.4ms      14.4    1.96MB     9.61

Created on 2024-02-29 with reprex v2.1.0

For reference, this is the same benchmark result, but on the current main branch:

#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 ggplot_build(standard)   67.5ms   71.4ms     13.8     6.17MB     6.91
#> 2 ggplot_build(pronoun)     175ms  180.1ms      5.52    4.48MB     5.52

@@ -430,7 +430,7 @@ alternative_aes_extract_usage <- function(x) {
}

extract_target_is_likely_data <- function(x, data, env) {
if (!is.name(x[[2]])) {
if (!is.name(x[[2]]) || identical(x[[2]], quote(.data))) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't find anything in rlang like is_pronoun() so we only escape .data here and not .env. I don't think .env is very common.

@teunbrand teunbrand requested a review from thomasp85 February 29, 2024 15:33
Copy link
Member

@thomasp85 thomasp85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This is perhaps the most efficient diff in terms of character changed to performance gained :-)

@teunbrand teunbrand merged commit 01018d3 into tidyverse:main Feb 29, 2024
@teunbrand teunbrand deleted the pronoun_speedup branch February 29, 2024 15:53
@aphalo
Copy link
Contributor

aphalo commented Feb 29, 2024

Thanks! You were extremely fast at fixing this!
Best wishes,
Pedro.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.data pronouns slow down plots
3 participants