Skip to content

No plyr #3013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 13, 2018
Merged

No plyr #3013

merged 13 commits into from
Dec 13, 2018

Conversation

thomasp85
Copy link
Member

This PR writes out all ply dependencies from ggplot2. The following strategies has been used:

**ply

Except for ddply all use of **ply functions has been substituted with base equivalents. ddply has been substituted with the new dapply function that mimics its API and behaviour. It is ~40% faster in normal use by avoiding the as.quoted call.

Utility functions

Use of split_indices, summarise, and colwise has been substituted with base equivalent approaches.

quickdf has been substituted with the already existing new_data_frame

empty is now exported so the vignette doesn't need to depend on the plyr version

defaults, unrowname, rename, revalue, id, count, join.keys, as.quoted, round_any, and rbind.fill has all been reimplemented. In a few circumstances they have been lifted as-is from plyr, but often they have been rewritten to reduce complexity by only targeting use in ggplot2. All reimplementations are either on par or slightly faster than the plyr counterparts

Currently, a selection of reimplementations has been exported as internal functions to make it easier for extension developers to write out plyr as well.

All rewritten functions are currently collected in compat-plyr.R but I think it makes sense to distribute them into files guided by their use rather than keep them there for historical reasons.

There are no unit tests yet but these will come once we have agreed on the implementations and approaches.

@thomasp85 thomasp85 requested a review from hadley November 23, 2018 11:23
@yutannihilation
Copy link
Member

r-devel complains about plyr in tests:

checking for unstated dependencies in ‘tests’ ... WARNING
'::' or ':::' import not declared from: ‘plyr’

(https://travis-ci.org/tidyverse/ggplot2/jobs/459483350#L1765)

It seems you need to replace two more plyr things:

  • ddply() in tests/testthat/helper-plot-data.r
  • arrange() in tests/testthat/test-fortify.r

@@ -0,0 +1,334 @@
# Adds missing elements to a vector from a default vector
defaults <- function(x, y) c(x, y[setdiff(names(y), names(x))])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this basically modify_list?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is more or less the opposite... in set operation terms modify_list() overwrites the intersection of elements, while defaults adds the complement elements

Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to not export any of these internal functions at this point - I don't want to lock us into an API that we don't want to support long time.

@thomasp85
Copy link
Member Author

I've unexported the plyr substitutes... Unless there are other comments @hadley I'll merge and wave goodbye to plyr

@hadley
Copy link
Member

hadley commented Dec 13, 2018

Sounds good.

@thomasp85 thomasp85 merged commit c84d9a0 into tidyverse:master Dec 13, 2018
@jcheng5
Copy link

jcheng5 commented Dec 20, 2018

Thank you for this @thomasp85! I was profiling one of my ggplots to try to speed it up, and noticed a bunch of time spent in plyr, but when I came to look for the source code on master I saw that you'd made this change. Just by installing from master, the time dropped from 1020ms to 720ms!

image

@thomasp85
Copy link
Member Author

Great to hear, though a lot of the performance improvements is probably due to text dimension caching implemented in #2993...

You should get an additional speedup by installing the development version of gtable

@lock
Copy link

lock bot commented Jul 1, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jul 1, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants