Skip to content

Allow passing a function as data #1486

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from
Closed

Conversation

thomasp85
Copy link
Member

This PR will make it possible to supply a function as the data argument to the layer function. The function will be applied to the global data of the plot and the result will be used in the layer.

The change is proposed to give extension developers just a tad more power when building custom features. Currently (to my knowledge) data can only be intercepted by a compute_* in a stat, and modifications have already been applied at this point.

My personal need for this feature is that I'm developing a ggplot2 extension library for dealing with graph/network data, and this type of data is not really fitting for a single data.frame as normally accepted in ggplot2. My current solution (which involves this PR) is to convert the graph into two data.frames: one describing the nodes and one describing the edges. The edge data.frame will be added as an attribute to the node data.frame. On calls to edge specific geoms the layer will get a function extracting the edges, passed as the data argument ensuring that it is the edge data that is being made available to the stats and geoms.

The changes are quite minor and does not change ggplot2 behaviour unless a function is passed to the layer. In that case the call to fortify is pushed down from layer() to map_layout() where the function is called on the global data (plot_data).

This will make it possible to pass a function as the data argument that will modify the global data for the layer. It is indeed an esoteric use case, but for ggplot2 extension developers it is the only way to intercept the global data prior to passing it through the relevant Stat object.

Furthermore the change is really minor and cheap. If you dislike the approach please consider another way for extension builders to modify the data prior to Stat...
@thomasp85
Copy link
Member Author

Have just found a problem with the use of facets. Will ned to make some changes

@thomasp85
Copy link
Member Author

It has been fixed now. The function is now applied during ggplot_build as part of data extraction from the layers


if (!is.function(data)) {
data <- fortify(data)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just add fortify.function <- function(x, data) x ?

@thomasp85
Copy link
Member Author

@hadley The documentation stated that fortify might become deprecated in the future, so I chose to just ignore that approach. If fortify will remain I can definitely work around by creating fortify methods...

@@ -22,7 +22,13 @@ ggplot_build <- function(plot) {
}

layers <- plot$layers
layer_data <- lapply(layers, function(y) y$data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the right place either - if you wanted to preserve this basic approach it would be better to add a get_data() method (probably needs a better name) to the layer object.

@hadley
Copy link
Member

hadley commented Jan 25, 2016

Fortify is here for now, so you should use it. If it changes in the future, the unit tests should pick it up.

On that note, you also need some unit tests ;)

@thomasp85
Copy link
Member Author

I'll use fortify then. Ignore this PR.

Ah, yes - the unit tests. They will happen eventually, but I needed to make sure that I had the general structure in place. Also, they tend to make me loose interest in a project due to their dreadful nature :-) Having published mainly through Bioconductor I've been forced to use it though (which is definitely a good thing)

@thomasp85 thomasp85 closed this Jan 25, 2016
@thomasp85
Copy link
Member Author

... And ggraph is a week old so I think I'm entitled to doing the fun stuff for now :-)

@thomasp85
Copy link
Member Author

Reopening. fortify is called in layer() and doesn't have access to global data (as far as I can see). This PR is specifically meant for passing a function that is called on the global data... You had me confused with all this talk of fortify :-)

There will need to be some changes somewhere in ggplot2 for this to work. Should I redo the PR in the style of a get_data method to Layer as you proposed?

@thomasp85 thomasp85 reopened this Jan 25, 2016
@hadley
Copy link
Member

hadley commented Jan 25, 2016

Yeah - I'd recommend a new PR with:

  • fortify.function
  • Layer$get_data()
  • Some unit tests :)

@hadley hadley closed this Jan 25, 2016
@thomasp85
Copy link
Member Author

Having slept on it I have some reservations with the approach (no - not the unit tests:-). But it is ultimately up to you.

The promise of fortify is to turn data into a tabular format. By cheating and using it to let a function slip through we are making the use case of fortifymore confusing.

I kind feel that passing a function is a beefed up version of passing waiver and therefore it should be resolved at the same place (which was why I started out with map_layout (but this didn't work because it comes too late for facetting). It seems strange to have a Layer$get_data() that will return a waiver, but on the other hand it would potentially result in a performance hit to expand waiver data at the same time as function data.

I would personally find it most clear if function data was handled by a specialised function right after layer_data had been extracted (a simple lapply that checks whether elements are functions and if so calls the function on plot_data).

Again - it is ultimately up to you. Just some concerns over the clarity of the implementation.

@hadley
Copy link
Member

hadley commented Jan 26, 2016

I don't feel that strongly about the fortify issue. I do think it makes sense to have a function that returns the data from the layer, however it is stored (either directly, as a waiver, or via a function) - and layer_data <- lapply(layers, function(y) y$data) is crying out for more generality.

@thomasp85 thomasp85 mentioned this pull request Jan 28, 2016
@lock
Copy link

lock bot commented Jan 18, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Jan 18, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants