Access to layer information during build stages #3175

Closed
@thomasp85

I'm opening this issue as a more general venue for the discussions going on in #3170 (which fixes #3116). The reason is that in my attempt to implement #3062 I've run into more or less the same issues, but from the facet side, and I think it is time to deal with this in a more general way.

The problem: Throughout ggplot2's code base the layer data is the sole representative of the layer. It is not assumed that the layer has any other information of interest to share. This makes for clean code: we don't need to pass the layer objects around, so we get a nice separation of responsibility. But it also means that the layer cannot tell the outside world about any special property it has. Generally this has not been a problem, as any specifics have been relevant to the geom and stat exclusively (part of the internals of the layer).

One solution: Take the approach given in #3170 and generalise it, so that any function receiving layer data also receives a layer_param object, and formalise how this object is set up (probably within setup_layer()). This will require sweeping changes to many methods across many classes (internally and in extension packages), so it's certainly less attractive because of that. On the other hand, it will result in an architecture that is consistent (using a params object). We may be able to modify the ggproto dispatch to quietly ignore unknown arguments in order to avoid breaking extension packages, but that would remove the safety of early failure when arguments are misspelled in code.
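To make the trade-off in the last sentence concrete, here is a minimal base R sketch of a dispatch that quietly ignores unknown arguments. It does not use ggproto and is not how ggplot2 dispatches today; `call_with_known_args()`, `old_method()`, `new_method()` and the `example_flag` parameter are hypothetical names made up for illustration.

```r
# Hypothetical helper, not part of ggplot2: call `fn` with only the
# arguments it declares, silently dropping everything else.
call_with_known_args <- function(fn, args) {
  accepted <- names(formals(fn))
  # A function that takes `...` can receive every argument unchanged.
  if (!"..." %in% accepted) {
    args <- args[names(args) %in% accepted]
  }
  do.call(fn, args)
}

# An "old" extension method that knows nothing about layer_params.
old_method <- function(data, scales) nrow(data)

# An updated method that opts in to the new argument.
new_method <- function(data, scales, layer_params) layer_params$example_flag

args <- list(
  data = data.frame(x = 1:3),
  scales = list(),
  layer_params = list(example_flag = TRUE)
)

old_result <- call_with_known_args(old_method, args)  # layer_params is dropped
new_result <- call_with_known_args(new_method, args)  # layer_params is passed on

# The cost: a misspelled argument ("scale" instead of "scales") is dropped
# as well, so the call succeeds instead of failing early.
typo_result <- call_with_known_args(
  old_method,
  list(data = data.frame(x = 1:3), scale = list())
)
```

The old method keeps working without modification, but the last call shows what is lost: the typo goes unnoticed because the filtered argument is simply discarded.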

Another solution: Augment the data object with layer param attributes. This would allow everything to continue to work as before, but will require internal changes to make sure the data keeps the information as it is passed around. A layer_param attribute could hold all the relevant information that we would otherwise pass around as a layer_param argument. This will undoubtedly result in some uglier code, though helper functions could probably tidy most of it up. The worst part is probably that we would now diverge into two different ways of passing information between objects.
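A rough base R sketch of what the attribute approach and its helper functions might look like. The attribute name `layer_params`, the helpers `set_layer_params()`, `get_layer_params()` and `keep_layer_params()`, and the `example_flag` parameter are all hypothetical; nothing here is existing ggplot2 API.

```r
# Hypothetical helpers, not ggplot2 API.
set_layer_params <- function(data, params) {
  attr(data, "layer_params") <- params
  data
}

get_layer_params <- function(data) {
  attr(data, "layer_params", exact = TRUE)
}

# Apply `fn` to the data and re-attach the params afterwards, in case
# `fn` returned an object without the attribute.
keep_layer_params <- function(data, fn, ...) {
  params <- get_layer_params(data)
  set_layer_params(fn(data, ...), params)
}

layer_data <- set_layer_params(
  data.frame(x = 1:3, y = c(2, 4, 6)),
  list(example_flag = TRUE)
)

# A transformation that builds a fresh data frame loses the attribute.
double_y <- function(d) data.frame(x = d$x, y = d$y * 2)

rebuilt <- double_y(layer_data)
lost <- is.null(get_layer_params(rebuilt))

# Wrapping the same transformation keeps the information attached.
kept <- keep_layer_params(layer_data, double_y)
```

This is the housekeeping mentioned above: every step that rebuilds the data has to go through something like `keep_layer_params()`, otherwise the information is silently lost.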

A kind of middle ground: Each layer's data could simply get an attribute (or an additional column) giving the index of the layer in the layer stack. The facets, for example, could then collect layer information from all the layers during setup and take care of matching that information with the data itself if needed. I'm not sure there's any benefit to this other than avoiding putting an arbitrary amount of information in the data attributes. It would certainly result in even more housekeeping code than option two.
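The index-based variant could be sketched along these lines in base R. The plain lists below stand in for real layer and facet objects, and the `layer_index` attribute, `lookup_params()` and `example_flag` are hypothetical names used only for illustration.

```r
# Stand-ins for the layer stack: each "layer" only carries its params here.
layers <- list(
  list(params = list(example_flag = TRUE)),
  list(params = list(example_flag = FALSE))
)

# One data frame per layer, in the same order as the layer stack.
layer_data <- list(
  data.frame(x = 1:2),
  data.frame(x = 3:5)
)

# Tag each data frame with the position of its layer in the stack.
layer_data <- Map(
  function(d, i) {
    attr(d, "layer_index") <- i
    d
  },
  layer_data,
  seq_along(layer_data)
)

# During setup, the consumer (e.g. a facet) collects the params of every layer.
collected <- lapply(layers, function(layer) layer$params)

# Later, given only the data, it matches the data back to its layer's params.
lookup_params <- function(data, collected) {
  collected[[attr(data, "layer_index", exact = TRUE)]]
}

second_params <- lookup_params(layer_data[[2]], collected)
```

Only a single integer travels with the data, but the consumer now has to keep the collected list around and do the matching itself.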

I'm sure there are solutions I haven't considered, so I'd be happy to have a broad discussion about how we want to solve this. I'm certain this is something we want to solve in general in order for development to move forward. As an aside (I can't find the issues right now), it would also allow layers to take additional params needed by extensions without triggering the unknown argument warning; I remember ggplotly requesting this, at least.

@hadley, @karawoo, @clauswilke, @yutannihilation looking forward to your input
