Description
Problem: ggplot2 supports data.frame, but many other data structures could benefit from ggplot2 functionality, for example DataFrame, matrix, dgCMatrix, DelayedMatrix, and SparseMatrix. Many of these classes support as.data.frame() and can easily be converted into a data.frame. However, having to do this in every ggplot2 call quickly becomes repetitive.
Suggested solution: The default fortify() method, ggplot2:::fortify.default(), could simply try to call as.data.frame() on the supplied object. This would make ggplot() work directly on any object that supports as.data.frame() (e.g. DataFrame, matrix, dgCMatrix, DelayedMatrix, SparseMatrix).
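To make the suggestion concrete, here is a minimal base-R sketch of the fallback logic. The function name fortify_fallback() is hypothetical and is not part of ggplot2; it only illustrates what a more permissive default method might do, assuming that a failed or non-data.frame coercion should still raise an informative error.

```r
# Hypothetical helper (NOT ggplot2 API): illustrates a default fortify-like
# method that falls back on as.data.frame().
fortify_fallback <- function(model, ...) {
  # Data frames pass through unchanged.
  if (is.data.frame(model)) {
    return(model)
  }
  # Otherwise attempt coercion. as.data.frame() dispatches on the class of
  # `model`, so any class that provides a method is handled here.
  out <- tryCatch(
    as.data.frame(model, ...),
    error = function(e) NULL
  )
  # Fail clearly if coercion errored or did not yield a data.frame.
  if (!is.data.frame(out)) {
    stop(
      "`data` must be a <data.frame>, or an object coercible by ",
      "`as.data.frame()`, not an object of class <", class(model)[1], ">.",
      call. = FALSE
    )
  }
  out
}

# A matrix has an as.data.frame() method, so it is converted.
m <- matrix(1:6, ncol = 2, dimnames = list(NULL, c("x", "y")))
df <- fortify_fallback(m)
str(df)
```

With logic along these lines, objects without a usable as.data.frame() method (a function, for instance) would still fail with an explicit message instead of being passed on silently.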
Let's load libraries and example data
library(S4Vectors)
library(ggplot2)
data(iris)
Usual data.frame works as expected:
ggplot(iris, aes(x=Sepal.Width, y=Sepal.Length)) + geom_point()
DataFrame does not work, and the ggplot() call throws an error:
ggplot(DataFrame(iris), aes(x=Sepal.Width, y=Sepal.Length)) + geom_point()
Error in fortify():
! data must be a <data.frame>, or an object coercible by fortify(), not a object.
At the moment our default workaround has been to always wrap DataFrame objects in as.data.frame(), like:
ggplot(as.data.frame(d), aes(x, y)) + geom_point()
There was an initial discussion about the challenges this adds to teaching standard plotting in ecosystems that rely on classes that are closely related to data.frame but are not data.frame themselves.
The initial thought was to solve this in S4Vectors (for the DataFrame class), see the PR by @kevinrue; @hpages then pointed out the more general solution described above.
-> Could ggplot2 add the as.data.frame() check to extend support to formats other than data.frame? If yes, we might be able to provide a PR.