-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rcopy handles formular symbols #524
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #524 +/- ##
==========================================
- Coverage 80.40% 77.87% -2.53%
==========================================
Files 26 26
Lines 1684 1686 +2
==========================================
- Hits 1354 1313 -41
- Misses 330 373 +43 ☔ View full report in Codecov by Sentry. |
@ararslan thoughts on the behavior here? Essentially it transforms all non-standard evaluation in R (here: implicit conversion of a bare symbol into an expression whose value is that symbol) into a Julia expression. I have the distinct feeling that there are some edge cases where this behavior could be quite wonky. @schlichtanders if @ararslan is onboard, we would still need tests of this behaviour. ❤️ |
@@ -38,6 +38,11 @@ end | |||
|
|||
function rcopy(::Type{FormulaTerm}, l::Ptr{LangSxp}) | |||
expr = rcopy(Expr, l) | |||
# special case of a simple variable, like in aes(x, y) | |||
if Meta.isexpr(expr, :call) && length(expr.args) == 2 && expr.args[1] == :~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means expr
is Expr(:call, :~, x)
, i.e. :(~x)
, for some x
. So this is keeping one-sided formulas from being passed to @formula
, which can't handle one-sided formulas anyway. What I don't understand though is the accompanying comment here: "special case of a simple variable, like in aes(x, y)." I don't see how that example applies to this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was the concrete example which fails for me in current RCall. Having an expression like aes(x, y)
runs into this formula problem that simple symbols like x
and y
in this case are not yet supported
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in this example, aes(x, y)
appears inside of an R formula? Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more like aes(x, y)
is "lowered" to aes_(~x, ~y)
via non standard evaluation because R uses one sided formulas to represent symbols/expressions. (The lowering rule is part of the function definition in ggplot, so it's maybe better described as an implicit macro.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to be honest, I don't actually know how RCall works internally. Is the following more or less accurate for this situation? The R surface syntax is aes(x, y)
which gets evaluated in R as aes_(~x, ~y)
, so by the time this expression gets to Julia, Julia sees Expr(:call, :aes_, Expr(:call, :~, :x), Expr(:call, :~, :y))
and tries to turn it into aes_(::FormulaTerm, ::FormulaTerm)
, but that fails because FormulaTerm
doesn't support one-sided formulas. Assuming that's accurate, this change seems safe to me and is arguably a bugfix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR, the behavior is now
julia> using RCall
R> install.packages("^C
R> library("ggplot2")
julia> R"aes(x, y"^C
R> df <- data.frame(x=1,y=1)
julia> R"~x"
RObject{LangSxp}
~x
julia> rcopy(R"~x")
:(~x)
julia> rcopy(R"~y")
:(~y)
julia> rcopy(R"aes(df, x, y)")
OrderedCollections.OrderedDict{Symbol, Any} with 3 entries:
:x => :(~df)
:y => :(~x)
Symbol("") => :(~y)
julia> R"aes(df, x, y)"
RObject{VecSxp}
Aesthetic mapping:
* `x` -> `df`
* `y` -> `x`
* `` -> `y`
julia> rcopy(R"y~x")
FormulaTerm
Response:
y(unknown)
Predictors:
x(unknown)
julia> rcopy(R"~x+y")
:(~(x + y))
So this PR converts a one-sided formula into a symbol representation of that formula. @ararslan do you have an opinion on whether it would be better to strip the tilde? I'm a bit conflicted myself and could argue for either direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any examples in mind of when stripping the tilde would not be desirable? If not then stripping it seems sensible to me, but otherwise would there still be a way to recover the original behavior in those cases after this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if there are cases where you would want to preserve a raw one-sided formula as an explicit one-sided formula. I can't think of any:
- you usually don't want to evaluate in R something already copied into Julia without thinking about how to convert it back. In other words, preserving the original literal expression isn't really valuable
- if you want to create the closest Julia analog to a one-sided formula (either 0 or the left-hand side or a
MatrixTerm
), then the tilde needs to be stripped to programmatically manipulate the symbols anyway.
Going the other way: the tilde in one-sided formulae in R is generally analogous to Expr
in Julia, so it seems having it be Expr
in Julia eliminates the need for the tilde.
I guess we can solve the dilemma by adding a note to the documentation that one-sided formulae in R are converted to Expr
in Julia without a tilde as a representation of unevaluated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate the explanation, that all sounds good to me. I think the documentation aspect will be particularly important in case anybody ends up surprised by this behavior for whatever reason, however unlikely. Consider me on board with this change!
@schlichtanders tests for this can go in https://github.com/JuliaInterop/RCall.jl/blob/master/test/convert/formula.jl. You can just use my examples from the discussion thread as tests. 😄 If you want to include the ggplot2 bit, then you should do something like
before the any tests depending on ggplot2. This will only install ggplot2 if it's not already present on the system/ |
@@ -38,6 +38,11 @@ end | |||
|
|||
function rcopy(::Type{FormulaTerm}, l::Ptr{LangSxp}) | |||
expr = rcopy(Expr, l) | |||
# special case of a simple variable, like in aes(x, y) | |||
if Meta.isexpr(expr, :call) && length(expr.args) == 2 && expr.args[1] == :~ | |||
return expr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return expr | |
return expr.args[2] |
(so that we drop the tilde)
if Meta.isexpr(expr, :call) && length(expr.args) == 2 && expr.args[1] == :~ | ||
return expr | ||
end | ||
# complex formular |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# complex formular | |
# complex formula |
@@ -38,6 +38,11 @@ end | |||
|
|||
function rcopy(::Type{FormulaTerm}, l::Ptr{LangSxp}) | |||
expr = rcopy(Expr, l) | |||
# special case of a simple variable, like in aes(x, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# special case of a simple variable, like in aes(x, y) | |
# special case of a non-standard evaluation / a "bare" expression / (implicit) one-sided formula in R, like in `aes(x, y)` |
fixes #507