[SPARK-9316] [SPARKR] Add support for filtering using [ (synonym for filter / select) #8394

Closed
felixcheung wants to merge 5 commits from felixcheung/rsubset

Conversation

felixcheung
Member

Add support for

   df[df$name == "Smith", c(1,2)]
   df[df$age %in% c(19, 30), 1:2]
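
For illustration only, the two subsetting forms above can be mimicked in plain R with a toy S4 class. This is a sketch, not SparkR's implementation: `ToyFrame` is a hypothetical wrapper around a local data.frame, and its row index is an eager logical vector, whereas the method added in this PR dispatches on a lazy `Column` expression.

```r
library(methods)

# Hypothetical stand-in for a SparkR DataFrame: a thin S4 wrapper
# around a local data.frame (assumption, not part of the patch).
setClass("ToyFrame", representation(data = "data.frame"))

# tf$name returns the underlying column so it can be compared directly.
setMethod("$", signature(x = "ToyFrame"),
          function(x, name) x@data[[name]])

# Row filter plus optional column selection: i is a logical row mask,
# j (if supplied) picks columns by position or name.
setMethod("[", signature(x = "ToyFrame", i = "logical"),
          function(x, i, j, ..., drop = TRUE) {
            rows <- x@data[i, , drop = FALSE]
            if (!missing(j)) rows <- rows[, j, drop = FALSE]
            new("ToyFrame", data = rows)
          })

tf <- new("ToyFrame", data = data.frame(
  name = c("Smith", "Jones", "Lee"),
  age  = c(30, 19, 45),
  city = c("A", "B", "C"),
  stringsAsFactors = FALSE
))

res1 <- tf[tf$name == "Smith", c(1, 2)]    # one row, columns name and age
res2 <- tf[tf$age %in% c(19, 30), 1:2]     # two rows, columns name and age
```

In both calls the first index filters rows and the second selects columns, which is the "filter / select" pairing named in the PR title.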

@shivaram

@shivaram
Contributor

Jenkins, ok to test

@shivaram
Contributor

cc @sun-rui @falaki

@SparkQA

SparkQA commented Aug 24, 2015

Test build #41458 has finished for PR 8394 at commit 16e0ba3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -945,6 +947,19 @@ setMethod("[", signature(x = "DataFrame", i = "missing"),
            select(x, j)
          })

#' @rdname select
setMethod("[", signature(x = "DataFrame", i = "Column"),
          function(x, i, j, ...) {
Contributor

... seems unnecessary? It lets users pass parameters that won't be used, which may be confusing.

Member Author

Are you referring to ...?
You are probably right... I'm more or less following line 938 above, since this is an i-not-missing overload of the operator/method.

setMethod("[", signature(x = "DataFrame", i = "missing"),
          function(x, i, j, ...) {

setMethod("[", signature(x = "DataFrame", i = "Column"),
          function(x, i, j, ...) {

Perhaps we should take both out? But that would be a breaking change.

Contributor

So we need the ... to match the signature of [ in base R. From the help page you can see that the signatures look like:

     x[i]
     x[i, j, ... , drop = TRUE]
     x[[i, exact = TRUE]]
     x[[i, j, ..., exact = TRUE]]
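
As a minimal sketch of what matching that signature looks like for an S4 method, the example below repeats the formals `x, i, j, ..., drop` in the method definition. The class `Cols` is hypothetical and only stands in for an S4 class such as DataFrame; the `i = "missing"` case mirrors the column-only form `x[, j]` from the diff above.

```r
library(methods)

# Hypothetical toy class wrapping a local data.frame (assumption for illustration).
setClass("Cols", representation(data = "data.frame"))

# The method's formals repeat those of the generic: x, i, j, ..., drop.
# With i = "missing", x[, j] reduces to a pure column selection.
setMethod("[", signature(x = "Cols", i = "missing"),
          function(x, i, j, ..., drop = TRUE) {
            new("Cols", data = x@data[, j, drop = FALSE])
          })

cols <- new("Cols", data = data.frame(a = 1:3, b = 4:6, c = 7:9))
picked <- cols[, c("a", "c")]
print(names(picked@data))

# Inspect the argument names that S4 methods for "[" are matched against.
print(names(formals(getGeneric("["))))
```

Keeping `...` in the method, even when it goes unused, keeps the definition aligned with the generic rather than relying on how `setMethod` reconciles differing argument lists.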

Contributor

Also, FYI, the Matrix package, which also uses S4, is a good example for things like this:
https://github.com/rforge/matrix/blob/master/pkg/Matrix/R/Matrix.R#L457

@sun-rui
Contributor

sun-rui commented Aug 25, 2015

LGTM. Some minor comments.

@SparkQA

SparkQA commented Aug 25, 2015

Test build #41517 has finished for PR 8394 at commit 3578ba2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Contributor

Thanks @felixcheung -- this LGTM as well

@SparkQA

SparkQA commented Aug 26, 2015

Test build #41599 has finished for PR 8394 at commit 0fde873.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Contributor

Thanks @felixcheung -- Merging this

asfgit pushed a commit that referenced this pull request Aug 26, 2015
…r filter / select)

Add support for
```
   df[df$name == "Smith", c(1,2)]
   df[df$age %in% c(19, 30), 1:2]
```

shivaram

Author: felixcheung <felixcheung_m@hotmail.com>

Closes #8394 from felixcheung/rsubset.

(cherry picked from commit 75d4773)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
@asfgit asfgit closed this in 75d4773 Aug 26, 2015