Upsampling a data.frame with dtplyr fails #403

Open
johnF-moore opened this issue Dec 8, 2022 · 1 comment
Labels: bug (an unexpected problem or unintended behavior)

Comments

@johnF-moore

I want to upsample a data.frame with replacement, going from, say, 10 rows to 15 rows. This is easy with dplyr::slice_sample() and with data.table; however, dtplyr does not return a data.frame larger than the input: it returns 10 rows instead of 15.

As a result, adding lazy_dt() to my dplyr workflow changes the result.

library(dplyr, warn.conflicts = FALSE)
library(dtplyr, warn.conflicts = FALSE)
library(data.table, warn.conflicts = FALSE)

small_iris <- head(iris, n = 10) 
sample_size <- 15

## Upsampling with replacement using data.table works
upsampled_dt <- as.data.table(small_iris)[sample(.N, sample_size, replace = TRUE)]
nrow(upsampled_dt)
#> [1] 15

## Upsampling with replacement using dplyr works
upsampled_dplyr <- small_iris %>% 
  slice_sample(n = sample_size,
               replace = TRUE) 
nrow(upsampled_dplyr)
#> [1] 15

## Upsampling with replacement using dtplyr fails: only 10 rows come back, with no error
upsampled_dtplyr <- small_iris %>% 
  lazy_dt() %>% 
  slice_sample(n = sample_size, 
               replace = TRUE) %>% 
  as_tibble()
nrow(upsampled_dtplyr)
#> [1] 10
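
A possible stopgap, sketched under the assumption that collecting before sampling is acceptable for the workflow: call show_query() to inspect the data.table expression dtplyr generates for the failing pipeline, then move as_tibble() ahead of slice_sample() so that dplyr, which handles this case correctly above, does the sampling. The name upsampled_workaround is only illustrative. This sidesteps the problem rather than fixing it, and it gives up lazy evaluation for the sampling step.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr, warn.conflicts = FALSE)

small_iris <- head(iris, n = 10)
sample_size <- 15

## Print the data.table call dtplyr builds for the failing pipeline
small_iris %>%
  lazy_dt() %>%
  slice_sample(n = sample_size, replace = TRUE) %>%
  show_query()

## Workaround (assumption: collecting first is acceptable):
## materialise the lazy table, then let dplyr do the sampling
upsampled_workaround <- small_iris %>%
  lazy_dt() %>%
  as_tibble() %>%
  slice_sample(n = sample_size, replace = TRUE)
nrow(upsampled_workaround)
#> [1] 15
```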

Am I missing anything? I'm using R 4.0.3 on Ubuntu with the development version of dtplyr.

@eutwt
Collaborator

eutwt commented Dec 12, 2022

Thanks for the report! This is a bug.

@markfairbanks added the bug label (an unexpected problem or unintended behavior) on Dec 19, 2022