Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine x and ... in min, max, range #1946

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

brookslogan
Copy link

Resolves #1372.

Only call vec_c(x, ...) if dots are nonempty, in order to stay fast in the empty case.

Only call `vec_c` if dots are nonempty to stay fast in the empty case.
@brookslogan
Copy link
Author

Sorry for the rewrite, fixed some commit metadata.

@brookslogan
Copy link
Author

brookslogan commented Aug 3, 2024

Some notes:

  • I think the main use case here is taking the min or max of two [or maybe more] size-1 vectors. If it's okay to deviate from mirroring base R behavior a bit, it would probably help to add some warnings or errors if ... is used in other contexts, such as when it seems like pmin or pmax might be desired instead. Happy to make these changes; please let me know if they belong.
  • The current check failure ("! Could not solve package dependencies:") doesn't seem like it's related to the changes in this PR.
  • There doesn't seem to be a notable performance impact with the current approach, but the empty-dots optimization does seem important.
Small-vctr benchmarks:
# main
x <- new_vctr(c(1,2,4)); bench::mark(min(x), min_iterations = 1e5, max_iterations = 1e5)
#> # A tibble: 1 × 13
#>   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result    
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>    
#> 1 min(x)       50.6µs   55.9µs    17597.        NA     8.98 99949    51      5.68s <vctrs_vc>
#> # ℹ 3 more variables: memory <list>, time <list>, gc <list>

# this PR
x <- new_vctr(c(1,2,4)); bench::mark(min(x), min_iterations = 1e5, max_iterations = 1e5)
#> # A tibble: 1 × 13
#>   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result    
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>    
#> 1 min(x)       53.3µs   55.9µs    17608.        NA     8.98 99949    51      5.68s <vctrs_vc>
#> # ℹ 3 more variables: memory <list>, time <list>, gc <list>

# Without if-empty-dots optimization:
x <- new_vctr(c(1,2,4)); bench::mark(min(x), min_iterations = 1e5, max_iterations = 1e5)
#> # A tibble: 1 × 13
#>   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result    
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>    
#> 1 min(x)        187µs    207µs     4737.        NA     10.4 99780   220      21.1s <vctrs_vc>
#> # ℹ 3 more variables: memory <list>, time <list>, gc <list>

# Testing with short integer vectors (ALTREP seqs or not) also yielded comparable performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Additional arguments in min() and max() methods for vctrs_vctr
1 participant