Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[[ by group performance #4655

Merged
merged 10 commits into from
Sep 24, 2020
Merged

[[ by group performance #4655

merged 10 commits into from
Sep 24, 2020

Conversation

ColeMiller1
Copy link
Contributor

@ColeMiller1 ColeMiller1 commented Aug 1, 2020

Closes #4646
Closes #4658
Closes #4424

library(data.table)

set.seed(123L)
n = 500L
n_nested = 40L
dt = data.table(ordered_id = seq_len(n),
                unordered_id = sample(n),
                value = replicate(n, data.table(val1 = sample(n_nested)), simplify = FALSE))

bench::mark(
    dt[, value[[1L]], by = ordered_id]
    , dt[, value[[1L]], by = unordered_id]
    , check = FALSE
    , time_unit = "ms" 
)

## pre-regression (1.12.8)

##  expression                             min median `itr/sec` mem_alloc
##  <bch:expr>                           <dbl>  <dbl>     <dbl> <bch:byt>
##1 dt[, value[[1L]], by = ordered_id]   0.755  0.838     1131.     401KB
##2 dt[, value[[1L]], by = unordered_id] 0.796  0.892     1033.     409KB

## Post-regression (CRAN 1.13.0)

#> # A tibble: 2 x 6
#>   expression                               min  median `itr/sec` mem_alloc
#>   <bch:expr>                             <dbl>   <dbl>     <dbl> <bch:byt>
#> 1 dt[, value[[1L]], by = ordered_id]   102.    126.         8.02    53.7MB
#> 2 dt[, value[[1L]], by = unordered_id]   0.630   0.669   1261.     409.2KB
#> # ... with 1 more variable: `gc/sec` <dbl>

## after patch

##  expression                             min median `itr/sec` mem_alloc `gc/sec`
##  <bch:expr>                           <dbl>  <dbl>     <dbl> <bch:byt>    <dbl>
##1 dt[, value[[1L]], by = ordered_id]   0.634  0.691     1358.     401KB     15.0
##2 dt[, value[[1L]], by = unordered_id] 0.651  0.778     1187.     409KB     12.8

The root cause appears to be in memrecycle which is in assign.c. I can get similar performance by commenting out the duplicate line but then there are many errors during the testing. I'm including this in case someone is able to move forward with fixing the root cause.

data.table/src/assign.c

Lines 707 to 723 in 9d3b920

if (isNewList(source)) {
// A list() column; i.e. target is a column of pointers to SEXPs rather than the more common case of numbers in an atomic vector.
// If any item within the list is NAMED then take a fresh copy. So far this has occurred from dogroups.c when
// j returns .BY or similar specials as-is within a list(). Those specials are static inside
// dogroups so if we don't copy now the last value written to them by dogroups becomes repeated in the result;
// i.e. the wrong result.
// If source is itself recycled later (many list() column items pointing to the same object) we are ok with that
// since we now have a fresh copy and := will not assign with a list() column's cell value; := only changes the
// SEXP pointed to.
// If source is already not named (because j already created a fresh unnamed vector within a list()) we don't want to
// duplicate unnecessarily, hence checking for named rather than duplicating always.
// See #481, #1270 and tests 1341.* fail without this copy.
// ********** This might go away now that we copy properly in dogroups.c **********
if (anyNamed(source)) {
source = PROTECT(copyAsPlain(source)); protecti++;
}
}

@ColeMiller1

This comment has been minimized.

@mattdowle mattdowle added this to the 1.13.1 milestone Aug 4, 2020
@codecov
Copy link

codecov bot commented Aug 4, 2020

Codecov Report

Merging #4655 into master will decrease coverage by 0.17%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4655      +/-   ##
==========================================
- Coverage   99.61%   99.44%   -0.18%     
==========================================
  Files          73       73              
  Lines       14250    14536     +286     
==========================================
+ Hits        14195    14455     +260     
- Misses         55       81      +26     
Impacted Files Coverage Δ
src/assign.c 99.70% <ø> (-0.15%) ⬇️
src/dogroups.c 99.66% <100.00%> (-0.34%) ⬇️
src/fastmean.c 96.82% <0.00%> (-3.18%) ⬇️
src/utils.c 96.03% <0.00%> (-2.10%) ⬇️
src/uniqlist.c 98.25% <0.00%> (-1.75%) ⬇️
src/between.c 99.21% <0.00%> (-0.79%) ⬇️
src/subset.c 99.50% <0.00%> (-0.50%) ⬇️
src/frollR.c 99.53% <0.00%> (-0.47%) ⬇️
src/forder.c 99.61% <0.00%> (-0.39%) ⬇️
src/gsumm.c 99.61% <0.00%> (-0.39%) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be6c1fc...ec041f7. Read the comment docs.

@mattdowle
Copy link
Member

mattdowle commented Sep 24, 2020

I followed Cole's first inclination; i.e. removed the copy from memrecycle as he did which seems right to me too.
That fixes the slowdown problem in subsetting but leaves us with a different problem to solve in a different way. All the test fails were to do with static vectors not being copied which is why that copy was there in memrecycle. So I moved that copy into dogroups.c to be close to where it's relevant. dogroups now uses negative truelength to mark the specials (.SD, .BY, .I). Just before assigning jval to a list column, it checks to see if any specials are present and copies them if so. Although it's still relatively complex, it's cleaner in the sense that the complexity is now focused and localized to dogroups.c. This should avoid unexpected effects like the slowdown occurring due to the copy in the central place (memrecycle) which is now gone.
Extensive comments and example added to new anySpecialStatic() at the top of dogroups.c which replaces the anyNamed() in assign.c.
Other tidy ups:
Removed origSDnrow as it was equal to maxGrpSize, adding a check that that's true.
origIlen could also be removed because I was allocated with maxGrpSize anyway.

@jangorecki
Copy link
Member

this will also resolve #4424

@jangorecki jangorecki linked an issue Sep 24, 2020 that may be closed by this pull request
@mattdowle
Copy link
Member

mattdowle commented Sep 24, 2020

You mean .NGRP? That's constant across groups so doesn't need to be copied. But tested anyway (2153.3) and referred to in comment in anySpecialStatic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants