Closed
Labels
High, bug, joins
Description
Please consider the following:
> dt <- data.table(id=rep(letters[1:2], 2), var = rnorm(4), key="id")
> dt
id var
1: a 0.9609685
2: a 0.1432707
3: b 1.1276582
4: b 0.8051821
> dt[letters[1:3], list(var)]
Error in vecseq(f__, len__, if (allow.cartesian) NULL else as.integer(max(nrow(x), :
Join results in 5 rows; more than 4 = max(nrow(x),nrow(i)). Check for duplicate key values in i, each of which join to the same group in x over and over again. [...]
> dt[letters[1:3], list(var), by=.EACHI]
id var
1: a 0.9609685
2: a 0.1432707
3: b 1.1276582
4: b 0.8051821
5: c NA
The second join also results in 5 rows. Shouldn't both joins above behave consistently (perhaps both like the second)?
I also wonder what the concept behind allow.cartesian is: is it simply 1) that the output must not have more rows than max(nrow(x),nrow(i)), or 2) to guard against duplicate key values in i?
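To make the arithmetic in the error message concrete, here is a minimal base-R sketch of the row count. It is only an illustration of the check as the message words it, not data.table's internal implementation. The names x_keys, i_keys, matches, join_rows and limit are made up for this example, and it assumes unmatched rows of i each contribute one NA row (as the `c NA` row above suggests).

```r
# Key column of dt (a, b, a, b) and the values passed as i.
x_keys <- rep(letters[1:2], 2)
i_keys <- letters[1:3]

# Number of rows in x matching each value of i: a = 2, b = 2, c = 0.
matches <- vapply(i_keys, function(k) sum(x_keys == k), integer(1))

# Assumption: an unmatched i value still yields one row (filled with NA),
# so every i value contributes at least one row to the result.
join_rows <- sum(pmax(matches, 1L))               # 2 + 2 + 1 = 5

# Threshold quoted in the error message: max(nrow(x), nrow(i)).
limit <- max(length(x_keys), length(i_keys))      # max(4, 3) = 4

# TRUE here means the check would fire, even though i has no duplicate keys.
join_rows > limit
```

Under these assumptions the count exceeds the limit only because of the extra NA row for "c", and not because of duplicate key values in i, which is why the error looks surprising here.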
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
# data.table installed today from github