Closed
Description
With automatic indexing turned on, if you have a data.table, `DT`, with a character column called `col` and you use syntax like `DT[col %in% list("A", "B")]` and there is no `"A"` in `col`, you will get no results even if `"B"` is in `col`. Here is an example:
require(data.table)
op <- options(datatable.auto.index=TRUE)
DT <- as.data.table(cars)
DT[speed %in% list(4, 1)] # This works because 4 is in `speed`
#    speed dist
# 1:     4    2
# 2:     4   10
DT <- as.data.table(cars)
DT[speed %in% list(1, 4)] # Incorrectly gives no results because 1 is not in `speed`
#Empty data.table (0 rows) of 2 cols: speed,dist
If you turn automatic indexing off, you get the correct results:
options(datatable.auto.index=FALSE)
DT[speed %in% list(1, 4)]
#    speed dist
# 1:     4    2
# 2:     4   10
options(op)
p.s. I know it's awkward to use `col %in% list()` instead of `col %in% c()`.
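For comparison, here is a minimal base R sketch (no data.table involved) of the behaviour I would expect the subset to match: outside of `[.data.table`, `%in%` gives the same answer whether the right-hand side is a list or the equivalent atomic vector. The variable names `from_list` and `from_vec` are only for illustration.

```r
# Base R only: `cars` ships with the datasets package.
speed <- cars$speed

# %in% with a list on the right-hand side
from_list <- speed %in% list(1, 4)

# %in% with the equivalent atomic vector
from_vec <- speed %in% c(1, 4)

# Both forms flag the same rows, regardless of element order
identical(from_list, from_vec)
sum(from_list)
```

Flattening the list first (for example with `unlist()`) before using it inside `DT[...]` may therefore be a usable workaround, but I have not verified that against the auto-indexing code path.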
p.p.s. The posting guidelines almost imply that I have the ability to add a label to my issue, but I don't think I do.
sessionInfo()
# R version 3.1.1 (2014-07-10)
# Platform: x86_64-pc-linux-gnu (64-bit)
#
# locale:
# [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C
# [5] LC_MONETARY=C LC_MESSAGES=C LC_PAPER=C LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] data.table_1.9.5
#
# loaded via a namespace (and not attached):
# [1] chron_2.3-45 tools_3.1.1