subset.data.table returns wrong columns when there are duplicated column names #891

@jjzz

Description

library(data.table)
# data.table 1.9.4  For help type: ?data.table
d <- data.table(rep(3, 3))                # one column, auto-named V1
d <- data.table(rep(1, 3), rep(2, 3), d)  # adds V1 and V2 in front, so V1 appears twice
d
#    V1 V2 V1
#1:  1  2  3
#2:  1  2  3
#3:  1  2  3
subset(d, TRUE, c(3, 2))  # expected: the 3rd column (all 3s), then the 2nd (all 2s)
#    V1 V2
#1:  1  2
#2:  1  2
#3:  1  2

When I select columns c(3, 2) with subset(), I expect to get the 3rd and 2nd columns, but subset() returns the 1st and 2nd columns instead. This seems to be caused by the duplicated column name V1 in the data.table d.

I don't know the internal logic for handling duplicated column names, but I would expect that when columns are selected by position, the column names (duplicated or not) should not matter. Is that right?

Or perhaps there should be a warning when column names are duplicated?
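
For anyone hitting the same thing, here is a minimal sketch of a possible workaround. It is not a fix in data.table itself: it assumes that checking for duplicated names up front and making them unique is acceptable for your data. The variable names has_dups and res are only for illustration.

```r
library(data.table)

# Rebuild the table from the report above.
d <- data.table(rep(3, 3))
d <- data.table(rep(1, 3), rep(2, 3), d)

# Detect duplicated column names before selecting by position.
has_dups <- anyDuplicated(names(d)) > 0

# Workaround (an assumption, not data.table's own behaviour):
# rename the columns in place so that every name is unique.
if (has_dups) setnames(d, make.unique(names(d)))

# With unique names, positions 3 and 2 no longer collide on a shared name.
res <- subset(d, TRUE, c(3, 2))
print(res)
```

The same anyDuplicated(names(d)) check could also be used to raise your own warning before subsetting, which is roughly what the last question above is asking for.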
