You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[BUGZILLA #17890] transform(df, check.names=FALSE, ...) mangles names of df if it replaces a column of df. If it only adds columns it does not mangle df's names.
#7063
Open
github-actionsbot opened this issue
Aug 21, 2020
· 4 comments
Actually, it is only by accident that check.names does anything in a transform() call - it is not a documented argument, so only "works" because transform() internally calls data.frame().
This in turn means that certain variable names are not usable (check.names, fix.empty.names, etc.) because they have a special meaning.
There is an internal split between "new" and "old" variables, specifically
if (any(matched)) {
`_data`[inx[matched]] <- e[matched]
`_data` <- data.frame(`_data`)
}
and the call to data.frame() here is not going to see check.names= so will clean up names.
This is very old code and data frame semantics have changed considerably in the meantime. In particular, I don't know whether the call to data.frame is still needed. The code suggests that the previous assignment might have turned _data into something that wasn't a data frame. Similarly, the split between matched and non-matched computed values might not be required anymore, since it now actually works to do things like
It might be worth rewriting transform() at some point, but it is an old tool, mainly used interactively, and you are probably better off using within():
Yes, within() is nicer about names than transform(). We noticed this issue because at least one CRAN package relied on passing check.names=FALSE to transform:
I don't think the effect of check.names=FALSE should depend on whether transform() replaces a column in or just adds columns to the data.frame.
df <- data.frame(check.names=FALSE,
A-1
=11:12, B=21:22)names(df)
#[1] "A-1" "B"
names(transform(df, check.names=FALSE, B=B+5,
C-3
=21:22))#[1] "A.1" "B" "C-3"
names(transform(df, check.names=FALSE,
C-3
=21:22))#[1] "A-1" "B" "C-3"
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_4.0.2
METADATA
The text was updated successfully, but these errors were encountered: