Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUGZILLA #17890] transform(df, check.names=FALSE, ...) mangles names of df if it replaces a column of df. If it only adds columns it does not mangle df's names. #7063

Open
github-actions bot opened this issue Aug 21, 2020 · 4 comments

Comments

@github-actions
Copy link

I don't think the effect of check.names=FALSE should depend on whether transform() replaces a column in or just adds columns to the data.frame.

df <- data.frame(check.names=FALSE, A-1=11:12, B=21:22)
names(df)
#[1] "A-1" "B"
names(transform(df, check.names=FALSE, B=B+5, C-3=21:22))
#[1] "A.1" "B" "C-3"
names(transform(df, check.names=FALSE, C-3=21:22))
#[1] "A-1" "B" "C-3"

sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 16299)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.0.2


METADATA

  • Bug author - Bill Dunlap
  • Creation time - 2020-08-20 17:21:09 UTC
  • Bugzilla link
  • Status - UNCONFIRMED
  • Alias - None
  • Component - Analyses
  • Version - R 4.0.x
  • Hardware - All All
  • Importance - P5 normal
  • Assignee - R-core
  • URL -
@github-actions
Copy link
Author

Actually, it is only by accident that check.names does anything in a transform() call - it is not a documented argument, so only "works" because transform() internally calls data.frame().

This in turn means that certain variable names are not usable (check.names, fix.empty.names, etc.) because they have a special meaning.

There is an internal split between "new" and "old" variables, specifically

if (any(matched)) {
    `_data`[inx[matched]] <- e[matched]
    `_data` <- data.frame(`_data`)
}

and the call to data.frame() here is not going to see check.names= so will clean up names.

This is very old code and data frame semantics have changed considerably in the meantime. In particular, I don't know whether the call to data.frame is still needed. The code suggests that the previous assignment might have turned _data into something that wasn't a data frame. Similarly, the split between matched and non-matched computed values might not be required anymore, since it now actually works to do things like

airquality[c("Ozone", "foo", "bar")] <- list(Ozone=log(airquality$Ozone), foo=1, bar=2)

with recycling and all.

It might be worth rewriting transform() at some point, but it is an old tool, mainly used interactively, and you are probably better off using within():

names(within(df, {B=B+5; `C-3`=21:22}))

[1] "A-1" "B" "C-3"


METADATA

  • Comment author - Peter Dalgaard
  • Timestamp - 2020-08-21 10:07:38 UTC

@github-actions
Copy link
Author

NA


METADATA

  • Comment author - elin.waring
  • Timestamp - 2020-08-21 13:01:04 UTC

@github-actions
Copy link
Author

Yes, within() is nicer about names than transform(). We noticed this issue because at least one CRAN package relied on passing check.names=FALSE to transform:

./rapportools/R/tables.R: tbl <- transform(tbl, % = N / base::sum(N) * 100, check.names = FALSE) # add percentage
./rapportools/R/tables.R: tbl <- transform(tbl, Cumul. N = cumsum(N), Cumul. % = cumsum(%), check.names = FALSE) # add cumulatives


METADATA

  • Comment author - Bill Dunlap
  • Timestamp - 2020-08-21 16:41:50 UTC

@github-actions
Copy link
Author

NA


METADATA

  • Comment author - elin.waring
  • Timestamp - 2020-08-21 18:36:56 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

0 participants