-
Notifications
You must be signed in to change notification settings - Fork 982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
melt needs a variable.value parameter. #2065
Comments
Complete your set of columns and it'll work:
Sure, you still need to fill in the year values, but that's another issue that's already been filed, I guess. |
I've found that the melt step in my solution could be a big problem when working with medium-sized datatables. |
I guess you can get the years' column with
or also using data from your CJ. |
@franknarf1 , any direct way to get this?
instead of this: (your method)
without reordeing the result explicitily? |
Well, there's
However, doing this by group is a lot less efficient, probably. |
There are no plans to add new arguments to |
hi the missing input columns in this melt is similar to #4027 and is fixed by #4720 if you use the new function in nc package (which uses data.table internally), not merged into master yet but you can use it via: remotes::install_github(c("Rdatatable/data.table@fix4027", "tdhock/nc@multiple-fill"))
#> Skipping install of 'data.table' from a github remote, the SHA1 (4c5810c2) has not changed since last install.
#> Use `force = TRUE` to force installation
#> Skipping install of 'nc' from a github remote, the SHA1 (11b61f8e) has not changed since last install.
#> Use `force = TRUE` to force installation
library(data.table)
mydata <- data.table(ID=1:5, ZA_2001=1:5, ZA_2002=5:1, BB_2001=rep(3,5), BB_2002=rep(6,5), CC_2007=6:2)
(mydata.tall <- nc::capture_melt_multiple(
mydata,
column=".*",
"_",
year="[0-9]+", as.integer,
fill=TRUE))
#> ID year BB CC ZA
#> 1: 1 2001 3 NA 1
#> 2: 2 2001 3 NA 2
#> 3: 3 2001 3 NA 3
#> 4: 4 2001 3 NA 4
#> 5: 5 2001 3 NA 5
#> 6: 1 2002 6 NA 5
#> 7: 2 2002 6 NA 4
#> 8: 3 2002 6 NA 3
#> 9: 4 2002 6 NA 2
#> 10: 5 2002 6 NA 1
#> 11: 1 2007 NA 6 NA
#> 12: 2 2007 NA 5 NA
#> 13: 3 2007 NA 4 NA
#> 14: 4 2007 NA 3 NA
#> 15: 5 2007 NA 2 NA
mydata.tall[order(ID, year)]
#> ID year BB CC ZA
#> 1: 1 2001 3 NA 1
#> 2: 1 2002 6 NA 5
#> 3: 1 2007 NA 6 NA
#> 4: 2 2001 3 NA 2
#> 5: 2 2002 6 NA 4
#> 6: 2 2007 NA 5 NA
#> 7: 3 2001 3 NA 3
#> 8: 3 2002 6 NA 3
#> 9: 3 2007 NA 4 NA
#> 10: 4 2001 3 NA 4
#> 11: 4 2002 6 NA 2
#> 12: 4 2007 NA 3 NA
#> 13: 5 2001 3 NA 5
#> 14: 5 2002 6 NA 1
#> 15: 5 2007 NA 2 NA |
here is another solution using only data.table::melt, remotes::install_github("Rdatatable/data.table@fix4027")
#> Skipping install of 'data.table' from a github remote, the SHA1 (4c5810c2) has not changed since last install.
#> Use `force = TRUE` to force installation
library(data.table)
mydata <- data.table(ID=1:5, ZA_2001=1:5, ZA_2002=5:1, BB_2001=rep(3,5), BB_2002=rep(6,5), CC_2007=6:2)
(mydata.tall <- melt(mydata, measure.vars=list(
ZA=c("ZA_2001", "ZA_2002", NA),
BB=c("BB_2001", "BB_2002", NA),
CC=c(NA, NA, "CC_2007"))))
#> ID variable ZA BB CC
#> 1: 1 1 1 3 NA
#> 2: 2 1 2 3 NA
#> 3: 3 1 3 3 NA
#> 4: 4 1 4 3 NA
#> 5: 5 1 5 3 NA
#> 6: 1 2 5 6 NA
#> 7: 2 2 4 6 NA
#> 8: 3 2 3 6 NA
#> 9: 4 2 2 6 NA
#> 10: 5 2 1 6 NA
#> 11: 1 3 NA NA 6
#> 12: 2 3 NA NA 5
#> 13: 3 3 NA NA 4
#> 14: 4 3 NA NA 3
#> 15: 5 3 NA NA 2
mydata.tall[, measure := c(2001, 2002, 2007)[variable] ]
mydata.tall[order(ID, measure), .(
ID, measure, ZA, BB, CC)]
#> ID measure ZA BB CC
#> 1: 1 2001 1 3 NA
#> 2: 1 2002 5 6 NA
#> 3: 1 2007 NA NA 6
#> 4: 2 2001 2 3 NA
#> 5: 2 2002 4 6 NA
#> 6: 2 2007 NA NA 5
#> 7: 3 2001 3 3 NA
#> 8: 3 2002 3 6 NA
#> 9: 3 2007 NA NA 4
#> 10: 4 2001 4 3 NA
#> 11: 4 2002 2 6 NA
#> 12: 4 2007 NA NA 3
#> 13: 5 2001 5 3 NA
#> 14: 5 2002 1 6 NA
#> 15: 5 2007 NA NA 2 |
pure data.table solution using #4731 remotes::install_github("Rdatatable/data.table@melt-custom-variable")
#> Skipping install of 'data.table' from a github remote, the SHA1 (c02fa9e8) has not changed since last install.
#> Use `force = TRUE` to force installation
library(data.table)
mydata <- data.table(ID=1:5, ZA_2001=1:5, ZA_2002=5:1, BB_2001=rep(3,5), BB_2002=rep(6,5), CC_2007=6:2)
melt(mydata, measure.vars=measure(value.name, year))
#> ID year ZA BB CC
#> 1: 1 2001 1 3 NA
#> 2: 2 2001 2 3 NA
#> 3: 3 2001 3 3 NA
#> 4: 4 2001 4 3 NA
#> 5: 5 2001 5 3 NA
#> 6: 1 2002 5 6 NA
#> 7: 2 2002 4 6 NA
#> 8: 3 2002 3 6 NA
#> 9: 4 2002 2 6 NA
#> 10: 5 2002 1 6 NA
#> 11: 1 2007 NA NA 6
#> 12: 2 2007 NA NA 5
#> 13: 3 2007 NA NA 4
#> 14: 4 2007 NA NA 3
#> 15: 5 2007 NA NA 2 |
In order to reshape data.tables such as this one:
to this one
and keep the original variables' order I need to do:
But it needs too many lines of code, a lot of memory, and it forces type conversions.
It would be great if we could just do it using melt:
(or using regex instead).
But it doesn't work, it only produces 10 rows, and it would need an option to specify where to get the year from.
something like
variable.values=sub(".*_","",names(mydata) )[-1]
or even better a syntax such as the splitstackshape package:
Reshape(mydata,id.vars="ID",var.stubs =... ,sep="_")
it would be great.
The text was updated successfully, but these errors were encountered: