Closed
Opened on Apr 9, 2021
It would be great if haven could directly use R's tagged (and labelled) NAs, like tagged_na('a'), as the equivalent of Stata missing codes (e.g. .a).
So far, this happens:
library(haven)
path <- tempfile()
tags <- tagged_na('a', 'b')
x1 <- labelled(
  c(1, 2, 1, tags[1], tags[2]),
  c("Did not know" = tags[1], "Refused to answer" = tags[2])
)
example <- data.frame(x1)
write_dta(example, path)
str(read_dta(path)$x1)
# dbl+lbl [1:5] 1, 2, 1, NA, NA
# @ format.stata: chr "%10.0g"
# @ labels : Named num [1:2] -2.15e+09 -2.15e+09
# ..- attr(*, "names")= chr [1:2] "Did not know" "Refused to answer"
When the file created by R is opened in Stata, the missing codes show up as simple missings, . (the equivalent of NA in R), rather than as .a and .b:
. codebook x1
---------------------------------------------------------------------------------------------------
x1 (unlabeled)
---------------------------------------------------------------------------------------------------
                  type:  numeric (double)
                 label:  x1, but 2 nonmissing values are not labeled

                 range:  [1,2]                        units:  1
         unique values:  2                        missing .:  2/5

            tabulation:  Freq.   Numeric  Label
                             2         1
                             1         2
                             2         .
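The same check can probably be done from the R side without opening Stata. The following is only a minimal sketch, assuming the haven functions already used above plus na_tag(); the names tags_before, tags_after and x1_back are made up for illustration. It compares the tags before writing with the tags after reading the file back, and the result will depend on the installed haven version:

```r
library(haven)

# Same example vector as above: two tagged NAs with value labels
tags <- tagged_na("a", "b")
x1 <- labelled(
  c(1, 2, 1, tags[1], tags[2]),
  c("Did not know" = tags[1], "Refused to answer" = tags[2])
)

# Tags in memory before writing (NA for values that carry no tag)
tags_before <- na_tag(x1)

# Round trip through a temporary .dta file
path <- tempfile(fileext = ".dta")
write_dta(data.frame(x1), path)
x1_back <- read_dta(path)$x1

# Tags after reading the file back
tags_after <- na_tag(x1_back)

# TRUE only if every tag on the values survived the round trip
identical(tags_before, tags_after)

# Same comparison for the tags attached to the value labels
identical(
  na_tag(attr(x1, "labels")),
  na_tag(attr(x1_back, "labels"))
)
```

If both comparisons return TRUE, the tagged NAs and their labels made it through write_dta() and read_dta() intact; FALSE would correspond to the behaviour reported here.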
I do not know whether readstat supports this; perhaps @evanmiller could confirm. Whether it ends up in haven, readstat, or both, this would be an epic feature. As far as I am aware, no other conversion software can manage that.