read_spss and write_spss: missing labels for strings #409

Closed

Description

When importing string variables via read_spss, missing labels (and sometimes value labels too) behave inconsistently depending on the width of the variable. Missing and value labels for string variables are not that uncommon, at least in educational large-scale assessments. I used the current GitHub version of haven.

The sav file I attached looks like this (using SPSS 22.0.0.1 or SPSS 25):

test1.zip

[Screenshots: spss_variables, spss_data]

Importing the file results in the following attributes at the variable level:

rawDat <- haven::read_spss(file = "N:/spss/test1.sav", user_na = TRUE)
lapply(rawDat, attributes)
#> $v1
#> $v1$na_values
#> [1] 99
#> 
#> $v1$class
#> [1] "haven_labelled_spss" "haven_labelled"     
#> 
#> $v1$format.spss
#> [1] "F8.2"
#> 
#> $v1$labels
#> one 
#>   1 
#> 
#> 
#> $v2
#> $v2$na_values
#> [1] NA
#> 
#> $v2$class
#> [1] "haven_labelled_spss" "haven_labelled"     
#> 
#> $v2$format.spss
#> [1] "A8"
#> 
#> $v2$labels
#> one 
#> "1" 
#> 
#> 
#> $v3
#> $v3$format.spss
#> [1] "A9"
#> 
#> $v3$class
#> [1] "haven_labelled"
#> 
#> $v3$labels
#> one 
#> "1" 
#> 
#> 
#> $v4
#> $v4$format.spss
#> [1] "A21"

Created on 2018-10-01 by the reprex package (v0.2.1)
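
In the output above, v2 (A8) has na_values of NA, v3 (A9) has lost both the haven_labelled_spss class and na_values, and v4 (A21) shows only format.spss. To make that pattern easier to see across many variables, a small base-R helper can tabulate the relevant attributes per variable. This is only a sketch: summarise_spss_attrs is a hypothetical helper name, not part of haven, and the demo data frame mimics the attributes reported above (with placeholder data values) instead of reading the attached file.

```r
# Hypothetical helper (not part of haven): one row per variable, showing
# which SPSS metadata is present after import.
summarise_spss_attrs <- function(dat) {
  get_attr <- function(x, name) attr(x, name, exact = TRUE)
  data.frame(
    variable = names(dat),
    format = vapply(dat, function(x) {
      f <- get_attr(x, "format.spss")
      if (is.null(f)) NA_character_ else as.character(f)
    }, character(1), USE.NAMES = FALSE),
    labelled_spss = vapply(dat, function(x) inherits(x, "haven_labelled_spss"),
                           logical(1), USE.NAMES = FALSE),
    has_na_values = vapply(dat, function(x) {
      na <- get_attr(x, "na_values")
      # An na_values attribute that is entirely NA counts as "not present"
      !is.null(na) && !all(is.na(na))
    }, logical(1), USE.NAMES = FALSE),
    has_labels = vapply(dat, function(x) !is.null(get_attr(x, "labels")),
                        logical(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Demo: attributes copied from the output above for v2 and v3.
# The data values "1" and "9" are placeholders, not the real file contents.
demo <- data.frame(v2 = c("1", "9"), v3 = c("1", "9"), stringsAsFactors = FALSE)
attributes(demo$v2) <- list(na_values = NA_character_,
                            class = c("haven_labelled_spss", "haven_labelled"),
                            format.spss = "A8", labels = c(one = "1"))
attributes(demo$v3) <- list(format.spss = "A9", class = "haven_labelled",
                            labels = c(one = "1"))

attr_summary <- summarise_spss_attrs(demo)
print(attr_summary)
```

With the real file, the same call would be summarise_spss_attrs(rawDat).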

When writing to sav, missing labels for string variables are also dropped:

# set up data frame
df <- data.frame(v1 = c(1, 99), v2 = c("aa", "99"), stringsAsFactors = FALSE)
attributes(df$v1) <- list(na_values = 99, class = c("haven_labelled_spss", "haven_labelled"), format.spss = "F8.2", labels = c(one = 1))
attributes(df$v2) <- list(na_values = "99", class = c("haven_labelled_spss", "haven_labelled"), format.spss = "A2", labels = c(sth = "aa"))
# write sav
haven::write_sav(df, path = "N:/spss/test2.sav")
# read sav
spssDF <- haven::read_spss(file = "N:/spss/test2.sav", user_na = TRUE)
lapply(spssDF, attributes)
#> $v1
#> $v1$na_values
#> [1] 99
#> 
#> $v1$class
#> [1] "haven_labelled_spss" "haven_labelled"     
#> 
#> $v1$format.spss
#> [1] "F8.2"
#> 
#> $v1$labels
#> one 
#>   1 
#> 
#> 
#> $v2
#> $v2$format.spss
#> [1] "A2"
#> 
#> $v2$class
#> [1] "haven_labelled"
#> 
#> $v2$labels
#>  sth 
#> "aa"

Created on 2018-10-01 by the reprex package (v0.2.1)
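
The loss in the round trip above can also be checked programmatically by comparing the attributes before writing and after reading. The following is a minimal base-R sketch; lost_attributes is a hypothetical helper name, and the two demo data frames reproduce the attributes shown above rather than performing an actual write_sav/read_spss round trip.

```r
# Hypothetical helper: for each shared variable, list attribute names and
# classes that are present in `before` but missing in `after`.
lost_attributes <- function(before, after) {
  vars <- intersect(names(before), names(after))
  res <- lapply(vars, function(v) {
    list(
      attrs = setdiff(names(attributes(before[[v]])),
                      names(attributes(after[[v]]))),
      classes = setdiff(class(before[[v]]), class(after[[v]]))
    )
  })
  names(res) <- vars
  res
}

# "before": the data frame as set up prior to write_sav
before <- data.frame(v1 = c(1, 99), v2 = c("aa", "99"), stringsAsFactors = FALSE)
attributes(before$v1) <- list(na_values = 99,
                              class = c("haven_labelled_spss", "haven_labelled"),
                              format.spss = "F8.2", labels = c(one = 1))
attributes(before$v2) <- list(na_values = "99",
                              class = c("haven_labelled_spss", "haven_labelled"),
                              format.spss = "A2", labels = c(sth = "aa"))

# "after": the attributes reported once the file is read back
after <- before
attributes(after$v2) <- list(format.spss = "A2", class = "haven_labelled",
                             labels = c(sth = "aa"))

lost <- lost_attributes(before, after)
str(lost)
```

With the real objects, the call would be lost_attributes(df, spssDF).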

And the SPSS variable view looks like this:

[Screenshot: spss_test2]

Is there any way to import missing and value labels consistently from sav files to R?
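
One possible stopgap on the R side, assuming the missing codes are known from the codebook, is to re-declare them after import with haven::labelled_spss. This is only a sketch: restore_na_values is a hypothetical helper name, and it does not address the write_sav problem described above.

```r
# Hypothetical helper (not part of haven): re-declare user-defined missing
# values on an imported string variable, keeping its existing value labels.
restore_na_values <- function(x, na_values) {
  stopifnot(is.character(na_values))
  # Strip all attributes to get the bare character data
  vals <- unclass(x)
  attributes(vals) <- NULL
  stopifnot(is.character(vals))
  haven::labelled_spss(
    vals,
    labels = attr(x, "labels", exact = TRUE),
    na_values = na_values,
    label = attr(x, "label", exact = TRUE)
  )
}
```

For example, rawDat$v3 <- restore_na_values(rawDat$v3, "99") would mark "99" as user-missing again, assuming that is the intended missing code for that variable.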

Thank you!
