Importing from Stata: Encoding issues #71

Closed

Description

There seem to be encoding issues when importing with read_dta.
I'm working with version 0.2.0.9000. Non-ASCII characters, such as German umlauts (äöüß), come out broken. It looks like the import function treats them as UTF-8, but they are actually encoded as Latin-1 (or something similar). I can correct the strings by using

iconv(names(attr(*, "labels")), from="L1", to="UTF-8")
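
To make the workaround concrete, here is a minimal base-R sketch of the same idea applied to a single labelled vector. It assumes the file was written in Latin-1 (a Windows-1252 file would need a different `from` value). The helper name `fix_label_encoding` and the sample labels are hypothetical; they are not part of the package.

```r
# Hypothetical helper: re-encode the names of the "labels" attribute,
# assuming they were stored as Latin-1 bytes.
fix_label_encoding <- function(x, from = "latin1") {
  labs <- attr(x, "labels")
  if (!is.null(labs) && !is.null(names(labs))) {
    names(labs) <- iconv(names(labs), from = from, to = "UTF-8")
    attr(x, "labels") <- labs
  }
  x
}

# Simulate what the import yields: "Männlich" as raw Latin-1 bytes
# (0xe4 is "ä" in Latin-1), which is not valid UTF-8.
latin1_name <- rawToChar(as.raw(c(0x4d, 0xe4, 0x6e, 0x6e, 0x6c, 0x69, 0x63, 0x68)))
sex <- structure(c(1, 2, 1),
                 labels = structure(c(1, 2), names = c(latin1_name, "Weiblich")))

fixed <- fix_label_encoding(sex)

# Check whether the label names are valid UTF-8 before and after the fix
validUTF8(names(attr(sex, "labels")))
validUTF8(names(attr(fixed, "labels")))
```

For a whole imported data frame, the same helper could be applied column by column, for example with `df[] <- lapply(df, fix_label_encoding)`.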

PS: Unicode support was first introduced in Stata 14: http://www.stata.com/stata14/unicode/
PPS: I can provide a Stata 13 file containing umlauts. (I don't have access to Stata 14 yet.)
