readCSV fails for *.zip #469

Closed
@koperagen

Description

This file extension is treated in a special way: there's an isCompressed method, and depending on its result readCSV wraps the InputStream. But this doesn't work for *.zip, because the InputStream gets wrapped in a GZIPInputStream. Apparently it's also not enough to just wrap the InputStream in a different class: ZIP has a more complex structure, so you need to call ZipInputStream methods to position the stream at an entry before reading:

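import java.io.File
import java.util.zip.ZipInputStream
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.readCSV

// Workaround: open the archive and move to its first entry before reading.
val zipInputStream = ZipInputStream(
    File("data.csv.zip").inputStream(),
    Charsets.UTF_8
)
zipInputStream.nextEntry
val df1 = DataFrame.readCSV(zipInputStream)
zipInputStream.closeEntry()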
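
To illustrate the difference between the two formats, here is a minimal sketch of extension-based stream selection that uses only `java.util.zip`. The helper name `openPossiblyCompressed` is hypothetical and is not part of the DataFrame API; it also assumes the CSV is the first entry in the archive. It shows one way a reader could choose the wrapper, not how the library does or should implement it.

```kotlin
import java.io.File
import java.io.InputStream
import java.util.zip.GZIPInputStream
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Hypothetical helper (not a DataFrame function): picks a decompressing
// wrapper from the file extension.
fun openPossiblyCompressed(file: File): InputStream {
    val raw = file.inputStream()
    return try {
        when (file.extension.lowercase()) {
            // GZIP is a single compressed stream, so wrapping is enough.
            "gz" -> GZIPInputStream(raw)
            // ZIP is an archive of entries: the stream must be positioned
            // at an entry before any data can be read from it.
            "zip" -> ZipInputStream(raw, Charsets.UTF_8).also { zip ->
                requireNotNull(zip.nextEntry) { "ZIP archive ${file.name} has no entries" }
            }
            else -> raw
        }
    } catch (e: Exception) {
        // Don't leak the file handle if wrapping fails.
        raw.close()
        throw e
    }
}

fun main() {
    // Build a small archive so the example is self-contained.
    val zipFile = File.createTempFile("data.csv", ".zip")
    ZipOutputStream(zipFile.outputStream()).use { out ->
        out.putNextEntry(ZipEntry("data.csv"))
        out.write("a,b\n1,2\n".toByteArray())
        out.closeEntry()
    }

    // Reading stops at the end of the current entry.
    val text = openPossiblyCompressed(zipFile).use { it.readBytes().toString(Charsets.UTF_8) }
    println(text)
    zipFile.delete()
}
```

The returned stream could then be passed to `DataFrame.readCSV(stream)` in the same way as in the workaround above. Archives with several entries, or whose first entry is a directory, would need extra handling that this sketch leaves out.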

Labels: bug (Something isn't working), csv (CSV / delim related issues)
