Open
Opened on Nov 10, 2020
I am working with a series of large Parquet files (which I cannot share), and reading them fails with an unexpected error:
julia> DataFrame(load("current.parquet"))
ERROR: UndefRefError: access to undefined reference
The stacktrace is as follows:
Stacktrace:
[1] getproperty(::ParquetFiles.RCType276, ::Symbol) at ./Base.jl:33
[2] macro expansion at /home/tpoisot/.julia/packages/ParquetFiles/cLLFb/src/ParquetFiles.jl:48 [inlined]
[3] iterate(::ParquetFiles.ParquetNamedTupleIterator{NamedTuple{(:column, :names, :go, :here),Tuple{String,Int32,String,String,Int32,Int32,Int32,Int32,Int32}},ParquetFiles.RCType276}, ::Int64) at /home/tpoisot/.julia/packages/ParquetFiles/cLLFb/src/ParquetFiles.jl:39
[4] iterate at /home/tpoisot/.julia/packages/Tables/xHhzi/src/tofromdatavalues.jl:53 [inlined]
[5] iterate at ./iterators.jl:139 [inlined]
[6] buildcolumns(::Tables.Schema{(:column, :names, :go, :here),Tuple{String,Int32,String,String,Int32,Int32,Int32,Int32,Int32}}, ::Tables.IteratorWrapper{ParquetFiles.ParquetNamedTupleIterator{NamedTuple{(:column, :names, :go, :here),Tuple{String,Int32,String,String,Int32,Int32,Int32,Int32,Int32}},ParquetFiles.RCType276}}) at /home/tpoisot/.julia/packages/Tables/xHhzi/src/fallbacks.jl:127
[7] columns at /home/tpoisot/.julia/packages/Tables/xHhzi/src/fallbacks.jl:237 [inlined]
[8] DataFrame(::ParquetFiles.ParquetFile; copycols::Bool) at /home/tpoisot/.julia/packages/DataFrames/GtZ1l/src/other/tables.jl:43
[9] DataFrame(::ParquetFiles.ParquetFile) at /home/tpoisot/.julia/packages/DataFrames/GtZ1l/src/other/tables.jl:34
[10] top-level scope at REPL[30]:1
Interestingly, as far as I can tell, the entire file is loaded, but converting the result to a DataFrame or saving it to CSV throws the same error. My guess is that the last row somehow contains characters it should not.