StackOverflow with stackdf #1251
Comments
Works here. Are you on master of both DataFrames and Nulls? |
Yea, still the same issue:
julia> Pkg.update()
... skipping for brevity...
julia> Pkg.status("Nulls")
- Nulls 0.1.1 master
julia> Pkg.status("DataFrames")
- DataFrames 0.10.1+ master
julia> using DataFrames, CSV
INFO: Recompiling stale cache file /Users/Cameron/.julia/lib/v0.6/DataFrames.ji for module DataFrames.
INFO: Recompiling stale cache file /Users/Cameron/.julia/lib/v0.6/CSV.ji for module CSV.
julia> iris = CSV.read(joinpath(Pkg.dir("DataFrames"), "test/data/iris.csv"));
julia> d = stackdf(iris);
ERROR: StackOverflowError:
Stacktrace:
[1] zero(::Type{Any}) at /Users/Cameron/.julia/v0.6/Nulls/src/Nulls.jl:70 (repeats 80000 times) |
OK, that only happens when using CSV from master, because it happens when there is no numeric column in the data frame. MWE:
show(DataFrames.StackedVector(Any[]))
# Or:
show(DataFrames.StackedVector(Int[]))
EDIT: even better:
size(DataFrames.StackedVector(Int[]))
sum(map(length, DataFrames.StackedVector(Int[]).components)) |
OK, that's not new. With 0.10.1:
julia> stackdf(df)
ERROR: StackOverflowError:
Stacktrace:
[1] zero(::Type{Any}) at /home/milan/.julia/DataArrays/src/natype.jl:71 (repeats 80000 times)
We just need to ensure we always return 0 even when the array is empty. Something like this could do the trick:
Base.length(v::StackedVector) = mapreduce(length, +, 0, v.components) |
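For anyone following along, here is a minimal sketch of why seeding the reduction with an explicit 0 avoids the problem. It does not touch DataFrames itself: `ToyStackedVector` is a hypothetical stand-in, and only the field name `components` is borrowed from the thread. It also assumes Julia 1.x, where the initial value is passed as the `init` keyword rather than as the third positional argument used in the 0.6-era one-liner above.

```julia
# Hypothetical stand-in for DataFrames.StackedVector; only the field name
# `components` is taken from the thread.
struct ToyStackedVector
    components::Vector{Any}
end

# With an explicit initial value, the empty case never has to ask for
# zero(Any), which is where the reported stack overflow originated.
# Julia 1.x spelling: `init` keyword. Julia 0.6 spelling: positional `0`.
Base.length(v::ToyStackedVector) = mapreduce(length, +, v.components; init=0)

println(length(ToyStackedVector(Any[])))                   # no components at all
println(length(ToyStackedVector(Any[[1, 2], [3, 4, 5]])))  # two components, 2 + 3 elements
```

The same idea should apply to any reduction over a possibly empty `Vector{Any}`: give the reduction a starting value rather than relying on `zero` of the element type.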
I'm not sure this is an issue with
julia> sum(Any[])
ERROR: StackOverflowError:
Stacktrace:
[1] zero(::Type{Any}) at /Users/Cameron/.julia/v0.6/Nulls/src/Nulls.jl:70 (repeats 80000 times)
|
Base only provides `zero(::Type{T}) where T<:Number`, not `zero(::Type{T}) where T`, and our method results in StackOverflows when called with `Any` when it should return an error. See JuliaData/DataFrames.jl#1251
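To illustrate the mechanism described in that commit message, here is a rough sketch rather than the actual Nulls or Missings code. `safe_zero` is a hypothetical function invented for this example, and `Missing` (built into Julia 1.x) stands in for the `Null` type of the time. The point is that `Union{Any, Missing}` is simply `Any`, so a catch-all method that strips the missing part of a union also matches `Any` and calls itself forever unless it is guarded.

```julia
# Hypothetical helper, not part of Nulls, Missings, or DataFrames.
safe_zero(::Type{T}) where {T<:Number} = zero(T)
safe_zero(::Type{Missing}) = missing

function safe_zero(::Type{Union{T, Missing}}) where {T}
    # Union{Any, Missing} collapses to Any, so T binds to Any here.
    # Without this guard the method would recurse until the stack overflows.
    T === Any && throw(MethodError(safe_zero, (Any,)))
    return safe_zero(T)
end

println(safe_zero(Union{Int, Missing}))  # strips Missing, then uses the Number method

# Calling with Any now raises an error instead of overflowing the stack.
try
    safe_zero(Any)
catch err
    println(typeof(err))
end
```

Restricting the union method to `T<:Number` would be another way to get an error for `Any`; the explicit guard is used here only because it makes the failure mode visible.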
Implementing Base.length(v::StackedVector) = mapreduce(length, +, 0, v.components) fixes all but sum(map(length, DataFrames.StackedVector(Int[]))) and I'm not sure that we can do anything about this one because at that point we'll be using Base types (because the component Arrays won't be sum(map(length, DataFrames.StackedVector(Int[]).components)) Any ideas as to why
julia> length(DataFrames.StackedVector(Int[]))
0
julia> map(length, DataFrames.StackedVector(Int[]))
0-element Array{Any,1} |
We don't care about I don't see what you mean by "missing a value". The result of |
Closing because the original issue was resolved by JuliaData/Missings.jl#44 |
Maybe we should add a test case? |
Is this still an issue? |