Description
This issue is a question about the most recent release notes:
On Julia 1.7 or newer broadcasting assignment into an existing column of a data frame replaces it. Under Julia 1.6 or older it is an in place operation. (#3022)
I expected df.col .= v to broadcast and assign in place, but that is no longer the case in DataFrames.jl 1.4.
The recent release broke some of my code (I must have missed any deprecation warnings). A simple workaround was coltemp = df.col; coltemp .= v, but I don't understand the reason for the new behaviour. To me it makes DataFrame inconsistent with other containers in Julia, and I am left wondering why this inconsistency is wanted.
This issue equally applies to df[!, :x] .= v.
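For reference, here is a minimal sketch of the in-place options as I understand them on DataFrames.jl 1.4. The df[:, :x] .= v form is not from the release notes; it is my reading of the indexing docs, where the `:` row selector writes into the existing column. The values are made up for illustration.

```julia
using DataFrames

df = DataFrame(x = [1, 2, 3])
x = df.x                  # alias of the stored column (no copy)

# Row selector `:` broadcasts into the existing column in place.
df[:, :x] .= 5
@assert x === df.x        # still the same vector
@assert x == [5, 5, 5]

# Broadcasting into the alias directly (the workaround mentioned above).
x .= 7
@assert df.x == [7, 7, 7]

# In-place assignment cannot change the element type.
inexact = try
    df[:, :x] .= 1.5
    false
catch err
    err isa InexactError
end
@assert inexact
@assert eltype(df.x) == Int
```

Both forms keep the column identity and element type, which is the behaviour I had expected from df.x .= v.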
Compare:
julia> x = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> x .= 1.5
ERROR: InexactError: Int64(1.5)
Whereas
julia> df = DataFrame(x=[1, 2, 3])
3×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3
julia> x = df.x;
julia> df.x .= 1.5
3-element Vector{Float64}:
1.5
1.5
1.5
julia> x === df.x
false
As advertised, df.x .= 1.5 does not work in place but replaces the column, even with a new element type.
If I put the vector in any other container, say a Dict, NamedTuple or struct:
julia> dt = Dict(:x=>[1, 2, 3])
Dict{Symbol, Vector{Int64}} with 1 entry:
:x => [1, 2, 3]
julia> dt[:x] .= 1.5
ERROR: InexactError: Int64(1.5)
julia> nt = (x = [1, 2, 3],)
(x = [1, 2, 3],)
julia> nt.x .= 1.5
ERROR: InexactError: Int64(1.5)
julia> struct S
    x
end
julia> s = S([1, 2, 3])
S([1, 2, 3])
julia> s.x .= 1.5
ERROR: InexactError: Int64(1.5)
They all behave the same. But a DataFrame behaves differently. Why is that?
The docs state "Since df[!, :col] does not make a copy", which makes it unexpected to me that broadcasting into it would create a new column rather than modify the existing one.
For the use case of "create/replace column" we have df.x = v (akin to s.x = v or dict[:x] = v). Would there be any adverse side-effects of letting = broadcast scalars into new/replaced columns?
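To make the question concrete, here is a small sketch of how = and .= currently behave for this use case, as far as I can tell on DataFrames.jl 1.4. The column name y and the values are only for illustration.

```julia
using DataFrames

df = DataFrame(x = [1, 2, 3])

# Plain assignment of a vector replaces the column, as with s.x = v.
df.x = [1.5, 2.5, 3.5]
@assert eltype(df.x) == Float64

# Plain assignment of a scalar is rejected rather than broadcast.
scalar_rejected = try
    df.x = 1.5
    false
catch
    true
end
@assert scalar_rejected

# Creating or replacing a column from a scalar currently needs `.=` with `!`.
df[!, :y] .= 0
@assert df.y == [0, 0, 0]
```

So today the scalar case is the only one that requires .= for creating or replacing a column.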
I understand there was a decision a year ago (#2804) to make df.x .= v work like df[!, :x] .= v, but wouldn't a change to instead make df[!, :x] .= v work like df.x .= v then did (in place) have been more consistent with how containers in Julia typically work?