Description
This issue relates to the transformation dispatch mechanism, which doesn't recognize Functors as Functions, as discussed on Discourse.
I have a use case where I use Functors as pre-trained feature transformations. In that context, defining those structs as subtypes of Function doesn’t seem like a natural design choice.
Here’s a functor that applies learned normalization:
using DataFrames
using Statistics: mean, std
struct Normalizer
    μ
    σ
end
Normalizer(x::AbstractVector) = Normalizer(mean(x), std(x))
function (m::Normalizer)(x::Real)
    return (x - m.μ) / m.σ
end
function (m::Normalizer)(x::AbstractVector)
    return (x .- m.μ) ./ m.σ
end
df = DataFrame(:v1 => rand(5), :v2 => rand(5))
feat_names = names(df)
norms = map((feat) -> Normalizer(df[:, feat]), feat_names)
The following doesn’t work:
transform(df, feat_names .=> norms .=> feat_names)
ERROR: LoadError: ArgumentError: Unrecognized column selector: "v1" => (Normalizer(0.5407170762469404, 0.1599492895436335) => "v1")
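My reading of this error, assuming the transformation pairs are dispatched on `Base.Callable` (that is, `Union{Function, Type}`), is that a callable struct simply doesn't match. The sketch below uses only Base Julia; `Scale` is a hypothetical stand-in for `Normalizer`:

```julia
# Hypothetical functor, standing in for Normalizer.
struct Scale
    factor::Float64
end

# Make instances of Scale callable.
(s::Scale)(x) = s.factor * x

s = Scale(2.0)

s(3.0)                      # 6.0: it can be called like a function
s isa Function              # false: it is not a subtype of Function
s isa Base.Callable         # false: Base.Callable is Union{Function, Type}
(x -> s(x)) isa Function    # true: a closure around it is a Function
```

So the object is callable, yet any method signature restricted to `Function` or `Base.Callable` will not accept it.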
However, somewhat surprisingly, using ByRow does work:
transform(df, feat_names .=> ByRow.(norms) .=> feat_names)
5×2 DataFrame
 Row │ v1          v2
     │ Float64     Float64
─────┼───────────────────────
   1 │  0.0386826   0.479449
   2 │  0.919179   -1.61432
   3 │  1.05579     0.584841
   4 │ -0.930937    0.854153
   5 │ -1.08272    -0.304124
So, to use the vectorized form, it seems the Functors first have to be wrapped in Functions:
norms_f = map(f -> (x) -> f(x), norms)
transform(df, feat_names .=> norms_f .=> feat_names)
5×2 DataFrame
 Row │ v1          v2
     │ Float64     Float64
─────┼───────────────────────
   1 │  0.0386826   0.479449
   2 │  0.919179   -1.61432
   3 │  1.05579     0.584841
   4 │ -0.930937    0.854153
   5 │ -1.08272    -0.304124
I can see that there’s a not-too-complicated way to circumvent the functor limitation through that remapping. Yet isn’t it counterintuitive that a Functor works under ByRow but not in the vectorized case? Although dispatch happens differently under ByRow, from a user perspective, having transform recognize Functors as Functions would be their most natural handling, in my opinion.