Skip to content

A new "outer_similar" #22218

Open
Open
@ChrisRackauckas

Description

@ChrisRackauckas

While trying to get generic codes working for AbstractArrays, I realized that there's a pretty crucial abstraction missing: the ability to create the same array type with a different element type.

There is a distinction to be made here. copy(A) creates a copy, so its type is the same. You cannot change the element type. Using the constructors can require different syntax for each AbstractArray for clear reasons.

Now, the clear way to go seems to be similar, but similar is not guaranteed to return the same outer array type. For one thing, it returns a similar mutable array type, so a similar(::SArray)::MArray. In addition, similar(::SharedArray)::Array for a reason that doesn't make much sense to me, but maybe that's because I am using similar differently:

#5893
JuliaLang/LinearAlgebra.jl#271

Even here @tkelman noted:

I agree with the opinion that similar is underspecified and has conflicting uses, and choosing the type it should return is all too often a subjective judgement call or dependent on how many arguments it's called with. For example the case of what to return for similar(::SharedArray) has gone back and forth, ref #5893 and #12964, and I think either choice has left multiple people unsettled and disagreeing. That's a pretty good sign that something is off with the abstraction.

So I am proposing that we have a new function like similar which guarantees that the returned array matches in the outer type. While outer_similar(A) is essentially a copy, the main usages being for outer_similar(A,T) to change the element type and outer_similar(A,T,indices...) to change the size can be very useful in generic codes regardless of the arrays used.

Example

This comes up in many areas, and areas I have experienced it lately include DiffEq, numerical linear algebra, rootfinding, and neural nets. Essentially, the library code can be written all in terms of broadcasted and matrix operations. These scientific computing algorithms don't actually need to make use of indexing at all outside of a user's function: they just need to do "array-wide" operations. So as long as they can properly allocate arrays of the right type for their A .= B.*C type of algorithms, they will generally work on all of these array types (even "weird arrays" like GPUArrays which don't even have indexing defined).

Assume C is the user's input array that you want to match types with. Without this proposed outer_similar, there is no good way to define the pre-allocated A in a way that you know it matches the array types of the user input, since similar can change things. In simple algorithms, one can get by with copy. But if you want the algorithm to be unit-safe, for example here you would want to define B to match the user's input array but be unitless. In this case you would want to B=similar(C,T) where T is the unitless element-type of A. But this is where we run into the issue that similar isn't guarenteed to return the same type. If you did A = copy(C); B = similar(C,T), now you have two SharedArrays and an Array, or things like that. For StaticArrays, this is also an issue if you want to make an initial B in a mutable type which you update later, since this will silently change to an MArray instead of an SArray and worsen the performance (though in that case we will need to use A = B.*C, but that's a different issue). If we want this broadcast to be automatically distributed or parallel, this type of type-changing will cause "auto-parallelism of generic algorithms to fail".

This makes me happy because Julia is so so so close to having auto-parallelism of generic algorithms work, as shown by tests on DiffEq (SciML/DifferentialEquations.jl#176).

Current Workaround

There is a current workaround in some cases. You can pre-allocate and change types using broadcast. Want to make a unitless version? B = C./C. Need to promote the types? B = C .+ D. But this is an overly expensive way to get around the fact that similar is not the right abstract for this job.

I am open to ideas for a different name than outer_similar. It's bad, but I thought writing down this argument was better than waiting for a good name.

Metadata

Metadata

Assignees

No one assigned

    Labels

    arrays[a, r, r, a, y, s]

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions