Skip to content

Taking matrix transposes seriously #408

Closed
@StefanKarpinski

Description

@StefanKarpinski

Currently, transpose is recursive. This is pretty unintuitive and leads to this unfortunateness:

julia> A = [randstring(3) for i=1:3, j=1:4]
3×4 Array{String,2}:
 "J00"  "oaT"  "JGS"  "Gjs"
 "Ad9"  "vkM"  "QAF"  "UBF"
 "RSa"  "znD"  "WxF"  "0kV"

julia> A.'
ERROR: MethodError: no method matching transpose(::String)
Closest candidates are:
  transpose(::BitArray{2}) at linalg/bitarray.jl:265
  transpose(::Number) at number.jl:100
  transpose(::RowVector{T,CV} where CV<:(ConjArray{T,1,V} where V<:(AbstractArray{T,1} where T) where T) where T) at linalg/rowvector.jl:80
  ...
Stacktrace:
 [1] transpose_f!(::Base.#transpose, ::Array{String,2}, ::Array{String,2}) at ./linalg/transpose.jl:54
 [2] transpose(::Array{String,2}) at ./linalg/transpose.jl:121

For some time now, we've been telling people to do permutedims(A, (2,1)) instead. But I think we all know, deep down, that's terrible. How did we get here? Well, it's pretty well-understood that one wants the ctranspose or "adjoint" of a matrix of matrices to be recursive. A motivating example is that you can represent complex numbers using 2x2 matrices, in which case the "conjugate" of each "element" (actually a matrix), is its adjoint as a matrix – in other words, if ctranspose is recursive, then everything works. This is just an example but it generalizes.

The reasoning seems to have been the following:

  1. ctranspose should be recursive
  2. ctranspose == conj ∘ transpose == conj ∘ transpose
  3. transpose therefore should also be recursive

I think there are a few problems here:

  • There's no reason that ctranspose == conj ∘ transpose == conj ∘ transpose has to hold, although the name makes this seem almost unavoidable.
  • The behavior of conj operating elementwise on arrays is kind of an unfortunate holdover from Matlab and isn't really a mathematically justifiable operations, much as exp operating elementwise isn't really mathematically sound and what expm does would be a better definition.
  • It is actually taking the conjugate (i.e. adjoint) of each element that implies that ctranspose should be recursive; in the absence of conjugation, there's no good reason for transpose to be recursive.

Accordingly, I would propose the following changes to remedy the situation:

  1. Rename ctranspose (aka ') to adjoint – that is really what this operation does, and it frees us from the implication that it must be equivalent to conj ∘ transpose.
  2. Deprecate vectorized conj(A) on arrays in favor of conj.(A).
  3. Add a recur::Bool=true keyword argument to adjoint (née ctranspose) indicating whether it should call itself recursively. By default, it does.
  4. Add a recur::Bool=false keyword argument to transpose indicating whether it should call itself recursively. By default, it does not.

At the very minimum, this would let us write the following:

julia> A.'
4×3 Array{String,2}:
 "J00"  "Ad9"  "RSa"
 "oaT"  "vkM"  "znD"
 "JGS"  "QAF"  "WxF"
 "Gjs"  "UBF"  "0kV"

Whether or not we could shorten that further to A' depends on what we want to do with conj and adjoint of non-numbers (or more specifically, non-real, non-complex values).

[This issue is the second of an ω₁-part series.]

Metadata

Metadata

Labels

breakingThis change will break codeneeds decisionA decision on this change is needed

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions