Formalize axis matching rules

After much thought, I have made only a little progress.

For context, the central point in all this is the behaviour of AxisCollection.union, which is used in stack, binary ops (via make_numpy_broadcastable) and getitem (via broadcast_with).

Given arrays:
arr1: a (2) x b (3) x {2} (4) x d (4)
arr2: c (4) x a (2) x {2} (4) x {3} (3)


How do we match axes? What I am convinced of so far is:

* axes with the same name should always match, in the absence of duplicate names.
* we should **not** use the **absolute** positions of axes as "temporary names" for anonymous axes, because this must work:
arr1: a (2) x b (3) x c (4)
arr2: a (2) x {1} (3) x {2} (4)

Here are a few options:

* first match the named axes common to both arrays, then match the remaining axes by **relative** position:
  - arr1: a (2) x b (3) x c (3)
    arr2: {0} (3) x {1} (3) x a (2) x d (4) 
    => match a, that leaves us with:
    arr1: b (3) x c (3)
    arr2: {0} (3) x {1} (3)
    => result is a (2) x b (3) x c (3) x d (4)

  - arr1: a (2) x b (3) x c (3)
    arr2: d (4) x {1} (3) x {2} (3) x a (2)
    => match a, that leaves us with:
    arr1: b (3) x c (3)
    arr2: d (4) x {1} (3) x {2} (3)
    => result is an error: d (4) is incompatible with b (3)
  
  **problem**: we need 
  - arr1: a (2) x b (3) x c (3)
    arr2: d (4)
    to result in a (2) x b (3) x c (3) x d (4)
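
A minimal sketch of this first option, to make the examples above reproducible. The `Axis` tuple and `union_named_then_relative` are hypothetical stand-ins written for this issue, not the AxisCollection API, and plain length equality is used in place of real label comparison.

```python
from collections import namedtuple

# Hypothetical stand-in for an axis: name is None for anonymous axes.
Axis = namedtuple("Axis", ["name", "length"])


def union_named_then_relative(axes1, axes2):
    """Match common named axes first, then the rest by relative position."""
    names1 = {a.name for a in axes1 if a.name is not None}
    by_name2 = {a.name: a for a in axes2 if a.name is not None}
    common = names1 & set(by_name2)
    # step 1: axes sharing a name must agree (lengths stand in for labels)
    for axis in axes1:
        if axis.name in common and axis.length != by_name2[axis.name].length:
            raise ValueError(f"axis {axis.name!r} has different lengths")
    # step 2: pair up what is left, position by position
    rest1 = [a for a in axes1 if a.name not in common]
    rest2 = [a for a in axes2 if a.name not in common]
    result = list(axes1)
    for pos, other in enumerate(rest2):
        if pos >= len(rest1):
            result.append(other)  # nothing left to pair with: new axis
        elif rest1[pos].length != other.length:
            raise ValueError(f"{other} is incompatible with {rest1[pos]}")
    return result


arr1 = [Axis("a", 2), Axis("b", 3), Axis("c", 3)]
arr2 = [Axis(None, 3), Axis(None, 3), Axis("a", 2), Axis("d", 4)]
names = [a.name for a in union_named_then_relative(arr1, arr2)]
```

With these inputs `names` is `['a', 'b', 'c', 'd']`, as in the first example. Passing `[Axis("d", 4)]` as the second argument raises ValueError, which is exactly the problem case: d gets paired with b by position instead of being added as a new axis.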

* current situation: match via self[axis] (which uses axis.equals for anonymous axes), then fall back to absolute position. Because equals is used, an anonymous axis cannot match a non-anonymous one except by **absolute** position.

* match with **first compatible** axis

  res = self[:]
  for other_axis in other_axes:
      if not any(axis.iscompatible(other_axis) for axis in self):
          res.append(other_axis)
 
  - arr1: a (2) x b (3) x c (3)
    arr2: {0} (3) x {1} (3) x a (2) x d (4) 
    => result is a (2) x b (3) x c (3) x d (4) (assuming no wildcard axis)
  **problem**: several axes can match the same axis, which will break (probably in transpose).
     e.g. if {0} and {1} are wildcard axes, the result will probably be an error somewhere because it is ambiguous whether {0}* refers to b or c
    => an anonymous wildcard axis of length 1 would match all axes, so it would always generate ambiguity errors.
    => unsure whether it plays well with alignment by default.
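
A runnable version of the loop above, as a sketch only. `Axis`, `iscompatible`, `union_first_compatible` and `candidates` are hypothetical helpers; the compatibility rule here (names must agree when both are set, a wildcard of length 1 matches anything, otherwise lengths must be equal) is an assumption made for illustration and may differ from the real `iscompatible`.

```python
from collections import namedtuple

# Hypothetical axis model: name is None for anonymous axes.
Axis = namedtuple("Axis", ["name", "length", "wildcard"])


def iscompatible(axis, other):
    # Assumed rule, for illustration only.
    if axis.name is not None and other.name is not None and axis.name != other.name:
        return False
    if axis.wildcard or other.wildcard:
        return axis.length == other.length or 1 in (axis.length, other.length)
    return axis.length == other.length


def union_first_compatible(axes1, axes2):
    res = list(axes1)
    for other in axes2:
        # only add axes of other which are compatible with nothing in self
        if not any(iscompatible(axis, other) for axis in axes1):
            res.append(other)
    return res


def candidates(axes1, other):
    """Names of all axes in axes1 that `other` could be matched with."""
    return [axis.name for axis in axes1 if iscompatible(axis, other)]


arr1 = [Axis("a", 2, False), Axis("b", 3, False), Axis("c", 3, False)]
arr2 = [Axis(None, 3, False), Axis(None, 3, False),
        Axis("a", 2, False), Axis("d", 4, False)]
names = [a.name for a in union_first_compatible(arr1, arr2)]
ambiguous = candidates(arr1, arr2[0])
matches_everything = candidates(arr1, Axis(None, 1, True))
```

Here `names` is `['a', 'b', 'c', 'd']`, but `ambiguous` is `['b', 'c']`: the union looks fine while the mapping of {0} onto arr1 is not unique. `matches_everything` is `['a', 'b', 'c']`, which illustrates the length-1 wildcard concern.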

* match with **first unmatched compatible** axis (i.e. by compatibility and relative position)
  **problem**: this can prevent a named axis from matching the corresponding named axis, if the first axis in other is compatible with it (and so takes it first).

* first match all named axes (and align them), then match the remaining axes with the first unmatched compatible axis (i.e. by compatibility and relative position)

  **problem**: we can still get some annoying results. e.g. suppose that after matching named axes, we are left with the following anonymous (but non-wildcard except for 1*) axes:
  arr1: 4 5 2
  arr2: 1* 4 5
  arr2.1* would be matched with arr1.4, and then arr2.4 would fail to match anything even though it has an exact copy of itself in arr1.
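
The failure above can be reproduced with a small sketch of the "first unmatched compatible" step. All names are hypothetical, and the compatibility rule is the same assumption as before (a wildcard of length 1 matches anything, otherwise lengths must be equal), not necessarily the real one.

```python
from collections import namedtuple

# Hypothetical axis model: name is None for anonymous axes.
Axis = namedtuple("Axis", ["name", "length", "wildcard"])


def iscompatible(axis, other):
    # Assumed rule, for illustration only.
    if axis.name is not None and other.name is not None and axis.name != other.name:
        return False
    if axis.wildcard or other.wildcard:
        return axis.length == other.length or 1 in (axis.length, other.length)
    return axis.length == other.length


def match_first_unmatched(axes1, axes2):
    """For each axis of axes2, index of its match in axes1 (None if unmatched)."""
    used = set()
    mapping = []
    for other in axes2:
        target = None
        for i, axis in enumerate(axes1):
            if i not in used and iscompatible(axis, other):
                target = i
                break
        if target is not None:
            used.add(target)
        mapping.append(target)
    return mapping


# what is left after named axes have been matched
arr1 = [Axis(None, 4, False), Axis(None, 5, False), Axis(None, 2, False)]
arr2 = [Axis(None, 1, True), Axis(None, 4, False), Axis(None, 5, False)]
mapping = match_first_unmatched(arr1, arr2)
```

`mapping` is `[0, None, 1]`: the 1* axis takes arr1's 4, so arr2's own 4 is left unmatched.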

* first match all named axes (and align them), then match the remaining non-wildcard axes (or only exclude wildcards of length 1?) with the first unmatched compatible axis (i.e. by compatibility and relative position), then match wildcard (or only length 1 wildcard?) axes to anything left. I need to throw a few use cases at this algorithm and see how it behaves.
   **problem**: this is getting complex to explain (I think the implementation would not be too bad).
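
To have something concrete to throw use cases at, here is one possible reading of the three-pass idea as a sketch. It picks the "only length 1 wildcards are deferred" variant, also keeps arr1's length-1 wildcards out of the second pass, and ignores duplicate names; these choices are assumptions, as are all helper names and the compatibility rule.

```python
from collections import namedtuple

# Hypothetical axis model: name is None for anonymous axes.
Axis = namedtuple("Axis", ["name", "length", "wildcard"])


def iscompatible(axis, other):
    # Assumed rule, for illustration only.
    if axis.name is not None and other.name is not None and axis.name != other.name:
        return False
    if axis.wildcard or other.wildcard:
        return axis.length == other.length or 1 in (axis.length, other.length)
    return axis.length == other.length


def is_deferred(axis):
    # variant chosen here: only wildcards of length 1 wait for the last pass
    return axis.wildcard and axis.length == 1


def first_free(axes1, other, used, allowed):
    for i, axis in enumerate(axes1):
        if i not in used and allowed(axis) and iscompatible(axis, other):
            return i
    return None


def match_three_passes(axes1, axes2):
    """For each axis of axes2, index of its match in axes1 (None => new axis)."""
    mapping, used = {}, set()
    named1 = {a.name: i for i, a in enumerate(axes1) if a.name is not None}
    # pass 1: named axes match by name
    for j, other in enumerate(axes2):
        if other.name in named1:
            i = named1[other.name]
            if not iscompatible(axes1[i], other):
                raise ValueError(f"incompatible axes named {other.name!r}")
            mapping[j] = i
            used.add(i)
    # pass 2: remaining non-deferred axes, first unmatched compatible
    for j, other in enumerate(axes2):
        if j not in mapping and not is_deferred(other):
            i = first_free(axes1, other, used, lambda a: not is_deferred(a))
            if i is not None:
                mapping[j] = i
                used.add(i)
    # pass 3: everything still unmatched may take anything left
    for j, other in enumerate(axes2):
        if j not in mapping:
            i = first_free(axes1, other, used, lambda a: True)
            mapping[j] = i
            if i is not None:
                used.add(i)
    return [mapping[j] for j in range(len(axes2))]


arr1 = [Axis(None, 4, False), Axis(None, 5, False), Axis(None, 2, False)]
arr2 = [Axis(None, 1, True), Axis(None, 4, False), Axis(None, 5, False)]
mapping = match_three_passes(arr1, arr2)
```

On the "4 5 2" vs "1* 4 5" case, `mapping` is `[2, 0, 1]`: 4 and 5 find their exact copies first and 1* takes what is left. With arr1 = a (2) x b (3) x c (3) and arr2 = d (4), the result is `[None]`, i.e. d becomes a new axis.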

Formalize axis matching rules #464

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Formalize axis matching rules #464

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions