Description
@mhvk noticed that the definitions here are incompatible with NumPy gufuncs, which I think is an oversight that was never noted or discussed. The "after broadcasting" wording looks reasonable at first sight, but TBH, it doesn't really seem reasonable to me:
Assume broadcasting does something: you are then specifying an axis that by definition has length 1 along the array which gets broadcast, and that also by definition has little meaning for that array.
Now, is the alternative of using the non-broadcast axes better? Maybe not; maybe it is also very unclear, and any positive axis (i.e. one where broadcasting matters) is just too strange.
But, whether you like what NumPy does or not after thinking about it, it does seem at least as logical as (maybe more so than) the alternative.
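To make the two readings concrete, here is a minimal shape-only sketch in plain Python (no array library involved). The helpers `resolve_axis`, `broadcast_shape` and `axis_lengths` are hypothetical names made up for this illustration, and the shapes `(3,)` and `(2, 3)` are just an example. It shows which dimension length a given `axis` picks when resolved against each input before broadcasting versus against the broadcast result.

```python
def resolve_axis(axis, ndim):
    """Map a possibly negative axis to a non-negative index for an ndim-D array."""
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} is out of bounds for {ndim} dimensions")
    return axis % ndim


def broadcast_shape(*shapes):
    """Right-align the shapes and broadcast them with the usual rules."""
    ndim = max(len(s) for s in shapes)
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for dims in zip(*padded):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(sizes.pop() if sizes else 1)
    return tuple(result)


def axis_lengths(axis, *shapes):
    """Return (length picked in each input before broadcasting,
    length picked in the broadcast result)."""
    before = [s[resolve_axis(axis, len(s))] for s in shapes]
    out = broadcast_shape(*shapes)
    after = out[resolve_axis(axis, len(out))]
    return before, after


x_shape, y_shape = (3,), (2, 3)  # example shapes, chosen for illustration
positive = axis_lengths(0, x_shape, y_shape)
negative = axis_lengths(-1, x_shape, y_shape)
print(positive)  # ([3, 2], 2)
print(negative)  # ([3, 3], 3)
```

With `axis=0` the two readings disagree: before broadcasting it selects the length-3 dimension of the 1-D input, while after broadcasting it selects the dimension where that input only has a broadcast length of 1. With `axis=-1` both readings select the same dimension.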
So, I propose to change any place where broadcasting is mentioned w.r.t. axes handling to define the behavior as unspecified. Negative axes remain allowed, as they don't interfere with broadcasting. A negative axis beyond the number of dimensions that exist prior to broadcasting will also be unspecified.
The last point looks necessary to me and strengthens the argument for NumPy's (or unspecified/error) logic: matmul([1], [[2]], axis=(-2, -1)) should not be valid, since it might as well be a bug that the first array is not 2-D as it should be.
EDIT: The above example should use vecdot, since matmul doesn't work: vecdot(1, [2], axis=-1)
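As a rough sketch of the negative-axis part of the proposal (my reading of it, not spec text), the check below treats a negative axis as specified only if it exists in every input before broadcasting. The helper name `negative_axis_is_specified` is hypothetical, and it deliberately says nothing about non-negative axes.

```python
def negative_axis_is_specified(axis, *ndims):
    """Hypothetical check: a negative axis counts as specified only if it
    refers to an existing dimension in every input prior to broadcasting.

    `ndims` are the numbers of dimensions of the inputs before broadcasting.
    """
    if axis >= 0:
        raise ValueError("this sketch only covers negative axes")
    return all(-ndim <= axis for ndim in ndims)


# vecdot(1, [2], axis=-1): the inputs are 0-D and 1-D
scalar_case = negative_axis_is_specified(-1, 0, 1)
# the same call with a 1-D and a 2-D input
regular_case = negative_axis_is_specified(-1, 1, 2)
print(scalar_case)   # False
print(regular_case)  # True
```

Under this reading the vecdot example falls into the unspecified category, because the 0-D input has no axis -1 before broadcasting.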