Description
Missing broadcast support causes an abort.
I ran into a bug where op_add, op_sub, and op_div in the optimized kernel library are missing broadcast support. I specifically hit it when trying to add a 2x8x12x12 tensor to a 2x1x12x12 tensor. (I only ran into it with a batch size greater than 1, which I assume is why this has not come up before?)
The cause seems to be an update a while back that presumably added these new optimized paths for use with op_mul:
enum class ElementwiseOptimizedPath {
  kNone,
  kTreatAs1d,
  kBroadcast2dBy1d,
  kBroadcast2dBy1dReverseArguments,
  kBroadcastNdByNd,
  kBroadcastNdByNdReverseArguments,
  kBroadcastLastDim,
  kBroadcastLastDimReverseArguments,
};
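For reference, my failing case (2x8x12x12 + 2x1x12x12) broadcasts over a middle dimension while the leading batch dimension and the trailing dimensions match, which I believe selects one of the Nd-by-Nd paths; with a batch size of 1 the same broadcast can presumably be treated as a 2d-by-1d case, which would explain why I never hit this before. Below is a minimal sketch of that shape reasoning. It reuses the enum quoted above, and classify_shapes is a hypothetical name for illustration only, not the actual select_optimized_path() logic:

#include <cstdint>
#include <iostream>
#include <vector>

// Illustrative only: shows which bucket a pair of shapes would need,
// NOT the real select_optimized_path() in the optimized kernel library.
ElementwiseOptimizedPath classify_shapes(
    const std::vector<int64_t>& a,
    const std::vector<int64_t>& b) {
  if (a == b) {
    return ElementwiseOptimizedPath::kTreatAs1d; // same shape: flat loop
  }
  if (a.size() != b.size()) {
    return ElementwiseOptimizedPath::kNone; // keep the sketch simple
  }
  const size_t last = a.size() - 1;
  if (b[last] == 1 && a[last] > 1) {
    return ElementwiseOptimizedPath::kBroadcastLastDim; // broadcast over last dim
  }
  // Broadcast over a non-trailing dimension (e.g. 2x8x12x12 + 2x1x12x12,
  // where only dim 1 differs): this needs one of the Nd-by-Nd paths that
  // op_add/op_sub/op_div currently reject.
  for (size_t d = 0; d < last; ++d) {
    if (b[d] == 1 && a[d] > 1) {
      return ElementwiseOptimizedPath::kBroadcastNdByNd;
    }
  }
  return ElementwiseOptimizedPath::kNone;
}

int main() {
  std::vector<int64_t> a = {2, 8, 12, 12};
  std::vector<int64_t> b = {2, 1, 12, 12};
  // Prints the enum value corresponding to kBroadcastNdByNd.
  std::cout << static_cast<int>(classify_shapes(a, b)) << "\n";
}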
However, op_add, op_sub, and op_div only handle the first four, and their logic falls through to an error if it encounters any other optimized path.
Playing with it, a quick-and-dirty fix is to check whether one of the unimplemented broadcast paths was selected and, if so, fall back to kNone:
auto selected_optimized_path = select_optimized_path(a, b, out);
if (selected_optimized_path == ElementwiseOptimizedPath::kBroadcastNdByNd ||
    selected_optimized_path ==
        ElementwiseOptimizedPath::kBroadcastNdByNdReverseArguments ||
    selected_optimized_path == ElementwiseOptimizedPath::kBroadcastLastDim ||
    selected_optimized_path ==
        ElementwiseOptimizedPath::kBroadcastLastDimReverseArguments) {
  selected_optimized_path = ElementwiseOptimizedPath::kNone;
}
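Since op_add, op_sub, and op_div would all need the same guard, one option is to pull it into a small shared helper. This is only a sketch; fall_back_if_unsupported is a hypothetical name, not an existing ExecuTorch function:

// Hypothetical helper, not in the ExecuTorch tree: collapse any optimized
// path the calling kernel does not implement back to kNone so the existing
// kNone code path runs instead of the error branch.
inline ElementwiseOptimizedPath fall_back_if_unsupported(
    ElementwiseOptimizedPath path) {
  switch (path) {
    case ElementwiseOptimizedPath::kBroadcastNdByNd:
    case ElementwiseOptimizedPath::kBroadcastNdByNdReverseArguments:
    case ElementwiseOptimizedPath::kBroadcastLastDim:
    case ElementwiseOptimizedPath::kBroadcastLastDimReverseArguments:
      return ElementwiseOptimizedPath::kNone;
    default:
      return path;
  }
}

// Usage inside each op:
//   auto selected_optimized_path =
//       fall_back_if_unsupported(select_optimized_path(a, b, out));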
That is obviously non-ideal, but it was my temporary fix for my own use.
Versions
ExecuTorch main branch.