Description
#3624 was recently merged and we identified a potential issue: #3624 (comment)
In short, the rois tensor contains the batch indices in its first column, but depending on the quantization parameters, some indices cannot be represented exactly. For example, odd numbers can't be represented if the tensor was quantized with qscale = 2.
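To make the problem concrete, here is a minimal pure-Python sketch of affine quantization. The helper names `quantize` and `dequantize` are hypothetical, and the round-half-to-even behaviour and the 0..255 range are assumptions chosen to mimic a quint8 tensor. It is not the kernel's actual code.

```python
def quantize(x: float, scale: float, zero_point: int = 0,
             qmin: int = 0, qmax: int = 255) -> int:
    """Affine quantization: round(x / scale) + zero_point, clamped to [qmin, qmax]."""
    # Python's round() rounds halves to the nearest even integer (assumed rounding mode).
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q: int, scale: float, zero_point: int = 0) -> float:
    """Inverse mapping back to the real-valued domain."""
    return (q - zero_point) * scale


indices = [0, 1, 2, 3, 4, 5]
round_trip = [dequantize(quantize(i, scale=2.0), scale=2.0) for i in indices]
# Indices whose value changes after a quantize/dequantize round trip.
lost = [i for i, r in zip(indices, round_trip) if r != i]

print(round_trip)  # [0.0, 0.0, 2.0, 4.0, 4.0, 4.0]
print(lost)        # [1, 3, 5]
```

Whatever the rounding mode, every dequantized value is a multiple of the scale, so with a scale of 2 an odd batch index silently becomes a neighbouring even one and the RoI is pooled from the wrong image.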
To prevent any potential bug, we currently force the batch size to be 1 and hard-code the index to 0:
vision/torchvision/csrc/ops/quantized/cpu/qroi_align_kernel.cpp, lines 156 to 158 in 07fb8ba
vision/torchvision/csrc/ops/quantized/cpu/qroi_align_kernel.cpp, lines 39 to 40 in 07fb8ba
We should try to allow more than one element per batch. A potential solution would involve using per-channel quantized tensors for the roi tensor, where the first column containing the indices would be quantized in a different way from the rest of the columns.
In the roi_align Python op:
- if a tensor with 5 columns is passed, raise an error if it's not per-channel quantized: there's a high chance the indices are wrong and it's too risky. If the tensor is per-channel quantized, pass it through: we can assume that the user knows what they're doing and that the indices are properly represented. As a sanity check, we can still verify that the batch size is within the range of the quantized type of the first column.
- if a list of tensors is passed, convert that list of tensors into a per-channel quantized tensor with 5 columns.
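The check described in the first bullet could look roughly like the sketch below. The function `check_quantized_rois` and the `_QMAX` table are hypothetical names introduced for illustration and are not part of torchvision. The sketch also assumes the rois tensor is quantized along dim 1 with the plain `torch.per_channel_affine` scheme.

```python
import torch

# Hypothetical table: largest stored integer for each quantized dtype.
_QMAX = {torch.quint8: 255, torch.qint8: 127, torch.qint32: 2**31 - 1}


def check_quantized_rois(rois: torch.Tensor, batch_size: int) -> None:
    """Hypothetical validation of a quantized (K, 5) rois tensor."""
    # Reject per-tensor quantization: the index column shares the coordinate scale.
    if rois.qscheme() != torch.per_channel_affine or rois.q_per_channel_axis() != 1:
        raise ValueError("quantized rois must be per-channel quantized along dim 1")
    # Quantization parameters of the first column (the batch indices).
    scale = rois.q_per_channel_scales()[0].item()
    zero_point = rois.q_per_channel_zero_points()[0].item()
    # Largest real value the index column can hold.
    max_index = (_QMAX[rois.dtype] - zero_point) * scale
    if batch_size - 1 > max_index:
        raise ValueError(f"batch size {batch_size} exceeds the index range ({max_index})")


rois = torch.tensor([[0.0, 10.0, 20.5, 100.0, 120.0],
                     [3.0, 5.0, 5.0, 60.0, 80.5]])
# Scale 1 for the index column, a coarser-than-float scale of 0.5 for the coordinates.
scales = torch.tensor([1.0, 0.5, 0.5, 0.5, 0.5])
zero_points = torch.zeros(5, dtype=torch.int64)
rois_q = torch.quantize_per_channel(rois, scales, zero_points, 1, torch.quint8)

check_quantized_rois(rois_q, batch_size=4)  # passes silently
```

Giving the first column a scale of 1 and a zero point of 0 is one way to keep every integer index in range exact; other parameter choices are possible as long as the indices survive the round trip.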
The convert_boxes_to_roi_format utility should be modified accordingly. To ensure consistency throughout the library, it should also be used in MultiScaleRoIAlign.
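As a rough illustration of what the modified utility might do, the sketch below builds a 5-column rois tensor from a list of box tensors and quantizes it per-channel. The function name `convert_boxes_to_quantized_roi_format` and its `coord_scale` parameter are hypothetical, and the sketch assumes the input boxes are float tensors of shape (K_i, 4). Handling already-quantized inputs is left out.

```python
from typing import List

import torch


def convert_boxes_to_quantized_roi_format(
    boxes: List[torch.Tensor],
    coord_scale: float,
    dtype: torch.dtype = torch.quint8,
) -> torch.Tensor:
    """Hypothetical variant of convert_boxes_to_roi_format for quantized rois."""
    # One column of batch indices per image, matching each (K_i, 4) box tensor.
    ids = [torch.full_like(b[:, :1], i) for i, b in enumerate(boxes)]
    rois = torch.cat([torch.cat(ids, dim=0), torch.cat(boxes, dim=0)], dim=1)
    # Index column: scale 1 so integer indices are exact; coordinates: coord_scale.
    scales = torch.tensor([1.0] + [coord_scale] * 4)
    zero_points = torch.zeros(5, dtype=torch.int64)
    return torch.quantize_per_channel(rois, scales, zero_points, 1, dtype)


boxes = [
    torch.tensor([[0.0, 0.0, 10.0, 10.0]]),
    torch.tensor([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]),
]
rois_q = convert_boxes_to_quantized_roi_format(boxes, coord_scale=0.5)
print(rois_q.shape)                  # torch.Size([3, 5])
print(rois_q.dequantize()[:, 0])     # batch indices 0, 1, 1
```

Routing both roi_align and MultiScaleRoIAlign through a single helper of this kind would keep the index column quantized the same way everywhere.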