Description
Hi,
I have some observations on the COCO metrics, especially the precision metric, that I would like to share.
It would be great if someone could clarify these points :) /cc @pdollar @tylin
To evaluate precision/recall, I compute the COCO average precision to get a feel for the system's results. To better explain the issue, I also compute these metrics over all observations pooled together (say, as one large stitched image rather than many separate images), which I call the overall recall/precision below.
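To make the comparison concrete, here is a minimal sketch of the two quantities I am comparing. It assumes detections have already been matched to ground truth at a fixed IoU threshold (the 0/1 `tp_flags` below), and it only approximates `COCOeval.accumulate()` for a single IoU threshold and category; the function names are mine, not from pycocotools:

```python
import numpy as np

def overall_precision_recall(tp_flags, num_gt):
    """Pooled ("stitched image") precision/recall over all detections."""
    tp = int(np.sum(tp_flags))
    fp = len(tp_flags) - tp
    precision = tp / max(tp + fp, 1)
    recall = tp / max(num_gt, 1)
    return precision, recall

def coco_style_ap(scores, tp_flags, num_gt):
    """101-point interpolated AP, roughly following COCOeval.accumulate()
    for a single IoU threshold and category."""
    rec_thrs = np.linspace(0.0, 1.0, 101)
    order = np.argsort(-np.asarray(scores, dtype=float))      # rank detections by score
    tp = np.asarray(tp_flags, dtype=float)[order]
    fp = 1.0 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / max(num_gt, 1)
    precision = tp_cum / np.maximum(tp_cum + fp_cum, np.finfo(float).eps)
    # make precision monotonically non-increasing from the right (interpolation)
    for i in range(len(precision) - 1, 0, -1):
        precision[i - 1] = max(precision[i - 1], precision[i])
    # sample precision at the 101 recall thresholds; recall levels that are
    # never reached contribute 0 to the average
    sampled = np.zeros_like(rec_thrs)
    inds = np.searchsorted(recall, rec_thrs, side="left")
    valid = inds < len(precision)
    sampled[valid] = precision[inds[valid]]
    return sampled.mean()
```

With these two, the overall precision can be high or low independently of what the interpolated AP reports, which is exactly the mismatch in the two cases below.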
Case 1. A system with perfect detection + one false alarm: in this case, as detailed in the next figure, the COCO average precision comes out as 1.0, completely ignoring the existence of the false alarm! (Presumably this is because the false alarm is scored below all the true positives, so the interpolated precision is still 1.0 at every sampled recall level; both cases are also reproduced with pycocotools after Case 2 below.)
Case 2. A system with zero false alarms: in this case there are no false alarms at all, so the overall precision is a perfect 1.0; however, the COCO average precision comes out as 0.5! (Presumably because only part of the ground truth is detected, precision is counted as 0 at the recall levels that are never reached.) This case is very important, since it could mean that the COCO average precision penalizes systems with no false alarms and favors the detection side of a system in the evaluation. As you may know, systems with zero/few false alarms are of great importance in industrial applications.
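For completeness, here is a hedged attempt to reproduce both cases end-to-end with pycocotools. The single image, the two ground-truth boxes, and the detection boxes/scores are all made up for illustration, and in Case 2 the printed AP is roughly 0.505 rather than exactly 0.5 because of the 101-point recall sampling:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical ground truth: one image with two objects of the same category.
gt = {
    "images": [{"id": 1, "width": 100, "height": 100}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 10, 20, 20], "area": 400, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1,
         "bbox": [50, 50, 20, 20], "area": 400, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "object"}],
}
coco_gt = COCO()
coco_gt.dataset = gt
coco_gt.createIndex()

def ap_for(detections):
    """Run the standard COCOeval bbox protocol and return AP@[.50:.95]."""
    coco_dt = coco_gt.loadRes(detections)
    ev = COCOeval(coco_gt, coco_dt, "bbox")
    ev.evaluate()
    ev.accumulate()
    ev.summarize()
    return ev.stats[0]

# Case 1: both objects found perfectly + one low-scoring false alarm.
case1 = [
    {"image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20], "score": 0.9},
    {"image_id": 1, "category_id": 1, "bbox": [50, 50, 20, 20], "score": 0.8},
    {"image_id": 1, "category_id": 1, "bbox": [80, 5, 10, 10], "score": 0.3},  # false alarm
]
print("Case 1 AP:", ap_for(case1))   # -> 1.0, the false alarm leaves AP untouched

# Case 2: only one of the two objects found, no false alarms at all.
case2 = [
    {"image_id": 1, "category_id": 1, "bbox": [10, 10, 20, 20], "score": 0.9},
]
print("Case 2 AP:", ap_for(case2))   # -> ~0.505, despite perfect precision
```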
So I am not sure whether the above cases are bugs, whether they were intentional design decisions for COCO, or whether I am missing something?