According to cityscapes benchmark, “weighting the contribution of each pixel by the ratio of the class’ average instance size to the size of the respective ground truth instance”. I do not get what this mean. Do anyone know how to calculate iIoU?Thanks
I realize this is an old question, but I was also wondering and this was hit#1 on google, so here it is:
It is well-known that the global IoU measure is biased toward object instances that cover a large image area. In street scenes with their strong scale variation this can be problematic. Specifically for traffic participants, which are the key classes in our scenario, we aim to evaluate how well the individual instances in the scene are represented in the labeling. To address this, we additionally evaluate the semantic labeling using an instance-level intersection-over-union metric iIoU = iTP ⁄ (iTP+FP+iFN). Again iTP, FP, and iFN denote the numbers of true positive, false positive, and false negative pixels, respectively. However, in contrast to the standard IoU measure, iTP and iFN are computed by weighting the contribution of each pixel by the ratio of the class’ average instance size to the size of the respective ground truth instance. It is important to note here that unlike the instance-level task below, we assume that the methods only yield a standard per-pixel semantic class labeling as output. Therefore, the false positive pixels are not associated with any instance and thus do not require normalization.