Multi-class semantic segmentation metrics and accuracy

Hello,

In the introductory fastai lessons, the “error_rate” metric is used to track the progress of a single-label classification model. For semantic segmentation problems, the most commonly used metric for evaluating training progress is “Intersection over Union” (IoU), also known as the “Jaccard index”.

In the fastai.metrics module, I found two functions that may serve as metrics for image segmentation: foreground_acc and dice. Passing the iou=True kwarg to dice replicates the IoU metric, but only for binary targets, so it does not work for segmentation problems with multiple classes, such as the CamVid dataset used in fastai course Lesson 3 (Fast AI Lesson 3 with Semantic Segmentation). The foreground_acc function is identical to the acc_camvid function that Jeremy uses in the video to compute accuracy on the CamVid dataset specifically.
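For reference, the kind of accuracy discussed above can be sketched in plain PyTorch as follows. This is a minimal sketch, not the fastai source: the name pixel_accuracy and the void_code argument are assumptions made for illustration. It takes the argmax over the class dimension and measures the fraction of correctly labeled pixels, skipping pixels marked as void.

```python
import torch

def pixel_accuracy(logits: torch.Tensor, target: torch.Tensor, void_code: int) -> torch.Tensor:
    """Fraction of non-void pixels whose argmax class matches the target.

    Hypothetical helper for illustration; not part of fastai.
    logits: [B, C, H, W] raw model output
    target: [B, H, W] or [B, 1, H, W] integer class labels
    """
    if target.dim() == 4:            # [B, 1, H, W] -> [B, H, W]
        target = target.squeeze(1)
    mask = target != void_code       # ignore unlabeled (void) pixels
    pred = logits.argmax(dim=1)      # hard class prediction per pixel
    return (pred[mask] == target[mask]).float().mean()

# Tiny 2x2 example with 3 classes, where class 2 plays the role of "void"
logits = torch.zeros(1, 3, 2, 2)
logits[0, 0, 0, 0] = 5.0   # pixel (0,0) predicted as class 0
logits[0, 1, 0, 1] = 5.0   # pixel (0,1) predicted as class 1
logits[0, 2, 1, 0] = 5.0   # pixel (1,0) predicted as class 2
logits[0, 1, 1, 1] = 5.0   # pixel (1,1) predicted as class 1
target = torch.tensor([[[0, 1], [1, 2]]])

acc = pixel_accuracy(logits, target, void_code=2)
print(acc)  # 2 of the 3 non-void pixels are correct
```

Because every pixel counts equally, a metric like this is dominated by large classes such as road or sky, which is one reason it can look much better than a per-class IoU.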

My question is, is the acc_camvid accuracy function generalizable to all multi-class image segmentation problems? If I desire the IoU metric instead (to benchmark against examples provided in research papers), how can I compute the IoU metric for multi-class image segmentation?

Before coming here, I first stumbled upon the following jaccard_loss function, provided at pytorch-goodies.

import torch
import torch.nn.functional as F

def jaccard_loss(true, logits, eps=1e-7):
    """Computes the Jaccard loss, a.k.a. the IoU loss.
    Note that PyTorch optimizers minimize a loss. Since we
    would like to maximize the Jaccard index (IoU), we
    return 1 minus the Jaccard index.
    Args:
        true: a tensor of shape [B, H, W] or [B, 1, H, W].
        logits: a tensor of shape [B, C, H, W]. Corresponds to
            the raw output or logits of the model.
        eps: added to the denominator for numerical stability.
    Returns:
        jacc_loss: the Jaccard loss.
    """
    num_classes = logits.shape[1]
    if num_classes == 1:
        true_1_hot = torch.eye(num_classes + 1)[true.squeeze(1)]
        true_1_hot = true_1_hot.permute(0, 3, 1, 2).float()
        true_1_hot_f = true_1_hot[:, 0:1, :, :]
        true_1_hot_s = true_1_hot[:, 1:2, :, :]
        true_1_hot = torch.cat([true_1_hot_s, true_1_hot_f], dim=1)
        pos_prob = torch.sigmoid(logits)
        neg_prob = 1 - pos_prob
        probas = torch.cat([pos_prob, neg_prob], dim=1)
    else:
        true_1_hot = torch.eye(num_classes)[true.squeeze(1)]
        true_1_hot = true_1_hot.permute(0, 3, 1, 2).float()
        probas = F.softmax(logits, dim=1)
    true_1_hot = true_1_hot.type(logits.type())
    dims = (0,) + tuple(range(2, true.ndimension()))
    intersection = torch.sum(probas * true_1_hot, dims)
    cardinality = torch.sum(probas + true_1_hot, dims)
    union = cardinality - intersection
    jacc_loss = (intersection / (union + eps)).mean()
    return (1 - jacc_loss)

Where the true and logits parameters correspond to the ground truth image and the output of the Learner model, respectively.

If the function returns jacc_loss instead of 1 - jacc_loss, it should in theory give the IoU value of the current prediction. Can anyone confirm this?

When I use the jaccard_loss function above, I get jacc_loss = 0.18 at the same time as acc_camvid = 0.92. This does not make sense to me, since 0.92 for CamVid accuracy is state of the art, while state-of-the-art IoU is closer to 0.64 (see Refine-net).
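One thing that may contribute to a gap like this: jaccard_loss operates on softmax probabilities rather than hard predictions, and it averages over every class. Benchmarks usually report mean IoU on argmax predictions instead. Below is a minimal sketch of that variant, assuming integer class labels and an optional ignore_index for void pixels. The name mean_iou and its parameters are hypothetical, not a fastai or PyTorch API.

```python
import torch

def mean_iou(logits, target, num_classes, ignore_index=None):
    """Mean IoU over classes, computed on hard (argmax) predictions.

    Hypothetical helper for illustration.
    logits: [B, C, H, W] raw model output
    target: [B, H, W] or [B, 1, H, W] integer class labels
    """
    if target.dim() == 4:
        target = target.squeeze(1)
    pred = logits.argmax(dim=1)

    # Drop pixels labeled with the ignore index (e.g. void), if any
    valid = torch.ones_like(target, dtype=torch.bool)
    if ignore_index is not None:
        valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    ious = []
    for c in range(num_classes):
        if c == ignore_index:
            continue
        pred_c, targ_c = pred == c, target == c
        union = (pred_c | targ_c).sum()
        if union == 0:
            continue  # class absent from both prediction and target: skip it
        inter = (pred_c & targ_c).sum()
        ious.append(inter.float() / union.float())
    return torch.stack(ious).mean() if ious else torch.tensor(float("nan"))

# 10 pixels: 9 belong to class 0 and 1 to class 1; the model predicts class 0 everywhere
logits = torch.zeros(1, 2, 1, 10)
logits[:, 0] = 1.0
target = torch.zeros(1, 1, 10, dtype=torch.long)
target[0, 0, 9] = 1

pixel_acc = (logits.argmax(dim=1) == target).float().mean()
miou = mean_iou(logits, target, num_classes=2)
print(pixel_acc, miou)  # accuracy 0.9, mean IoU (0.9 + 0.0) / 2 = 0.45
```

In this toy case the pixel accuracy is 0.9 while the mean IoU is only 0.45, because the missed minority class contributes an IoU of 0 to the average. High accuracy and low mean IoU are therefore not contradictory.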

TL;DR I am looking for a way to properly, and generically, report the accuracy of a multi-class image segmentation model.

Hi @eckelsjd,
I am currently investigating ways to evaluate my segmentation model. I’m quite a newbie so I cannot really say if your function corresponds to the IoU value.
Did you ever find an answer to your question?
What have you used so far for evaluation?
Thanks!!

Hi @nadees,
I did run some of my own tests to verify that the jaccard function provided above does do what it claims. If you briefly look through it, you can see at the bottom how the intersection and union values are calculated. The variable jacc_loss is then set equal to the mean “intersection” over “union” value, meaning it calculates the IoU for each class individually, then takes the average across all classes. This is exactly what we want for a multi-class segmentation evaluation metric. Until shown otherwise, I have been using this function as my IoU metric, but slightly modified, as shown below:

import torch
import torch.nn.functional as F
from torch import Tensor

# Return Jaccard index, or Intersection over Union (IoU) value
def IoU(preds:Tensor, targs:Tensor, eps:float=1e-8):
    """Computes the mean Jaccard index, a.k.a. the mean IoU.
    Notes: shapes are given as [Batch size,Num classes,Height,Width]
    Args:
        targs: a tensor of shape [B, H, W] or [B, 1, H, W].
        preds: a tensor of shape [B, C, H, W]. Corresponds to
            the raw output or logits of the model. (prediction)
        eps: added to the denominator for numerical stability.
    Returns:
        iou: the average class intersection over union value 
             for multi-class image segmentation
    """
    num_classes = preds.shape[1]
    
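    # Binary segmentation (single output channel)?
    if num_classes == 1:
        # Build the identity matrix on the target's device so indexing also works on GPU
        true_1_hot = torch.eye(num_classes + 1, device=targs.device)[targs.squeeze(1)]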
        true_1_hot = true_1_hot.permute(0, 3, 1, 2).float()
        true_1_hot_f = true_1_hot[:, 0:1, :, :]
        true_1_hot_s = true_1_hot[:, 1:2, :, :]
        true_1_hot = torch.cat([true_1_hot_s, true_1_hot_f], dim=1)
        pos_prob = torch.sigmoid(preds)
        neg_prob = 1 - pos_prob
        probas = torch.cat([pos_prob, neg_prob], dim=1)
        
    # Multi-class segmentation
    else:
        # Convert target to one-hot encoding; the identity matrix is built on
        # the target's device so that indexing also works on GPU
        true_1_hot = torch.eye(num_classes, device=targs.device)[targs.squeeze(1)]
        
        # Permute [B,H,W,C] to [B,C,H,W]
        true_1_hot = true_1_hot.permute(0, 3, 1, 2).float()
        
        # Take softmax along class dimension; all class probs add to 1 (per pixel)
        probas = F.softmax(preds, dim=1)
        
    true_1_hot = true_1_hot.type(preds.type())
    
    # Sum probabilities by class and across batch images
    dims = (0,) + tuple(range(2, targs.ndimension()))
    intersection = torch.sum(probas * true_1_hot, dims) # [class0,class1,class2,...]
    cardinality = torch.sum(probas + true_1_hot, dims)  # [class0,class1,class2,...]
    union = cardinality - intersection
    iou = (intersection / (union + eps)).mean()   # find mean of class IoU values
    return iou

You’ll notice that the predicted masks (preds) and the ground truth masks (targs) have been switched in the function’s argument positions. By inspection of fastai v1’s source code for other evaluation metrics, this is how your IoU function will implicitly be called during training.

It’s a wonder to me that this metric is not already built into the library. This applies only to fastai v1, though; I’m not sure what they’re doing in v2.

If you are following along with the fastai image segmentation tutorials, you would define this function at the top of your Jupyter notebook, then include the function handle when creating a U-Net learner object:

learn = unet_learner(data,models.resnet34,metrics=[IoU])

I hope this helps!

Hi @eckelsjd,
Thank you for sharing, it is really helpful!
I was using a custom accuracy function which I found here: https://forums.fast.ai/t/create-databunch-with-multiple-segmentation-mask-as-label/53643/20
However, I think IoU provides more information for segmentation problems. I might use several metrics.

I’m not finished with the v2 lessons yet, but I do not think they’ve taken something like this metric into account. Also, the options for evaluation, such as confusion matrices, are not really there for segmentation purposes.
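In case it is useful, a pixel-level confusion matrix for segmentation can be put together in a few lines of plain PyTorch. This is only a sketch under the assumption that predictions and targets are integer class maps of the same shape. The helper names confusion_matrix and per_class_iou are hypothetical and not part of fastai.

```python
import torch

def confusion_matrix(pred, target, num_classes):
    """Pixel-level confusion matrix: rows = true class, columns = predicted class."""
    pred = pred.reshape(-1)
    target = target.reshape(-1)
    # Encode each (true, predicted) pair as one index, then count occurrences
    idx = target * num_classes + pred
    counts = torch.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def per_class_iou(cm):
    """IoU per class from a confusion matrix: TP / (TP + FP + FN)."""
    tp = cm.diag().float()
    union = cm.sum(0).float() + cm.sum(1).float() - tp
    # Classes that never occur get a union of 0; clamp so they yield 0 instead of NaN
    return tp / union.clamp(min=1)

pred = torch.tensor([0, 0, 1, 1])
target = torch.tensor([0, 1, 1, 1])

cm = confusion_matrix(pred, target, num_classes=2)
iou = per_class_iou(cm)
print(cm)   # [[1, 0], [1, 2]]
print(iou)  # class 0: 1/2, class 1: 2/3
```

The same matrix also gives the overall pixel accuracy (cm.diag().sum() / cm.sum()), so several metrics can be reported from a single pass over the validation set.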

So thanks again!