To forbid a category in image segmentation

TL;DR: I have a label in the training set that I don’t want to appear in the automatic segmentation output.

Hi, I have a bunch of images with objects of the same kind with blurred edges. Think clouds, smoke or whatever. The images have been labelled by humans, marking what is the object and what is not. But the labelled images have a gap between the two categories. So I created a third fake category, “I don’t know”, and filled the gaps in the labelled images with it.

The model learned to mark object, no object and the borders. Dammit. It took my labels too literally.

What I want is for the model to learn only from pixels marked with the “is” and “is not” categories, and for the segmentation output to contain only those two categories, never “I don’t know”. If training goes well, the model itself will tell me where the frontier is.

What have I done?

I copied from the Internet a metric that measures accuracy without counting the “I don’t know” category:

void_code = 0  # index of the "I don't know" category (an int, like the mask values)

def acc_camvid(input, target):
    # accuracy computed only over pixels whose label is not "I don't know"
    target = target.squeeze(1)
    mask = target != void_code
    return (input.argmax(dim=1)[mask] == target[mask]).float().mean()
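
A quick sanity check of the metric with random tensors (my own sketch, just to see that the “idk” pixels are excluded):

import torch

torch.manual_seed(0)
input = torch.randn(1, 3, 4, 4)             # (batch, classes, H, W) raw scores
target = torch.randint(0, 3, (1, 1, 4, 4))  # labelled mask; 0 = "I don't know"
print(acc_camvid(input, target))            # accuracy over non-idk pixels only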

I know the metric isn’t used to train the model; the loss function is used for that. But I wasn’t sure how to use it. Diving into the source code, I found a parameter in CrossEntropyFlat (the loss function used in segmentation) called weight, which is used in the loss calculation. So I put a 0 value on the “I don’t know” category:

learn = unet_learner(data, models.resnet34, metrics=acc_camvid, wd=wd)
# axis=1 is the class dimension of the segmentation output
learn.loss_func = CrossEntropyFlat(axis=1, weight=Tensor([0., 1., 1.]).cuda())
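
For reference, this is what a zero weight does in plain PyTorch (my own sketch, not fastai code): with the default mean reduction, pixels whose label has weight 0 drop out of both the numerator and the denominator of the loss.

import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(2, 3, 4, 4)         # (batch, classes, H, W)
target = torch.randint(0, 3, (2, 4, 4))  # 0 = "idk", 1 = "is not", 2 = "is"

# mean loss = sum(weight[t] * ce_t) / sum(weight[t]),
# so pixels labelled 0 contribute nothing at all
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([0., 1., 1.]))
print(loss_fn(logits, target))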

In my mind, a mistake in any pixel labelled “idk” shouldn’t contribute to the loss value. The reality is the opposite: now the borders are bigger than before I touched the weight parameter. I still get pixels marked as “I don’t know” in the segmentation output, and I have run out of ideas.

So:

  • How can I force the system to ignore the “idk” label during training?
  • How can I force the system to never output the “idk” label in the segmentation?

(Right now, I can’t re-label the images to remove the gaps.)

Thanks

I have continued my research:

CrossEntropyFlat has another option: ignore_index


I tried:

learn.loss_func = CrossEntropyFlat(axis=1, ignore_index=void_code)

But the predictions still contain “I don’t know” labels. :frowning:

@javiermm
ignore_index doesn’t remove the category from your model. It only ignores it while calculating the loss; that is, the category you asked the loss function to ignore contributes nothing to your gradients.
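Something like this plain-PyTorch sketch shows it (my example, not code from fastai):

import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(1, 3, 4, 4)         # (batch, classes, H, W)
target = torch.randint(0, 3, (1, 4, 4))  # 0 is the void/"idk" class

# the loss skips void-labelled pixels entirely...
print(nn.CrossEntropyLoss(ignore_index=0)(logits, target))
# ...but inference is untouched: the argmax can still return the void class
print((logits.argmax(dim=1) == 0).any())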
I will point out the general approach you can use (correct me if this is not the approach you’re looking for). This is what I would have first thought of if I were in your place:
Let the model train as usual, with the ‘idk’ category as well.
When you infer from your model, notice that prediction usually works on softmax probabilities (that is what segmentation uses in this case). So if a model predicts a pixel as category A, the probability of class A is the highest, followed by the other classes with lower probabilities.

learn.predict returns the predicted class as well as the probabilities. So you’ll have to write a small piece of code that changes the predictions as follows:
Wherever the predicted class in the prediction tensors is the “I don’t know” category, replace it with the category that has the second-highest probability. That is, you’re asking the model to tell you the next category it is sure the pixel might belong to. Now use that for viewing images, and you’ll get a prediction with only the “non-idk” categories.
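
A rough vectorised sketch of that idea (the function name and shapes are my assumptions, not fastai API): knock out the void channel before taking the argmax, so the second-best class wins wherever “idk” would have won.

import torch

def drop_void_argmax(probs, void_code=0):
    # probs: (n_classes, H, W) softmax output for one image
    probs = probs.clone()
    probs[void_code] = -1.0     # probabilities are >= 0, so this channel can never win
    return probs.argmax(dim=0)  # per-pixel class index, never equal to void_code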

Now, I’m assuming you’re only doing this for an informal project, not for any formal academic/corporate project, because it’s not the mathematically correct way of doing things. The ideal approach would be to simply get rid of all void categories in your data, replacing all the voids with a proper category. You can’t tell your model to leave some pixels simply uncategorised unless you define a way to not feed those pixels into the model at all. But again, I think that’s not what you’re looking for, because you do want those pixels to have some category that is not the void class.
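
A crude version of that data-level fix (my sketch; a nearest-label fill, e.g. with scipy.ndimage.distance_transform_edt, would be smarter than this) just folds the void pixels into one of the real classes:

import torch

mask = torch.randint(0, 3, (4, 4))  # 0 = void, 1 = "is not", 2 = "is"
mask[mask == 0] = 1                 # relabel every void pixel as "is not"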

Hope this helps. Cheers!

My segmentation runs on whole images, so if I have to look up the second-best category for each pixel… I can’t optimize that enough.

My fourth approach (and a working one): manually modifying the bias parameters in the network’s last layer. Three categories, three bias values. A very large negative bias prevents that category from ever activating.

So:

import copy

learn2 = copy.deepcopy(learn)                 # keep the original learner intact
nbias = learn2.model.layers[-1][0].bias.data  # biases of the final conv layer
nbias[void_code] = -500                       # push the "idk" logit far below the rest
learn2.model.layers[-1][0].bias.data = nbias

I cloned learn so I can compare the model before and after the modification.
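
A quick way to verify the hack (my own check; img stands for any image from the dataset): with the bias at -500, the “idk” logit should never win the per-pixel argmax.

pred = learn2.predict(img)[0]          # ImageSegment holding the predicted mask
assert (pred.data != void_code).all()  # "idk" never appears in the output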