TL;DR: I have a label in my training set that I don’t want to appear in the automatic segmentation.
Hi, I have a bunch of images showing objects of the same kind with blurred edges. Think clouds, smoke or whatever. Humans labelled the images by marking what is the object and what is not the object. But the labelled images have a gap between the two categories. So I created a third, fake category, “I don’t know”, and filled the gaps in the labelled images with it.
The model learned to mark objects, non-objects and the borders. Dammit. It took my labels too literally.
What I want is for the model to learn only from pixels marked as “is” or “is not”, and for the segmentation output to contain only those two categories, never “I don’t know”. If the training is good, the system will tell me where the frontier is.
What have I done so far?
I copied from the Internet a metric that scores images without counting the “I don’t know” category:
void_code = 0  # integer index of the "I don't know" category
def acc_camvid(input, target):
    # Pixel accuracy computed only over pixels whose label is not void_code
    target = target.squeeze(1)
    mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()
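To make clear what this metric measures, here is a minimal sketch in plain PyTorch with a toy 2x2 image. The name `acc_ignoring_void`, the toy tensors, and the assumption that the “I don’t know” category has integer index 0 are illustrative only, not taken from my real data.

```python
import torch
import torch.nn.functional as F

VOID_CODE = 0  # assumed integer index of the "I don't know" category

def acc_ignoring_void(logits, target):
    """Pixel accuracy over non-void pixels only.

    logits: (N, C, H, W) raw scores; target: (N, 1, H, W) integer labels.
    """
    target = target.squeeze(1)
    mask = target != VOID_CODE
    return (logits.argmax(dim=1)[mask] == target[mask]).float().mean()

# Toy example: one 2x2 image with 3 classes.
target = torch.tensor([[[[0, 1], [2, 1]]]])        # shape (1, 1, 2, 2)
pred_classes = torch.tensor([[[0, 1], [2, 2]]])    # shape (1, 2, 2)
# Turn the predicted classes into one-hot "logits" of shape (1, 3, 2, 2).
logits = F.one_hot(pred_classes, num_classes=3).permute(0, 3, 1, 2).float()

# The void pixel (top-left) is skipped; 2 of the 3 remaining pixels match.
acc = acc_ignoring_void(logits, target)
```

The metric only changes what gets reported; it has no effect on what the model learns.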
I know that a metric isn’t used to train the model; the loss function is used for that. But I am not sure how to use it. Diving into the source code, I found a parameter of CrossEntropyFlat (the loss function used in segmentation) called weight, which is used in the loss calculation. So I set the weight of the “I don’t know” category to 0:
learn = unet_learner(data, models.resnet34, metrics=metrics, wd=wd)
learn.lossfunc = CrossEntropyFlat(weight = Tensor([0., 1., 1.]).cuda())
In my mind, a mistake on any pixel marked as “idk” should no longer contribute to the loss value. The reality is the opposite: the borders are now bigger than before I touched the weight parameter, and I still have pixels marked as “I don’t know” in the segmentation output. And I have run out of ideas.
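For reference, here is a minimal sketch, in plain PyTorch rather than fastai, of the behaviour I expected: void pixels contribute nothing to the loss, and the prediction step never returns the void class. The helper names (`weighted_ce`, `masked_ce`, `predict_no_void`) are hypothetical, and void is assumed to be class 0. One thing that may be worth checking separately: as far as I can tell, fastai v1’s Learner reads its loss from an attribute named `loss_func`, so an assignment to a differently spelled attribute might be silently ignored.

```python
import torch
import torch.nn.functional as F

VOID_CODE = 0  # assumed integer index of the "I don't know" category

def weighted_ce(logits, target):
    # Zero class weight for void: void-labelled pixels add nothing to the loss.
    weight = torch.tensor([0.0, 1.0, 1.0], device=logits.device)
    return F.cross_entropy(logits, target.squeeze(1), weight=weight)

def masked_ce(logits, target):
    # Same idea via ignore_index: void-labelled pixels are skipped entirely.
    return F.cross_entropy(logits, target.squeeze(1), ignore_index=VOID_CODE)

def predict_no_void(logits):
    # Take the argmax over the non-void channels only, then map the
    # channel positions back to the original class ids.
    keep = [c for c in range(logits.shape[1]) if c != VOID_CODE]
    class_ids = torch.tensor(keep, device=logits.device)
    return class_ids[logits[:, keep].argmax(dim=1)]

torch.manual_seed(0)
logits = torch.randn(1, 3, 2, 2)                  # (N, C, H, W)
target = torch.tensor([[[[0, 1], [2, 1]]]])       # (N, 1, H, W), top-left is void

loss_a = masked_ce(logits, target)
perturbed = logits.clone()
perturbed[:, :, 0, 0] += 100.0                    # change scores at the void pixel only
loss_b = masked_ce(perturbed, target)             # should equal loss_a

pred = predict_no_void(perturbed)                 # contains only classes 1 and 2
```

With either loss, nothing pushes the model toward any particular class on the void pixels, so restricting the argmax to the real classes is what guarantees a two-category output.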
So:
- How can I force the system to ignore the “idk” label when training?
- How can I force the system not to output the “idk” label in the segmentation?
(At this time, I can’t re-label the images to avoid the gaps)
Thanks