UNet segmentation mask converter to help with common errors / problems

Many people seem to have problems with segmentation and the masks they use. Because of that, I wrote some functions that should help anybody overcome these problems. I have not tested them thoroughly, only on one RGB and one grayscale image, where they worked as expected.

You only have to provide the path where the labels / masks are, as well as a path to a folder where you want to save the converted masks. The new files keep the names of the old ones. Converted masks contain only values from 0 to N, where N is the number of classes minus 1.
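To illustrate the idea, here is a tiny standalone numpy sketch of the remapping step, with toy values, independent of the helpers that follow:

```python
import numpy as np

# Toy mask with arbitrary pixel values, e.g. from an 8-bit label image.
mask = np.array([[0, 255],
                 [128, 255]])

# Sorted unique values define the class order: 0 -> 0, 128 -> 1, 255 -> 2.
containedValues = sorted(set(mask.flatten().tolist()))

newMask = np.zeros(mask.shape, dtype=np.uint8)
for i, value in enumerate(containedValues):
    newMask[mask == value] = i

print(newMask.tolist())  # [[0, 2], [1, 2]]
```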

Here is the code:

%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai.vision import *
from fastai.callbacks.hooks import *
import PIL.Image as PilImage

def getClassValues(label_names):
    """Collect every pixel value that occurs across all given masks."""
    containedValues = set()

    for i in range(len(label_names)):
        tmp = open_mask(label_names[i])
        tmp = tmp.data.numpy().flatten()
        containedValues = containedValues.union(set(tmp))

    # Sorted, so the value -> class mapping is deterministic
    # (and the background value 0 stays class 0).
    return sorted(containedValues)

def replaceMaskValuesFromZeroToN(mask, containedValues):
    """Map the original mask values to consecutive labels 0 .. N-1."""
    numberOfClasses = len(containedValues)
    newMask = np.zeros(mask.shape)

    for i in range(numberOfClasses):
        newMask[mask == containedValues[i]] = i
    return newMask

def convertMaskToPilAndSave(mask, saveTo):
    """Save a converted mask as an 8-bit grayscale PNG."""
    imageSize = mask.squeeze().shape

    im = PilImage.new('L', (imageSize[1], imageSize[0]))
    im.putdata(mask.astype(np.uint8).flatten())
    im.save(saveTo, 'PNG')

def convertMasksToGrayscaleZeroToN(pathToLabels, saveToPath):
    """Convert all masks found in pathToLabels and save them to saveToPath."""

    label_names = get_image_files(pathToLabels)
    containedValues = getClassValues(label_names)

    for currentFile in label_names:
        currentMask = open_mask(currentFile).data.numpy()
        convertedMask = replaceMaskValuesFromZeroToN(currentMask, containedValues)
        convertMaskToPilAndSave(convertedMask, saveToPath/f'{currentFile.name}')
    print('Conversion finished!')

Now you only have to use:

convertMasksToGrayscaleZeroToN(pathToLabels, saveToPath)

I did not tune the code for high performance; it simply does its job and will hopefully help a lot of people here.

Any suggestions for adding / changing parts are welcome.


Thank you, could you please share the loss function that you used as well?

You do not need a loss function; it is just a mask value conversion tool.

Thank you, the mask converter is very helpful.


Thank you for sharing your code.
I would like to apply it to the https://www.kaggle.com/c/imaterialist-fashion-2019-FGVC6 dataset.
We have images and a dataset containing EncodedPixels.
See the example below:

Another question:
I would like to combine several masks into one mask and save that to disk. How can I do that?

Thanks for your help.
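Regarding the EncodedPixels part, here is a minimal decoding sketch. It assumes the common Kaggle run-length convention of space-separated, 1-based, column-major "start length" pairs; the function name rleDecode is made up here:

```python
import numpy as np

def rleDecode(encodedPixels, height, width):
    """Decode a run-length 'EncodedPixels' string into a binary mask.

    Assumes the usual Kaggle convention: space-separated
    'start length' pairs, 1-based indices, column-major order.
    """
    mask = np.zeros(height * width, dtype=np.uint8)
    tokens = list(map(int, encodedPixels.split()))
    for start, length in zip(tokens[0::2], tokens[1::2]):
        mask[start - 1:start - 1 + length] = 1
    # 'F' reshapes in column-major (Fortran) order, as the RLE expects.
    return mask.reshape((height, width), order='F')
```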


I found a solution here
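For others with the same question, one possible sketch of merging several binary masks into a single class-labeled mask; the function name combineBinaryMasks is made up, and overlapping pixels are resolved by letting later masks win:

```python
import numpy as np

def combineBinaryMasks(binaryMasks):
    """Combine a list of binary (0/1) masks into one mask with values 0..N.

    binaryMasks[i] marks class i+1; 0 stays the background.
    Where masks overlap, the later mask in the list overwrites earlier ones.
    """
    combined = np.zeros(binaryMasks[0].shape, dtype=np.uint8)
    for classIndex, mask in enumerate(binaryMasks, start=1):
        combined[mask > 0] = classIndex
    return combined
```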



Hey, I tried the code above and found that the mask it produces looks better than the one done by pycocotools, but I still get "RuntimeError: CUDA error: device-side assert triggered" when I run lr_find(learn).
Have you fixed that?

Yes, I am not getting this error. Please make sure that your converted masks contain only values from 0 to (number of classes - 1).

Before I run convertMasksToGrayscaleZeroToN(), should I name the files in some special form? I didn't find any class information in the methods above, right? Or are the masks just the ones generated from the JSON made by labelme?


I'm just picking up on this thread: is this specifically for those who annotate using labelme and generate polygons? I ended up using RectLabel, which got me to PNG masks, although I think their values are between 0 and 255. I was also wondering why the camvid dataset is grayscale when I open it up, but comes up coloured when I load it in the notebook?

I have been trying to figure out why my masks look like this

But when I load them into my notebook, the colours are different?

Masks are usually named with a suffix like "_P". In the notebook, the lambda function get_y_fn handles it. Masks generated from labelme will not be named differently; you need to handle that yourself.

You must change the color map; then it can be displayed correctly. You can also use set() to check the entries of your masks and see if the conversion did a correct job. As soon as the values in the masks are not 0 ... N, it won't work correctly. Loading normalizes the value range in a way we do not want.
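For example, the set() check could look like this (the helper name maskValues is made up):

```python
import numpy as np
import PIL.Image as PilImage

def maskValues(path):
    """Return the set of raw pixel values stored in a mask file."""
    return set(np.array(PilImage.open(path)).flatten().tolist())

# After conversion, a mask with N classes should yield {0, 1, ..., N-1},
# e.g. {0, 1} for a binary object/background mask.
```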

OK, so I tried running this, @ptrampert, using the code above. When I open up the finished conversion, everything looks black? This is sticking with the same image from above, with the black background and turquoise mask.

Edit: worth stating that I used RectLabel to generate the original binary mask PNG.
I'm wondering if I have done something wrong during labelling here; the mask has clearly been created, as it can be seen in the PNGs from the earlier post.

Use e.g. ImageJ / Fiji to look into the masks and adapt the displayed pixel range to 0 ... N. Otherwise, most tools will show the whole possible value range, which for 8 bit means that (in a binary case) 0 and 1 both appear black. When you load the mask in Python and use set(), it should have the values 0 ... N.

OK, so in this case, given this is a binary segmentation task (object, background), I need to adapt the displayed pixel range from 0 to 1. I'm sorry if I have misunderstood; I thought that was what your script did? I will look at using ImageJ/Fiji to do this this afternoon! Thank you for the help, @ptrampert.

You only have to adapt the displayed value range. My function handles the conversion correctly, so the saved data is what you need. However, since a program generally does not know which pixel values an image contains, it will try to show the whole bit range. For the data processing, everything is as it should be after applying the function.

OK, I will try to adapt the displayed value range using ImageJ today and report back! Hopefully this will be useful for others, as this is a bit of a logistical headache.

OK, so a couple of interesting findings, @ptrampert.

I used ImageJ to convert the masks to binary, which appears to have worked (when I hover over the regions I get either "0" or "1" where it was "0" or "255"). If I save as PNG and then re-open, the values appear to be back at 0 and 255; if I save as .tiff and re-open, they seem to remain at 0 and 1. However, when I run the TIFFs through the script, the output is still completely black. If I open these completely black images in ImageJ and move the cursor to the masked area, it does show "1" vs. "0" where the background should be. Is this expected behaviour, or should I be seeing the masks as greyscale?

There is one important difference between (i) which values an image contains, and (ii) which values, or rather which value range, is displayed.

In ImageJ, the images are saved depending on the selected display value range. That means, if the displayed range is 0 to 1 and you then save the image, it will transform the minimum and maximum values to 0 and 255.

Hence, you can only use a restricted displayed range for viewing, not for saving. For an 8-bit image you have to choose the range 0 to 255 for saving, even if you then see a black image for binary 0 and 1 values. For similar reasons, all these problems arise in Python when using more than two values in an image.

Hope that clarifies the issue.
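In Python, the same distinction can be made explicit by scaling a copy of the mask for display only, never for saving; a minimal sketch:

```python
import numpy as np

# A correctly converted binary mask contains only 0 and 1, so most
# viewers render it as all black. Scale a copy for display only; the
# stored data must stay 0/1.
mask = np.array([[0, 1], [1, 0]], dtype=np.uint8)
displayOnly = mask * 255  # bright enough to see, never saved back

assert set(mask.flatten().tolist()) == {0, 1}          # data unchanged
assert set(displayOnly.flatten().tolist()) == {0, 255}  # view only
```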

OK, this does clarify a lot (thank you) in terms of the reasoning behind the ImageJ / Python handling of the task.

I am still slightly unsure whether the mask images I have generated are in the right format to use for image segmentation, and how I can get what I have into the right format. Are the 0/1 TIFFs, which your script processes into PNGs that appear to the naked eye as flat black images (but seem to still contain the 0/1s), ready to use as a camvid-style dataset?

I do apologise if these questions are very naive; this is my first time creating a dataset from scratch, @ptrampert, and I did not anticipate so many issues in preprocessing the data before actually getting ready to train!