Some PNG label images are saved in mode ‘P’, so they also need to be opened in mode ‘P’ rather than ‘L’, which is the default behavior of MaskBlock or PILMask.create. The problem is that when a mask is opened in ‘L’ mode, the labels do not fall in the range 0 to n_classes - 1 that fastai expects. For example, when you open a PASCAL VOC 2012 label with PILMask.create:
fn = "/home/ryf_stu01/fastai/downloads/VOC2012/VOCdevkit/VOC2012/SegmentationClass/2010_001951.png"
np.unique(PILMask.create(fn))
you get:
array([ 0, 147, 150, 220], dtype=uint8)
which is no good as a label for segmentation with 21 classes.
One short way to fix it is:
class MyPILMask(PILBase): _open_args,_show_args = {'mode':'P'},{'alpha':0.5, 'cmap':'tab20'}
Then run the code again with the new class and you will get:
[in] np.unique(MyPILMask.create(fn))
[out]: array([ 0, 15, 19, 255], dtype=uint8)
This time the results look as expected, because VOC 2012 has 21 classes and 255 represents void (or empty).
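To see where the two sets of numbers come from, here is a minimal sketch using only Pillow (no fastai). It builds a tiny palette image in memory and opens it both ways. The 2x2 image, the palette colours and the helper unique_values are made up for illustration, and I am assuming fastai ends up calling Pillow's convert with the mode taken from _open_args:

```python
import io
from PIL import Image

# A 2x2 palette ('P') image whose raw pixel values are class indices.
mask = Image.frombytes("P", (2, 2), bytes([0, 15, 19, 255]))

# Illustrative palette (an assumption, not read from a real VOC file):
# every index is black except the three we colour here.
palette = [0, 0, 0] * 256
palette[15 * 3:15 * 3 + 3] = [192, 128, 128]
palette[19 * 3:19 * 3 + 3] = [128, 192, 0]
palette[255 * 3:255 * 3 + 3] = [224, 224, 192]
mask.putpalette(palette)

# Round-trip through PNG so we open it the way a label file would be opened.
buf = io.BytesIO()
mask.save(buf, format="PNG")
png_bytes = buf.getvalue()

def unique_values(data, mode):
    """Open PNG bytes, convert to `mode`, return the sorted unique pixel values."""
    img = Image.open(io.BytesIO(data)).convert(mode)
    return sorted(set(img.tobytes()))

as_L = unique_values(png_bytes, "L")  # grey level of each palette colour
as_P = unique_values(png_bytes, "P")  # the palette indices themselves

print("mode L:", as_L)
print("mode P:", as_P)
```

In mode ‘P’ you get the stored indices back, while in mode ‘L’ each pixel is first looked up in the palette and then turned into a grey level, so the numbers depend on the palette colours and no longer say anything about the class.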
Then, when creating the DataBlock, just do this:
PILMask._open_args = {'mode':'P'}
voc2012 = DataBlock(blocks = (ImageBlock, MaskBlock(codes=codes)),
                    get_items = get_trainval_fanme,
                    get_y = get_label,
                    item_tfms = Resize(224),
                    batch_tfms = aug_transforms())
This works because MaskBlock relies on PILMask.create, and PILMask.create itself relies on _open_args, which contains the open mode. So we change it before it is accessed. Hope this helps.
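To illustrate that last point, here is a rough sketch in plain Python of why changing the class attribute before calling create is enough. Base and Mask are hypothetical stand-ins, not the fastai classes, and create here only reports the mode it would use instead of opening a file:

```python
class Base:
    # Default open arguments, stored on the class itself.
    _open_args = {"mode": "RGB"}

    @classmethod
    def create(cls, fn):
        # The attribute is looked up at call time, not at definition time,
        # so whatever is stored on the class right now is what gets used.
        return (fn, cls._open_args["mode"])

class Mask(Base):
    # Subclass overrides the default, the way a mask class would.
    _open_args = {"mode": "L"}

before = Mask.create("label.png")

# Rebind the class attribute before anything calls create again.
Mask._open_args = {"mode": "P"}
after = Mask.create("label.png")

print(before, after)
```

Keep in mind that patching the attribute like this changes it for every later call in the same session, while the subclass approach leaves the original class untouched.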