DataBlock summary is inconsistent about which images it uses for the final sample

This is an image segmentation problem. Sometimes the summary prints a final sample that is a tuple of the correct input image and its mask. Other times it uses the same mask file for both elements, so the tuple contains the mask twice: first as a PILImage, then as a PILMask.

I am guessing that there is a problem with how the DataBlock collects inputs and targets.

My directory structure is:

  • the main directory ‘photos’, holding all folders with pictures and masks
  • directories v1 to v8, each with an image and its mask
  • directories ‘image’ and ‘mask’ in every vX, holding the corresponding data.

My DataBlock code:

field = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                get_items=get_image_files,
                splitter=RandomSplitter(),
                get_y=get_msk,
                batch_tfms=[*aug_transforms(size=quarter)])

where codes is np.loadtxt(str(train_path)+'codes.txt', dtype='str'), whose values are None or Field, and train_path is ‘photos/’.
get_msk is lambda o: train_path+'{}/mask/mask.tif'.format(o.parts[1])

Correct formation of final sample:

Building one sample
  Pipeline: PILBase.create
    starting from
      photos/v4/image/S1_VV_D10_AD_median_2019_08_11_visas_v4.tif
    applying PILBase.create gives
      PILImage mode=RGB size=4000x4000
  Pipeline: <lambda> -> PILBase.create
    starting from
      photos/v4/image/S1_VV_D10_AD_median_2019_08_11_visas_v4.tif
    applying <lambda> gives
      photos/v4/masks/mask.tif
    applying PILBase.create gives
      PILMask mode=L size=4000x4000

Final sample: (PILImage mode=RGB size=4000x4000, PILMask mode=L size=4000x4000)

Incorrect formation of the final sample:

Building one sample
  Pipeline: PILBase.create
    starting from
      photos/v4/masks/mask.tif
    applying PILBase.create gives
      PILImage mode=RGB size=4000x4000
  Pipeline: <lambda> -> PILBase.create
    starting from
      photos/v4/masks/mask.tif
    applying <lambda> gives
      photos/v4/masks/mask.tif
    applying PILBase.create gives
      PILMask mode=L size=4000x4000

Final sample: (PILImage mode=RGB size=4000x4000, PILMask mode=L size=4000x4000)
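
One possible explanation, judging by the incorrect run above, which starts from photos/v4/masks/mask.tif: get_image_files searches folders recursively by default, so the mask files end up among the items and can be picked as inputs. Since RandomSplitter shuffles the items, the first sample would then sometimes be an image and sometimes a mask. Below is a minimal sketch of a workaround in plain Python, assuming the layout described earlier. The names get_input_images, IMAGE_EXTS and MASK_DIR are hypothetical and not part of fastai, and the extension set is an assumption.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if the data uses others.
IMAGE_EXTS = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}

# The directory listing says 'mask', while the summary output prints 'masks';
# set this to whichever name actually exists on disk.
MASK_DIR = "mask"


def get_input_images(path):
    """Return only the image files that sit directly inside a folder named 'image'.

    Files in the mask folders are skipped, so they can never become inputs.
    """
    return sorted(
        p
        for p in Path(path).rglob("*")
        if p.is_file()
        and p.suffix.lower() in IMAGE_EXTS
        and p.parent.name == "image"
    )


def get_msk(o):
    """Map <root>/vX/image/<name>.tif to <root>/vX/<MASK_DIR>/mask.tif."""
    return Path(o).parent.parent / MASK_DIR / "mask.tif"
```

If this matches the real cause, passing get_items=get_input_images instead of get_items=get_image_files in the DataBlock should make every input come from an ‘image’ folder. Printing the list returned by get_image_files('photos/') is a quick way to check whether mask.tif files are in it.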

When creating the dataloaders I set the batch size to the total number of training samples, because I want to see all of them when calling show_batch(). The method shows at most one correct input and mask pair. The other masks are blank, or no correct pairs are shown at all.

I created a PILMask from each Path() to check whether the data itself was the problem. It was not: all masks were displayed correctly.

I would appreciate some help with this issue. Please feel free to ask for more information if necessary.