Image segmentation resizing

Hello all,

I am solving an image segmentation task. I start with images of, let's say, 1024 x 1024, and their corresponding ground-truth segmentation masks, also 1024 x 1024. When I pass batch_tfms=[*aug_transforms(size=np.array([size,size]))], where size could be 64, 128, 2048, etc., which resizing algorithms are applied to the input image and to the corresponding ground-truth segmentation mask? Are they the same?
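
For context, my setup looks roughly like this (the paths, class codes, and mask-file layout below are placeholders, not my real data):

```python
from fastai.vision.all import *

path = Path('data')                # hypothetical dataset root
codes = ['background', 'object']   # hypothetical class names

def label_func(fn):
    "Map an image file to its mask file (assumed folder layout)."
    return path/'masks'/fn.name

dblock = DataBlock(
    blocks=(ImageBlock, MaskBlock(codes)),
    get_items=get_image_files,
    get_y=label_func,
    batch_tfms=[*aug_transforms(size=128)],  # the transforms in question
)
dls = dblock.dataloaders(path/'images', bs=8)
```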

Thank you in advance.

As Jeremy always says, “Try it and see”

But I do believe that if you are using the segmentation data block, they are both resized to the same size.

Hello Edward,

Thank you for your answer. I understand that they are both resized to the same size. But which algorithms are applied in this procedure? And is the resulting mapping (image pixel to label pixel) still valid? Are there any resources on this?

I believe these are the defaults.
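
If it helps, a quick way to check (assuming fastai v2) is to inspect the transforms that aug_transforms returns; if I read the source correctly, the affine/coord transforms carry a separate interpolation mode for masks (mode_mask, defaulting to 'nearest') alongside the one for images (mode, defaulting to 'bilinear'):

```python
from fastai.vision.all import aug_transforms

# Print each transform with its interpolation modes. AffineCoordTfm-based
# transforms store `mode` (images) and `mode_mask` (masks); lighting
# transforms have neither, hence the getattr defaults.
for t in aug_transforms(size=128):
    print(type(t).__name__,
          'mode =', getattr(t, 'mode', None),
          'mode_mask =', getattr(t, 'mode_mask', None))
```

So, if that reading is right, the image is interpolated bilinearly while the mask is resized with nearest neighbour, which means no new label values are created.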

In my experience, the transforms are applied appropriately to make segmentation work if you are using the unet_learner.

Maybe this example will help?

I think I understood the question, which can be reformulated as: “do the applied transformations introduce artifacts in the masks due to interpolation, the way they do in RGB images?” Regarding resizing, I am almost sure it does not introduce artifacts; regarding the other transformations, I did not check, but the masks remain valid, so at least the number of classes stays the same. In any case, artifacts would mostly appear at the edges and would likely be small.
You may check the source code to see which algorithm is used specifically on masks.
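
As a quick illustration of why masks need a different interpolation mode than images, here is a small sketch in plain PyTorch (not fastai itself; the sizes are arbitrary):

```python
import torch
import torch.nn.functional as F

# Toy binary mask: top half is class 0, bottom half is class 1.
mask = torch.zeros(1, 1, 8, 8)
mask[..., 4:, :] = 1.0

# Nearest-neighbour resizing only picks existing pixels,
# so the label set is preserved exactly.
near = F.interpolate(mask, size=(5, 5), mode='nearest')
print(near.unique())   # tensor([0., 1.])

# Bilinear resizing blends neighbouring pixels, creating fractional
# values at class boundaries -- invalid as class labels.
bilin = F.interpolate(mask, size=(5, 5), mode='bilinear', align_corners=False)
print(bilin.unique())  # tensor([0.0000, 0.5000, 1.0000])
```

This is why, as far as I can tell, frameworks resize masks with nearest neighbour while images get bilinear interpolation.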