Improving Segmentation results on my own dataset

I’m a beginner and I’m currently testing how good a result I can get with fastai on my own dataset, because I’d like to see whether I can build some kind of project on this for my diploma.
I’ve got 465 images of crossroads at 450x450, and I manually drew binary masks on them marking where the crossroads are.
There are also a few images without any crossroads, where the mask is empty.
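Since the masks were drawn by hand, one thing worth double-checking is that the saved pixel values are really the class indices (0/1), not 0/255 as some image editors export them. A quick sketch of such a check, assuming PIL and numpy (the helper name is my own):

```python
import numpy as np
from PIL import Image

def mask_values(path):
    """Return the set of distinct pixel values in a mask image.

    For a two-class segmentation the values should be a subset of
    {0, 1}; masks saved as 0/255 will confuse the dataloader.
    """
    return set(np.unique(np.array(Image.open(path))).tolist())
```

If this returns something like {0, 255}, the masks need to be re-encoded before training.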
Here are some pictures of what they look like:

Here is the code I’ve used right now.

from fastai.vision.all import *
import matplotlib.pyplot as plt
import numpy as np
import random
    
number_of_the_seed = 2020
random.seed(number_of_the_seed)
set_seed(number_of_the_seed)

torch.cuda.empty_cache()
path = Path('./crosswalks')

path_msk = path/'masks'
path_img = path/'images'

def get_mask_image(x):
    return f'{path_msk}/{x.stem} copy.png'

imageFileNames = get_image_files(path_img)
maskFileNames = get_image_files(path_msk)

# sanity check: every image should have a mask file that loads
for image in imageFileNames:
    img_f = get_mask_image(image)
    img = load_image(img_f)
  
# order must match the mask pixel values: 0 -> 'Crossroad', 1 -> 'Other'
codes = np.array(['Crossroad','Other'])

dls = SegmentationDataLoaders.from_label_func(path,
                                              bs=6,
                                              splitter=RandomSplitter(valid_pct=0.2,seed=2020),
                                              fnames = imageFileNames,
                                              label_func = get_mask_image,
                                              codes = codes,
                                              batch_tfms=[*aug_transforms(flip_vert=False, do_flip=True, size=200), Normalize.from_stats(*imagenet_stats)]
                                             )

name2id = {v:k for k,v in enumerate(codes)}
void_code = name2id['Other']

def mask_acc(input, target):
    # pixel accuracy, ignoring pixels labelled with the void class ('Other')
    target = target.squeeze(1)
    mask = target != void_code
    a = TensorImage(target[mask])
    return (input.argmax(dim=1)[mask]==a).float().mean()

learn = unet_learner(dls, resnet34, metrics=mask_acc)
# epochs=0: only the 8 frozen epochs run, no unfrozen fine-tuning yet
learn.fine_tune(0, base_lr=7e-4, freeze_epochs=8)
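For clarity, here is what mask_acc computes, restated as a plain-PyTorch sketch on tiny synthetic tensors (the simplified function below is my own restatement, not fastai code): pixel accuracy measured only where the target is not the void class.

```python
import torch

def pixel_acc_ignoring_void(logits, target, void_code=1):
    # logits: (bs, n_classes, h, w); target: (bs, h, w) integer labels
    keep = target != void_code               # drop the void ('Other') pixels
    preds = logits.argmax(dim=1)             # predicted class per pixel
    return (preds[keep] == target[keep]).float().mean()

# 1 image, 2 classes, 2x2 pixels
logits = torch.tensor([[[[2., 2.], [0., 2.]],    # class-0 scores
                        [[0., 0.], [2., 0.]]]])  # class-1 scores
target = torch.tensor([[0, 1], [0, 0]]).unsqueeze(0)
acc = pixel_acc_ignoring_void(logits, target)    # 2 of the 3 kept pixels match
```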

With freeze_epochs=8 I managed to get a best mask accuracy of 70%, which is still pretty terrible. As soon as I start training the rest of the layers (unfreezing), it drops to 40-60% for some reason.

I’ve tried resnet50, which for some reason gave me even worse results.
I’ve also tried increasing the image size incrementally, from 64x64 to 124x124 to 200x200, but it didn’t really help.
Increasing it to 400x400 (the maximum my GPU can handle with resnet34) didn’t do much better either.
Does anyone have suggestions on what else I can try to improve this?
Is it the small image size that’s causing these problems, or my sloppy mask drawing?
Should I try creating more images? Or maybe a different pre-trained network? If so, which one would you suggest?
