Preprocess labels for image segmentation

Hey all,

I am doing image segmentation where the ground truth masks are JPG images, and the mask is red. I have only 2 classes. Loading this mask as-is gives me a CUDA error, which Google informs me is likely because the values in my tensor are larger than my number of classes (the tensor has a max of 85).

How can I process the mask to be only 0 or 1?

I am building my databunch like this:

src = (SegmentationItemList.from_folder(img_path)
   .filter_by_func(filter_HE)
   .split_by_folder(valid='test')
   .label_from_func(get_y_fn, classes=codes)
  )

data = (src.transform(get_transforms(), size=size, tfm_y=True)
    .databunch(bs=bs)
    .normalize(imagenet_stats))

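The masks are opened by the label list rather than the item list, so you can create a custom label list where you override the open method, and a custom item list that uses it:

from fastai.vision import SegmentationItemList, SegmentationLabelList
from fastai.vision.image import open_mask, ImageSegment

class BinaryLabelList(SegmentationLabelList):
    def open(self, fn):
        # Read the mask as a single grayscale channel, keeping the raw 0-255 values.
        x = open_mask(fn, div=False).data
        # Threshold at half the maximum so the mask holds only the class indices 0 and 1.
        return ImageSegment((x > x.max() / 2).long())

class CustomSegmentationList(SegmentationItemList):
    _label_cls = BinaryLabelList  # label masks are opened with the class above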

This is only a proposal, you should make it match your problem, but this is probably what you want to do.
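A mask for two classes should hold only the values 0 and 1. As a fastai-independent sketch of that step, the NumPy snippet below assumes the mask has already been decoded into an H x W x 3 RGB array with values from 0 to 255. The function name `binarize_red_mask` and the default threshold of half the observed maximum are illustrative choices, not part of fastai. Thresholding, rather than comparing against one exact value, matters for JPG masks because lossy compression leaves intermediate values around the edges.

```python
import numpy as np

def binarize_red_mask(rgb, threshold=None):
    """Return an H x W int64 array of 0/1 from an H x W x 3 RGB mask."""
    rgb = np.asarray(rgb)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 RGB array")
    # Only the red channel carries the mask in this assumed setup.
    red = rgb[..., 0].astype(np.float32)
    if threshold is None:
        # Assumed default: half of the brightest red value in this mask.
        threshold = red.max() / 2.0
    return (red > threshold).astype(np.int64)

# Tiny synthetic mask: two "red" pixels (one with JPG-style noise) and one faint artifact.
demo = np.zeros((2, 3, 3), dtype=np.uint8)
demo[0, 0] = (255, 0, 0)
demo[0, 1] = (250, 4, 3)
demo[1, 2] = (6, 0, 0)
binary = binarize_red_mask(demo)
```

The same comparison works on a torch tensor inside a custom open method, as long as the result is cast to an integer type before it is wrapped as a mask.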


Hi,

I want to train a segmentation model on my own data. I have created a csv file which has two columns: the first is the path to the original image and the second is the path to the mask image.

I have tried to create my own data generator, but that doesn’t work. So I tried SegmentationItemList.from_df, and I am getting an error, possibly because the column holds the path as a str rather than an image. Can you help me?

Can you give me a sample of your code, what your csv looks like, and the error stack trace? It is a bit hard to diagnose your problem without those.

So this is one of the ways I am trying to implement a custom data loader:

class NumbersDataset():
    def __init__(self, inputs, labels):
        self.X = inputs
        self.y = labels

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        img_train = cv2.imread(self.X[idx])
        img_mask = cv2.imread(self.y[idx])
        img_train = cv2.resize(img_train, (224,224), interpolation = cv2.INTER_LANCZOS4) 
        img_mask = cv2.resize(img_mask, (224,224), interpolation = cv2.INTER_LANCZOS4) 
        return img_train, img_mask

X = list(df['input_img'])  # This contains the location of the input image 
y = list(df['mask_img'])  # This contains the location of the mask image

X_train, X_valid, y_train, y_valid = train_test_split(
     X, y, test_size=0.33, random_state=42)

dataset_train = NumbersDataset(X_train, y_train)
dataloader_train = DataLoader(dataset_train, batch_size=4, shuffle=True, num_workers=2)

dataset_valid = NumbersDataset(X_valid, y_valid)
dataloader_valid = DataLoader(dataset_valid, batch_size=4, shuffle=True, num_workers=2)

data = DataBunch(train_dl = dataloader_train, valid_dl = dataloader_valid)

learner = unet_learner(data = data, arch = models.resnet34)

And I end up getting the error:

AttributeError: 'NumbersDataset' object has no attribute 'c'

which, as far as I could find, refers to the number of classes. I don’t know how to get around this.
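The error message shows that the lookup for c ends up on the dataset object, so a hand-written dataset has to provide that attribute itself. Below is a minimal sketch of the idea in plain Python. The class name `PairedPathDataset` and the injectable `loader` argument are hypothetical, and setting `c` only addresses this particular AttributeError; `unet_learner` may well expect more from the data than this.

```python
class PairedPathDataset:
    """Pairs each image path with its mask path and exposes the class count as `c`."""

    def __init__(self, image_paths, mask_paths, n_classes, loader):
        if len(image_paths) != len(mask_paths):
            raise ValueError("need exactly one mask path per image path")
        self.X = list(image_paths)
        self.y = list(mask_paths)
        self.c = n_classes      # the attribute the error message says is missing
        self.loader = loader    # hypothetical callable: path -> array or tensor

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.loader(self.X[idx]), self.loader(self.y[idx])


# Stand-in loader so the sketch runs without any image files.
dataset = PairedPathDataset(["a.jpg", "b.jpg"], ["a.png", "b.png"],
                            n_classes=2, loader=str.upper)
```

In real use the loader would read the file and return a tensor, and the class would normally derive from `torch.utils.data.Dataset`.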

I have different folders for human segmentation images, like one folder for dancing or cycling. So the images are in different folders and that’s why I want to create my own data loader.
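One detail in the loader above is worth a note: resizing a mask with `cv2.INTER_LANCZOS4` blends neighbouring pixels, so the resized mask can contain values that are not valid class indices. Nearest-neighbour resizing (`cv2.INTER_NEAREST` in OpenCV) only copies existing values. The NumPy sketch below illustrates that property; `resize_nearest` is a hypothetical helper written for this example, not an OpenCV or fastai function.

```python
import numpy as np

def resize_nearest(mask, out_h, out_w):
    """Resize a label mask by copying source pixels, never blending them."""
    mask = np.asarray(mask)
    in_h, in_w = mask.shape[:2]
    # For every output row/column, pick the source row/column it falls into.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return mask[rows[:, None], cols]

# A 2 x 2 mask with four class indices, upscaled to 4 x 4.
mask = np.array([[0, 1],
                 [2, 3]])
resized = resize_nearest(mask, 4, 4)
```

Every value in `resized` is one of the original class indices, which is what a segmentation loss needs.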

Hi, I am doing image segmentation to detect floors in a perspective image of a building. I have 512x512 .jpg images and their corresponding masks as .png files. I have 17 classes, one for each floor number, including the background. I generated my masks from the terminal using Labelme (http://labelme.csail.mit.edu/Release3.0/) and got .json files with their corresponding .png files.

After reducing batch_size, setting CrossEntropyFlat(axis=1) and many other steps, I still get the error:

Expected input batch_size (9216) to match target batch_size (393216)
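For reference, a flattened segmentation loss compares one prediction per pixel with one label per pixel, so the two numbers in this message are element counts rather than the batch size passed to the data loader. The sketch below reproduces that bookkeeping under the assumption that the logits have shape (N, C, H, W) with the class axis at position 1. The function name `flattened_sizes` and the example shapes are made up for illustration and are not taken from the failing run.

```python
import math

def flattened_sizes(logits_shape, target_shape, axis=1):
    """Return (input rows, target rows) as a flattened cross-entropy would see them."""
    n_classes = logits_shape[axis]
    # The class axis becomes the columns; everything else becomes rows.
    input_rows = math.prod(logits_shape) // n_classes
    # The target is flattened completely, channels included.
    target_rows = math.prod(target_shape)
    return input_rows, target_rows

# Matching case: 2 images, 17 classes, 8 x 8 outputs, single-channel 8 x 8 masks.
ok = flattened_sizes((2, 17, 8, 8), (2, 1, 8, 8))
# Mismatch: the masks carry 3 channels instead of 1.
bad = flattened_sizes((2, 17, 8, 8), (2, 3, 8, 8))
```

If the two numbers differ, comparing the spatial size and channel count of the mask batch with those of the model output is a reasonable first check.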

I think the problem is in how the mask is read into an array, because inspecting it only gives me this:

src_size = np.array(mask.shape[1:])

src_size,mask.data

(array([512, 512]), tensor([[[0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      ...,
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0],
      [0, 0, 0,  ..., 0, 0, 0]]]))

In fact, I should be getting an array with values from 1 to 17.
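Note that the printout above is abbreviated (the `...` entries), so zeros in the corners do not by themselves show that the whole mask is empty. A quick way to check is to list the distinct values in the mask. The sketch below assumes the mask has been converted to a NumPy array (for a fastai mask, something like `mask.data.numpy()`); `summarize_mask` is a hypothetical helper, and the demo values are synthetic.

```python
import numpy as np

def summarize_mask(mask, n_classes):
    """Count each distinct value and list the ones that are not valid class indices."""
    values, counts = np.unique(np.asarray(mask), return_counts=True)
    histogram = {int(v): int(c) for v, c in zip(values, counts)}
    out_of_range = [v for v in histogram if v < 0 or v >= n_classes]
    return histogram, out_of_range

# Synthetic 4 x 4 mask: background, one valid class, and one stray value.
demo = np.zeros((4, 4), dtype=np.int64)
demo[1:3, 1:3] = 5
demo[3, 3] = 38
histogram, out_of_range = summarize_mask(demo, n_classes=17)
```

Values outside the expected range usually point at how the file was decoded (for example a colour or palette image read as grayscale) rather than at the annotations themselves.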
Has anyone had to deal with going from Labelme to fastai before? I would highly appreciate the help.

@daveluo @jeremy