Model that accepts an image and a segmentation-map as inputs

Hi!
I would like to build a model that uses the images from the prostate cancer competition on Kaggle: https://www.kaggle.com/c/prostate-cancer-grade-assessment/data

Here we get a TIFF with the biopsy sample and an image mask. The model should infer the cancer class on the ISUP grade scale.

I have built a custom DataBlock (see this blog post: https://asvcode.github.io/MedicalImaging/medical_imaging/prostate/kaggle/segmentation/2020/06/25/Selective-Mask.html):

def custom_img(fn):
    fn = f'{train}/{fn.image_id}.tiff'
    #print(fn)
    try:
      file = openslide.OpenSlide(str(fn))
    except Exception as e:
      print(e)
      raise
    # read a small thumbnail of the whole-slide image and wrap it as a PILImage
    t = tensor(file.get_thumbnail(size=(255, 255)))
    img_pil = PILImage.create(t)
    return img_pil

def custom_selective_mask(fn):
    fn = f'{mask}/{fn.image_id}_mask.tiff'
    file = openslide.OpenSlide(str(fn))
    # the mask labels are stored in the first channel of the thumbnail
    t = tensor(file.get_thumbnail(size=(255, 255)))[:,:,0]
    # show_selective comes from the blog post linked above
    ts = show_selective(t, min_px=None, max_px=None)
    return ts

blocks = (ImageBlock,
          MaskBlock,
          CategoryBlock)

getters = [
           custom_img,
           custom_selective_mask,
           ColReader('isup_grade')
          ]

dblock = DataBlock(blocks=blocks,
                   getters=getters,
                   splitter=RandomSplitter(),
                   item_tfms=[Resize(224), ToTensor],
                   batch_tfms=[IntToFloatTensor, Normalize.from_stats(*imagenet_stats)])

dl = dblock.dataloaders(train_labels, bs=4)
dl.n_inp = 2
dl.show_batch(max_n=4)

Now I think I have to create a custom model with a forward function that accepts two inputs and stacks them into one tensor, so that the model can use both the image and the mask for learning.

class ProstateCancerModel(Module):
  def __init__(self, encoder, head):
    self.encoder, self.head = encoder, head

  def forward(self, x1, x2):
    ftrs = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
    return self.head(ftrs)

def loss_func(out, targ):
    return CrossEntropyLossFlat()(out, targ.long())

def siamese_splitter(model):
    return [params(model.encoder), params(model.head)]

encoder = create_body(resnet34, cut=-2)
head = create_head(512*2, 2, ps=0.5)
model = ProstateCancerModel(encoder, head)

Each item in the dataloader is now a tuple with the image, the image mask and the label:

dl.train_ds[0]

/content/prostate-cancer-data/train_images/002a4db09dad406c85505a00fb6f6144.tiff
(PILImage mode=RGB size=186x255,
 PILMask mode=L size=186x255,
 TensorCategory(0))

When I try to build the learner and fit for a few epochs, I get an error about a dimension mismatch…

learner = Learner(dl,
                  model,
                  loss_func=loss_func,
                  splitter=siamese_splitter,
                  metrics=accuracy
                  )
learner.freeze()
learner.fit_one_cycle(5)

/content/prostate-cancer-data/train_images/00412139e6b04d1e1cee8421f38f6e90.tiff
/content/prostate-cancer-data/train_images/0005f7aaab2800f6170c399693a96917.tiff
/content/prostate-cancer-data/train_images/0032bfa835ce0f43a92ae0bbab6871cb.tiff
/content/prostate-cancer-data/train_images/000920ad0b612851f8e01bcc880d9b3d.tiff
/content/prostate-cancer-data/train_images/003d4dd6bd61221ebc0bfb9350db333f.tiff
/content/prostate-cancer-data/train_images/001d865e65ef5d2579c190a0e0350d8f.tiff
/content/prostate-cancer-data/train_images/002a4db09dad406c85505a00fb6f6144.tiff
/content/prostate-cancer-data/train_images/004dd32d9cd167d9cc31c13b704498af.tiff
/content/prostate-cancer-data/train_images/004f6b3a66189b4e88b6a01ba19d7d31.tiff
/content/prostate-cancer-data/train_images/001c62abd11fa4b57bf7a6c603a11bb9.tiff
/content/prostate-cancer-data/train_images/0076bcb66e46fb485f5ba432b9a1fe8a.tiff
/content/prostate-cancer-data/train_images/003a91841da04a5a31f808fb5c21538a.tiff
/content/prostate-cancer-data/train_images/004391d48d58b18156f811087cd38abf.tiff
/content/prostate-cancer-data/train_images/006f4d8d3556dd21f6424202c2d294a9.tiff
/content/prostate-cancer-data/train_images/0018ae58b01bdadc8e347995b69f99aa.tiff
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-233-c0f1748717fa> in <module>()
      6                   )
      7 learner.freeze()
----> 8 learner.fit_one_cycle(5)

20 frames
/usr/local/lib/python3.7/dist-packages/torch/tensor.py in __torch_function__(cls, func, types, args, kwargs)
    993 
    994         with _C.DisableTorchFunction():
--> 995             ret = func(*args, **kwargs)
    996             return _convert(ret, cls)
    997 

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [4, 224, 224] instead

So the input provided is a batch of size 4 with images of 224x224 (ImageNet resolution). I think this is correct, but I cannot figure out where this expected input dimension comes from.

A tensor for a 2D color image should have four dimensions when passed to the model:
B x C x H x W, where
B = batch size
C = color channels
H = height
W = width

This is why the model expected 4D input.
The problem is probably the mask. I assume it is a binary mask with dimensions B x H x W. That makes sense when passing it to the loss function, which expects the target to be one dimension smaller than the predictions, but when passing it to a model it leads to problems.

A possible solution could be to add a specific mask transform:

@Transform
def extend_mask(x:TensorMask):
    return x.unsqueeze(1)

This will give you another error about a size mismatch, as you now pass a tensor of size B x 1 x H x W while the model expects B x 3 x H x W. Try adapting the first model layer, or stack the mask three times in the transform above.
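
Here is a minimal sketch of the stacking variant, assuming the transform sees a batched TensorMask of shape B x H x W (e.g. when appended to the batch_tfms list); the extra .float() is there because the pretrained encoder expects float input:

@Transform
def extend_mask(x:TensorMask):
    # B x H x W -> B x 1 x H x W -> B x 3 x H x W: repeat the single mask channel
    return x.unsqueeze(1).repeat(1, 3, 1, 1).float()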


Thanks! I have fixed the problem with the dimensions, but now I ran into another problem.

Fixed model class:

class ProstateCancerModel(Module):
  def __init__(self, encoder, head):
    self.encoder, self.head = encoder, head

  def forward(self, x1, x2):
    enc1 = self.encoder(x1)
    # replicate the single-channel mask to three channels so the pretrained encoder accepts it
    enc2 = self.encoder(
                          torch.stack((x2.unsqueeze(1), x2.unsqueeze(1), x2.unsqueeze(1)), dim=1).squeeze().float()
                        )
    #enc2 = TensorImage(enc2)
    #enc2 = self.encoder(x2)
    print("finished enc1, enc2")
    print(enc1.shape, enc2.shape)
    ftrs = torch.cat([enc1, enc2], dim=1)
    return self.head(ftrs)

def loss_func(out, targ):
    #return CrossEntropyLossFlat()(out, targ.long())
    print("loss function")
    print(out.shape)
    print(targ.shape)
    return CrossEntropyLossFlat()(out, targ.long())

The error occurs on the concatenation of the two inputs:

no implementation found for 'torch.cat' on types that implement __torch_function__: [TensorImage, TensorMask]

Both fastai classes (TensorImage and TensorMask) inherit from the Tensor class, so I do not really understand why I cannot invoke torch.cat?

I kind of solved this by casting both inputs to TensorBase objects.

class ProstateCancerModel(Module):
  def __init__(self, encoder, head):
    self.encoder, self.head = encoder, head

  def forward(self, x1, x2):
    enc1 = self.encoder(x1)
    enc2 =  self.encoder(
                          torch.stack((x2.unsqueeze(1), x2.unsqueeze(1), x2.unsqueeze(1)), dim=1).squeeze().float()
                        )
    #enc2 = TensorImage(enc2)
    #enc2 = self.encoder(x2)
    print("finished enc1, enc2")
    print(enc1.shape, enc2.shape)
    enc1 = TensorBase(enc1)
    enc2 = TensorBase(enc2)
    ftrs = torch.cat([enc1, enc2], dim=1)
    return self.head(ftrs)

But then I ran into a different error:

RuntimeError: CUDA error: device-side assert triggered

I suspect the loss-function is wrong?!

The "CUDA error: device-side assert triggered" message is cryptic. In my experience, it is often caused somewhere in the loss function.
Try running the model on the CPU only. This will give you a more meaningful error message.
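
One quick way to do that (a minimal sketch, reusing the dblock, model, and loss_func from above) is to build the DataLoaders on the CPU and fit for a single epoch, so the real Python exception surfaces instead of the asynchronous CUDA assert:

# build the DataLoaders on the CPU so the stack trace points at the offending line
dls_cpu = dblock.dataloaders(train_labels, bs=4, device=torch.device('cpu'))
learner_cpu = Learner(dls_cpu, model.cpu(), loss_func=loss_func, metrics=accuracy)
learner_cpu.fit_one_cycle(1)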

You could also try to create two ImageBlocks instead of a MaskBlock and an ImageBlock. The model won’t care what Tensor subclass you give it, but handling the DataLoader and loss functions might get easier.
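
A minimal sketch of that DataBlock variant, reusing the getters from your first post (this assumes custom_selective_mask is adjusted to return a PILImage instead of a mask, and uses n_inp=2 to mark the first two blocks as model inputs):

blocks = (ImageBlock, ImageBlock, CategoryBlock)

dblock = DataBlock(blocks=blocks,
                   getters=[custom_img, custom_selective_mask, ColReader('isup_grade')],
                   n_inp=2,  # the first two blocks are inputs, the last one is the target
                   splitter=RandomSplitter(),
                   item_tfms=Resize(224),
                   batch_tfms=[IntToFloatTensor, Normalize.from_stats(*imagenet_stats)])

dls = dblock.dataloaders(train_labels, bs=4)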

I managed to get the model working by changing the input blocks to two ImageBlocks and passing the image mask as an image into the model.

I have then run the notebook on Kaggle on the full dataset, but it performed terribly…

https://www.kaggle.com/wemakeai/prostate-cancer

The CohenKappa score stays at zero and the validation accuracy is getting worse and worse… any ideas on how to improve?
