SIIM-ACR Pneumothorax Segmentation - Image Segmentation List

Hi! I am trying to solve the "SIIM-ACR Pneumothorax Segmentation" competition on Kaggle. There you get DICOM medical images together with run-length encoded segmentation masks.

I have created png images from the pixel arrays of the DICOM images and would now like to create my data objects. I am trying an approach analogous to the lesson3-camvid notebook:

I have stored the rle-masks in a dataframe:

-1 means there are no annotated pixels in this image.
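For illustration, such a dataframe might look like this (the column names `ImageId` and `EncodedPixels` follow the competition csv; the ids and rle strings below are made up):

```python
import pandas as pd

# Made-up rows in the style of the competition csv:
# one row per image, '-1' when no pixels are annotated.
df = pd.DataFrame({
    'ImageId': ['img_001', 'img_002', 'img_003'],
    'EncodedPixels': ['387620 23 388614 33', '-1', '12 5 100 7'],
})

# Count the images that carry no mask at all
n_empty = (df['EncodedPixels'].str.strip() == '-1').sum()
print(n_empty)  # → 1
```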

Now I would like to create the SegmentationItemList:

src = (SegmentationItemList.from_df(df, path_jpgs, convert_mode='grey'))

This works fine. But creating my data objects returns an error:

data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))

AttributeError                            Traceback (most recent call last)
<ipython-input-12-49a6d5cb5840> in <module>
----> 1 data = (src.transform(get_transforms(), size=size, tfm_y=True)
      2         .databunch(bs=bs)
      3         .normalize(imagenet_stats))           

AttributeError: 'SegmentationItemList' object has no attribute 'transform'

What am I doing wrong?

I may be wrong, but it looks like you're missing the 3rd data block API step: how to label the inputs. I think it should be something like .label_from_func(get_y_fn, classes=codes).
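Put together, the pipeline would then look roughly like this (a sketch against the fastai v1 data block API; `get_y_fn`, `codes`, `path_masks`, `path_jpgs`, `size` and `bs` are placeholders you would define yourself):

```python
codes = ['background', 'pneumothorax']            # assumed class names
get_y_fn = lambda x: path_masks/f'{x.stem}.png'   # hypothetical image -> mask mapping

src = (SegmentationItemList.from_df(df, path_jpgs, convert_mode='grey')
       .split_by_rand_pct(0.2)                    # step 2: validation split
       .label_from_func(get_y_fn, classes=codes)) # step 3: label the inputs

data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))
```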

Could you possibly share fastai starter for the competition on kaggle?


You need to split this into training and validation sets and, I think, add labels before you can transform.

If you type src. and then press Tab, autocomplete should show you the available methods.


If you have further errors you can check my fastai starter code:

I wonder if there is a better way of incorporating the test set. I was getting an error due to tfm_y=True on the test set (which has no masks). As a workaround, I changed the transform function itself to set tfm_y=False for the test set.

# Setting transformations on masks to False for the test set.
# (fastai v1; the star import provides Optional, Tuple, TfmList and is_listy)
from fastai.vision import *
import fastai.data_block

def transform(self, tfms:Optional[Tuple[TfmList,TfmList]]=(None,None), **kwargs):
    if not tfms: tfms = (None,None)
    assert is_listy(tfms) and len(tfms) == 2
    self.train.transform(tfms[0], **kwargs)
    self.valid.transform(tfms[1], **kwargs)
    kwargs['tfm_y'] = False  # test data has no labels
    if self.test: self.test.transform(tfms[1], **kwargs)
    return self

fastai.data_block.ItemLists.transform = transform

Thank you for the starter code :smiley:

@mnpinto Thanks for sharing your kernel!

I think I have difficulties understanding how to work with rle-encoded masks. Is there a way to directly use the rle encodings from the csv file, or do I have to generate the mask jpgs anyway?

When or how do I use the open_mask_rle() function?


Initially I tried to use the rle encodings directly, but I didn't find an easy way of making it work, so I ended up converting the masks to png (avoid jpgs, as the compression can lead to problems). I used the rle2mask helper provided with the competition data. I shared the code I used in the comments of the kernel I linked above.
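For anyone wanting to roll their own conversion, a minimal decoder in the spirit of the competition's rle2mask helper might look like this (the relative-position convention and the final mask orientation are assumptions to verify against the official helper):

```python
import numpy as np

def rle2mask(rle, width, height):
    """Decode a relative-position RLE string into a binary mask."""
    mask = np.zeros(width * height, dtype=np.uint8)
    if rle.strip() == '-1':            # image without annotations
        return mask.reshape(width, height)
    nums = [int(x) for x in rle.split()]
    pos = 0
    for start, length in zip(nums[0::2], nums[1::2]):
        pos += start                   # offset relative to end of previous run
        mask[pos:pos + length] = 255   # 255 so the saved png is clearly visible
        pos += length
    return mask.reshape(width, height)

m = rle2mask('1 3 10 5', width=16, height=16)
print(int((m > 0).sum()))  # → 8 annotated pixels (runs of 3 and 5)
```

Saving the result with PIL as png (never jpg) keeps the labels lossless; jpg compression would blur the mask edges.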

So, my first Kaggle competition… I got a rank in the top 100, training with image size 64x64.
I have also tried transfer learning, retraining with higher-resolution 512x512 images, but this resulted in a lower total score.
Any ideas how to push further?

Well, with more competitors joining the competition, my rank sank and I am out of the top 100. Nevertheless, that is the goal I am striving for.

So I tried different things:

  • Training the last layer with higher-resolution images (128x128 and 256x256)
  • Unfreezing and retraining
  • Different transformation parameters on the databunch
  • Training more epochs (~20-30)
  • Starting from the 64x64 model, switching the data to a 256x256 databunch and retraining the last layer
  • Using resnet50 and resnet101 as base models

Interestingly enough, none of the above approaches got a higher score! The first, very basic attempt scored 0.7976;
every other attempt scored 0.78**.

Do you have any ideas what to do to get a higher score?


But you should avoid discussing the comp here…

… why? I am not sure there are many teams on the Kaggle forums using the library.

Just a quick remark on RLE convention.

It seems to me that fastai doesn’t use the same convention as the SIIM-ACR competition. For example,

from fastai.vision import open_mask_rle
import matplotlib.pyplot as plt

rle = '1 1 22 2 43 3 64 4 85 5'
mask = open_mask_rle(rle, shape=(20, 10))
fig, ax = plt.subplots(figsize=(6, 6))
mask.show(ax=ax)

produces an image like this:
[image: fastai-rle-mask-convention-example.png]

That is, if the rle is x[1] l[1] x[2] l[2] x[3] l[3] ..., then the i-th block of 1s starts at position x[i] and runs for l[i] pixels.

But SIIM-ACR uses a relative-position scheme, and here is the example they give:

For example, ‘1 3 10 5’ implies pixels 1,2,3 are to be included in the mask, as well as 14,15,16,17,18
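If that reading of the two conventions is right, bridging them is a small transformation: walk the relative pairs and emit absolute start positions. A hedged sketch (treat the exact indexing as an assumption to verify):

```python
def relative_to_absolute_rle(rle):
    """Convert SIIM-style relative 'offset length' pairs into the
    absolute 'start length' pairs that open_mask_rle appears to expect."""
    nums = [int(x) for x in rle.split()]
    out, pos = [], 0
    for offset, length in zip(nums[0::2], nums[1::2]):
        pos += offset              # offset is relative to the end of the previous run
        out.extend([pos, length])  # record the absolute start instead
        pos += length
    return ' '.join(map(str, out))

print(relative_to_absolute_rle('1 3 10 5'))  # → '1 3 14 5': pixels 1-3 and 14-18
```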

I think discussing solutions may be against the rules.
A good heuristic in general is to find other people's notebooks on similar problems and see what they did.