Kaggle Data Science Bowl 2018 : Find and segment nuclei

The Kaggle Data Science Bowl 2018 was announced today:

The goal is to identify and segment cell nuclei in microscopic images. Clinically, nucleus size and the nucleus-to-cytoplasm ratio are important features used by pathologists to classify many types of cancer. It is also an important microscopic feature in research, with potential for quantifying drug (chemotherapy) responses.

I hope this healthcare challenge will interest many fast.ai students!


Hi all, has anyone tried to implement image segmentation using the fastai framework?

1 Like

No, I am currently using a Mask RCNN model.


Congrats on your score so far in this challenging competition!

I am also interested in implementing a Mask RCNN model but don’t have any experience with it. Do you have any suggestions (tutorials, etc.) for how to get started?

Hopefully Jeremy has plans for adding something like this to the fastai library! Or perhaps it will be covered in Part 2 v2 :slight_smile: (cc @jeremy)


Thx James. I started initially from scratch using the Keras-RCNN extension that the current leader recommended (https://github.com/broadinstitute/keras-rcnn). It helped me a lot to understand the entire architecture.

But after 2-3 days, I decided to switch to this keras/tf implementation with decent documentation that is working very well: https://github.com/matterport/Mask_RCNN

That is the current backbone of my pipeline. Of course you need to adapt it for the current dataset and, like Allen (the current leader) said, a lot of post-processing is needed to get decent results from Mask-RCNN. With the same trained model, different post-processing pipelines can give results ranging from 0.2 to 0.63. But Mask-RCNN, being an instance segmentation model, definitely allows better results than U-Net, which is a semantic segmentation model.

FAIR officially released their Caffe2 version publicly about 2 weeks ago. I haven’t really looked into it, but technically it is probably the most efficient implementation (for both training and inference): https://github.com/facebookresearch/Detectron


Thank you Alexandre!

This repo looks awesome, very well documented, much appreciated for the suggestion!!


But after 2-3 days, I decided to switch to this keras/tf implementation with decent documentation that is working very well : https://github.com/matterport/Mask_RCNN

Hi Alexandre, thx for the repo. I ran the demo notebook and encountered issues when creating the MaskRCNN model, i.e. “Failed to convert object of type <class ‘theano.tensor.var.TensorVariable’> to Tensor”. Have you had this problem?

I don’t remember having this problem, but I am using tensorflow as the backend for keras. It looks like you are probably using theano as the backend. The readme says it is an implementation for Python 3, Keras, and TensorFlow, and parts of the code use native tensorflow. I hope this helps.
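For reference, here is a minimal sketch of the two usual ways to force the TensorFlow backend (this follows Keras 2.x backend selection as described in the Keras docs; the config values shown are my own example, check your own `~/.keras/keras.json`):

```python
import json
import os

# 1. Environment variable -- takes precedence over the config file,
#    but must be set *before* `import keras`.
os.environ["KERAS_BACKEND"] = "tensorflow"

# 2. The ~/.keras/keras.json config file. A minimal version that
#    works with the matterport Mask_RCNN repo looks like:
keras_config = {
    "backend": "tensorflow",
    "image_data_format": "channels_last",
    "floatx": "float32",
    "epsilon": 1e-07,
}
print(json.dumps(keras_config, indent=2))
```

After changing either one, restart the notebook kernel so Keras re-reads the backend.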

Hi Alexandre,

congrats on your score in the competition.
I am struggling to get Mask RCNN working on the nuclei data.

Do you have any pointers/hints to get the “matterport” code running with nuclei data?
(The provided .ipynb is probably one step too complex)

Would be great!

I would recommend starting with the shapes demo and adapting the code (mostly the Config and Dataset classes) for your nuclei data. Implementing the class functions load_image and load_mask is part of the initial job.

1 Like

Hi Alexandre,

thanks for your hint. I managed to get the following working, but when I start the training, the process just hangs. Can I ask you to have a quick look at the following code in case you find any obvious mistakes?

dataset_train_dsb = DsbDataset()

and then

class DsbDataset(utils.Dataset):
    def load_data(self, train_df):
        # Add the single "nucleus" class
        self.add_class("nucleus", 1, "nucleus")
        for i, filename in tqdm.tqdm(enumerate(train_df['image_path']), total=len(train_df)):
            # print("reading file: ", filename)
            self.add_image("nucleus", image_id=i, path=train_df['image_path'].loc[i])

    def load_mask(self, image_id):
        return super(DsbDataset, self).load_mask(image_id)

Is that similar to how you adapted your code?

Thanks a lot for your help, Alexandre!

Hi @fabsta ,

I am also using this Mask-RCNN model and have it working with Kaggle’s nuclei dataset.

The thing that caught my eye in your code is the way you have implemented def load_mask here. Note that the default for this method simply loads an empty mask so you’ll need to re-write it to override the default so that it actually loads the masks from the nuclei dataset (including assigning the class_ids).

Hope this helps!


@fabsta, I agree with @jamesrequa. You need to load the masks yourself in the overridden load_mask function; it is called dynamically by the training generator.
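To illustrate the shape of what load_mask must return, here is a small sketch (the helper name is my own, not from the repo): load_mask is expected to return a boolean (H, W, N) array with one channel per nucleus instance, plus a class_ids array. Inside your override you would read every mask PNG under the image’s masks/ folder and pass the arrays to something like:

```python
import numpy as np

def stack_instance_masks(mask_list):
    """Combine per-nucleus binary masks (each H x W) into the
    (H, W, N) boolean array plus class_ids array that Mask_RCNN's
    load_mask() is expected to return. Every instance gets the
    single "nucleus" class id 1."""
    masks = np.stack(mask_list, axis=-1).astype(bool)
    class_ids = np.ones(masks.shape[-1], dtype=np.int32)
    return masks, class_ids

# Toy usage with two fake 4x4 nuclei masks:
m1 = np.zeros((4, 4)); m1[0:2, 0:2] = 1
m2 = np.zeros((4, 4)); m2[2:4, 2:4] = 1
masks, class_ids = stack_instance_masks([m1, m2])
print(masks.shape)         # (4, 4, 2)
print(class_ids.tolist())  # [1, 1]
```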

Thanks, @jamesrequa and @alexandrecc!
That’s really helpful!

Thanks for creating this thread. I am also participating in the competition but still at the phase of exploring U-Net :slight_smile: To what extent are you using the fastai library? I want to use it as much as I can, especially for the training loops (learner).

Also, I’ve observed (not a big surprise, of course) that having images of different modalities makes training more unstable. How did you tackle this problem? I am thinking of clustering the images and then running different models for the different clusters. Would that be too much?

Good luck to everyone

1 Like

I’ve started the competition as well. Thanks for creating this thread.

I have one question though. On Kaggle, most of the kernels for this competition use U-Net. I wanted to ask why that is. How is U-Net better than Mask-RCNN?

Another thing: are Mask-RCNN, or other RCNN-based architectures, better than alternatives like YOLO and SSD? I suspect the answer might be that they have separate use cases. In that case, when should you use which? Is it feasible to use them in this competition?


YOLO and SSD do “detection” using a bounding box: using two corners of a rectangle, they can say approximately where an object is.
In this competition you need to be more precise and find the exact areas (not just an approximate bbox), which is called segmentation.
U-Net traditionally does semantic segmentation, which means it will find the areas of interest but won’t be able to separate them. People have been trying to predict centers and contours with U-Net in order to separate the individual instances.
Mask RCNN does instance segmentation, which means it can differentiate the instances without any tricks. Unfortunately, the structure of Mask RCNN is much more complex than U-Net, and the part of Mask RCNN doing the segmentation is not as powerful as U-Net, which makes U-Net a good contender in this competition.
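The center/marker trick described above can be sketched in a few lines of numpy/scipy. This is a toy illustration with hand-set markers standing in for a predicted center map (a real pipeline would derive markers from the network output, e.g. by erosion or a separate center channel, and typically use a proper watershed):

```python
import numpy as np
from scipy import ndimage

# Semantic mask: two 4x4 "nuclei" touching along one edge.
mask = np.zeros((8, 10), dtype=bool)
mask[2:6, 1:5] = True
mask[2:6, 5:9] = True

# Naive connected components see ONE object, because the nuclei touch.
_, n_naive = ndimage.label(mask)
print(n_naive)  # 1

# Marker trick: one small blob per nucleus (here hand-set; in practice
# predicted centers or an eroded mask).
markers = np.zeros_like(mask)
markers[3:5, 2:4] = True
markers[3:5, 6:8] = True
marker_labels, n_markers = ndimage.label(markers)
print(n_markers)  # 2

# Assign every foreground pixel to its nearest marker (a poor man's
# watershed, via the distance transform's nearest-seed indices).
_, (iy, ix) = ndimage.distance_transform_edt(marker_labels == 0,
                                             return_indices=True)
instances = marker_labels[iy, ix] * mask
print(len(np.unique(instances)) - 1)  # 2 separated nuclei
```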

Averaging/ensembling results in this kind of competition isn’t as easy as in more traditional competitions, as the output is a series of masks (images).


I agree with @eagle4

Unet is probably used by the majority of participants because it is a simpler model.

Mask-RCNN is probably used by the majority of the top 100 participants because it is a better model for the competition metric.

If 2 nuclei overlap, U-Net will count them as 1 large object (semantic segmentation). You’ll basically receive a score of 0/2 objects detected, because neither of the 2 ground-truth objects will really be similar to the combined large predicted object.

Mask-RCNN will detect both objects (instance detection and segmentation), so you have the potential to get a 2/2 score.

And when a single object is detected with a high confidence and high sensitivity, then generating the mask/segmentation is pretty straightforward.
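A toy numpy check of that 0/2 claim (hypothetical 4x4 nuclei of my own invention; the competition metric sweeps IoU thresholds from 0.5 to 0.95): the merged prediction’s IoU with each ground-truth nucleus tops out at 0.5, at or below the loosest threshold, so it matches neither object.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return (a & b).sum() / (a | b).sum()

# Two touching 4x4 ground-truth nuclei.
gt1 = np.zeros((8, 10), dtype=bool); gt1[2:6, 1:5] = True
gt2 = np.zeros((8, 10), dtype=bool); gt2[2:6, 5:9] = True

# A semantic model that can't separate instances predicts their union.
merged = gt1 | gt2

print(iou(merged, gt1))  # 0.5
print(iou(merged, gt2))  # 0.5
```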


Thanks @eagle4 and @alexandrecc. That was quite helpful. Much appreciated :smile:

For example, with a U-Net model we observe that overlap problem here:

[image: real image, real mask, predicted mask]