Kaggle Data Science Bowl 2018 : Find and segment nuclei

kcturgutlu · February 18, 2018, 11:03am

What functions are you using to construct the final predictions data, e.g. a row for each nucleus? I found np.label to be inconsistent when compared on training data mask and labels…
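A minimal sketch of how one row per nucleus could be built, assuming you already have a labelled mask (0 for background, 1..K for nuclei), for example from `scipy.ndimage.label` or `skimage.measure.label`. The helper names `rle_encode` and `prediction_rows` are hypothetical, and the encoding convention (column-major, 1-indexed start/length pairs) is my reading of the competition's submission format, so please check it against the official description. Touching nuclei merge into a single component under any connected-component labelling, which may be one source of the mismatch with the provided masks.

```python
import numpy as np


def rle_encode(binary_mask):
    """Run-length encode a 2-D binary mask as [start, length, start, length, ...].

    Pixels are numbered top to bottom, then left to right, starting at 1
    (assumed submission convention).
    """
    pixels = binary_mask.T.flatten().astype(np.uint8)  # transpose => walk down columns
    padded = np.concatenate([[0], pixels, [0]])
    changes = np.where(padded[1:] != padded[:-1])[0] + 1  # 1-indexed run boundaries
    starts, ends = changes[::2], changes[1::2]
    return [int(v) for pair in zip(starts, ends - starts) for v in pair]


def prediction_rows(image_id, labeled):
    """Return one (image_id, encoded_pixels) tuple per non-zero label."""
    rows = []
    for lab in np.unique(labeled):
        if lab == 0:  # background
            continue
        rle = rle_encode(labeled == lab)
        rows.append((image_id, " ".join(str(v) for v in rle)))
    return rows


# Toy 3x3 labelled mask with two nuclei.
labeled = np.array([[1, 0, 2],
                    [1, 0, 2],
                    [0, 0, 0]])
for row in prediction_rows("toy_image", labeled):
    print(row)
```

Because each label is encoded separately, a pixel can never end up in two rows as long as the labelled mask itself assigns one label per pixel.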

pedrol · February 19, 2018, 6:49pm

@alexandrecc Hi, could you say which tool you used to make the segmentation labels? I could not start using https://github.com/matterport/Mask_RCNN because I do not have a program to create the annotations.
Could you also tell me how much data, at minimum, is needed when starting from the pre-trained model?
Thank you very much!

rdebbe · February 19, 2018, 9:49pm

Hi Alexandre, I’m also in the DSB2018 competition and I’m trying to use Mask-RCNN because I’m unable to improve my U-Net results.
I think I have made the appropriate changes to config and Dataset, I have my own version of utils.load_mask which returns masks with shape (h,w, number of masks in sample) and class_id as a single entry list because all entities in a sample are nuclei or background.
I work on a mac 10.13.2 with TF 1.4.0 and keras 2.0.9
I use a notebook based on shapes.py from the matterport/Mask_RCNN repository. Unfortunately, at train time Keras gives me a fatal error: "StopIteration", raised next to the line "if not hasattr(generator_output, '__len__'):".
I checked the output of the train and validation generators and see that they correctly deliver 7 objects. I think I’m reaching the limit of my debugging abilities.
Can you or somebody in the forum suggest something else I should check?
Thanks

alexandrecc · February 19, 2018, 10:12pm

@pedrol You don’t have to create the annotations to start training your model for DSB2018. The Data Science Bowl provides a training dataset with images and masks. If your question does not concern DSB2018 and you want to apply Mask-RCNN to a new problem, then you are right that you need to create annotations (masks) if you don’t have them in your own dataset. If you want to use natural images, you can just try to predict from a model trained on the COCO dataset. If your question is specific to Mask-RCNN and not DSB2018, I would recommend creating another thread.

@rdebbe Can you send the entire error stack ? This could help to see what is the problem. The generated masks need to be structured as a (H,W,N) np.array where N is the number of masks. You also need a 1xN np.array for class_id, one class per mask (basically just a 1 for all masks since we only have segmented nuclei)
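To illustrate the layout described above, here is a small sketch that stacks per-nucleus binary masks into an (H, W, N) boolean array with one class id per mask. The function name `stack_instance_masks` is hypothetical, and the toy arrays stand in for the per-nucleus mask images of one sample.

```python
import numpy as np


def stack_instance_masks(mask_list):
    """Stack 2-D binary masks into (H, W, N) plus one class id per mask."""
    if not mask_list:
        raise ValueError("need at least one mask to infer H and W")
    masks = np.stack([np.asarray(m) != 0 for m in mask_list], axis=-1)
    class_ids = np.ones(masks.shape[-1], dtype=np.int32)  # 1 = nucleus
    return masks, class_ids


# Two toy 4x5 masks standing in for per-nucleus mask files.
a = np.zeros((4, 5))
a[0:2, 0:2] = 255
b = np.zeros((4, 5))
b[2:4, 3:5] = 255
masks, class_ids = stack_instance_masks([a, b])
print(masks.shape, class_ids)
```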

rdebbe · February 19, 2018, 10:16pm

Here is the stack:

Epoch 1/1
1/5 [=====>…] - ETA: 1:49 - loss: 8.4630 - rpn_class_loss: 1.2581 - rpn_bbox_loss: 1.8363 - mrcnn_class_loss: 3.3445 - mrcnn_bbox_loss: 0.9149 - mrcnn_mask_loss: 1.1091

StopIteration                             Traceback (most recent call last)
in ()
      6 learning_rate=config.LEARNING_RATE,
      7 epochs=1,
----> 8 layers='heads')

/Users/debbe/kaggle/DataScienceBowl2018/model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers)
   2203 max_queue_size=100,
   2204 workers=workers,
-> 2205 use_multiprocessing=True,
   2206 )
   2207 self.epoch = max(self.epoch, epochs)

//anaconda/lib/python3.5/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     85 warnings.warn('Update your ' + object_name +
     86 ' call to the Keras 2 API: ' + signature, stacklevel=2)
---> 87 return func(*args, **kwargs)
     88 wrapper._original_function = func
     89 return wrapper

//anaconda/lib/python3.5/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   2044 batch_index = 0
   2045 while steps_done < steps_per_epoch:
-> 2046 generator_output = next(output_generator)
   2047
   2048 if not hasattr(generator_output, '__len__'):

StopIteration:

Sorry if the format is messed up
Thanks
Ramiro

rdebbe · February 20, 2018, 8:59pm

Hi, it looks like I found a way out of my problem. The information I posted yesterday was obscure and useless to me and to others. The real meaty stuff went to the terminal running jupyter-notebook, which I never look at.
The message was telling me that too many of the bounding boxes constructed from individual masks were empty, and the method 'minimize_mask' was raising exceptions. All I saw in my Jupyter display was the complaint from the batch generator, because it produced an empty output and stopped the training.
Today I learned that the empty bounding boxes appear in high resolution images. (I was using the shape (128, 128) as input to Mask-RCNN.) I think small nuclei can 'disappear' if the resizing is not careful.
Changing the size to (256, 256) enabled me to process 150 steps with only a few exceptions from 'minimize_mask', which did not stop the training.
I apologize for panicking and spamming you with my problems. Thank you for your reply; it looks like you are a nice community.
Ramiro

alexandrecc · February 21, 2018, 2:24am

That is great that you found the problem. It is probably also related to the mini masks used by default during training. The masks are resized by default to 56x56 to lower the memory burden. So when the masks are already small in a high resolution image, the bounding box and the mask can be of 0 size when resizing.
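The effect described above can be reproduced with a toy example. This is only a sketch: it uses plain strided subsampling as a stand-in for the library's actual resize, and the image size, nucleus position and factors are made up for illustration.

```python
import numpy as np


def downsample_nearest(mask, factor):
    """Keep every `factor`-th pixel in each direction (crude nearest-neighbour resize)."""
    return mask[::factor, ::factor]


# A 1024x1024 image containing a single 3x3 nucleus.
mask = np.zeros((1024, 1024), dtype=bool)
mask[101:104, 201:204] = True

small = downsample_nearest(mask, 8)   # 1024 -> 128: the nucleus falls between samples
medium = downsample_nearest(mask, 2)  # 1024 -> 512: one pixel of it survives

print(mask.sum(), small.sum(), medium.sum())
```

Once a mask has no positive pixels left, its bounding box has zero area, which is the condition that 'minimize_mask' complains about.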

rdebbe · February 21, 2018, 3:05am

Hi Alexandre,
I’m back in production mode. I’m using (512, 512) as input to the model and I do not see any more messages about 0 area bboxes. It takes 1.5 hours to process 537 training samples and it only used 4 GB of CPU memory.
My next job is to understand the outputs of the model in inference mode.
From what I have seen in the code, this is a complicated model, I’m glad I can see indications of good predictions after the first epoch.
Ramiro

wolfpassion20 · February 23, 2018, 3:49am

Hi Alexandre,

I’m also trying to use Matterport’s Mask R-CNN.

I already subclassed the Config class, but now it’s just the tricky Dataset class. I have the following code, and was hoping you could point out where it goes wrong. The data structure, for the original training images, is: ‘root_path/train_images/images.png’. For the masks, it is: ‘root_path/masks/mask_images.png’.

import os
import random

import numpy as np
import skimage.color
import skimage.io
from skimage.io import imread
from tqdm import tqdm

import utils  # utils.py from the matterport/Mask_RCNN repository


class NucleiDataset(utils.Dataset):
    def load_nuclei(self, root_path, mode='train', filter_ids=None):
        self.add_class("nucleos", 1, "nucleo")
        subdir = 'test_images' if mode == 'test' else 'train_images'
        files = os.listdir(os.path.join(root_path, subdir))

        # Optionally keep only the files whose id is listed in filter_ids.
        if filter_ids is not None:
            files = [item for item in files if item.split('.')[0] in filter_ids]

        # Register every file: all per-image work stays inside the loop.
        for i, ffile in enumerate(tqdm(files)):
            data_path = os.path.join(root_path, subdir, ffile)
            bg_color = random.randint(0, 255)
            original_id = os.path.splitext(os.path.basename(data_path))[0]

            # Read the image only to record its size (works for gray and RGB).
            height, width = imread(data_path).shape[:2]

            self.add_image("nucleos", image_id=i, path=data_path,
                           width=width, height=height,
                           bg_color=bg_color, original_id=original_id,
                           data_path=data_path,
                           root_path=root_path)

    def load_image(self, image_id):
        info = self.get_info(image_id)
        path = info['data_path']
        image = skimage.io.imread(path)
        # If grayscale. Convert to RGB for consistency.
        if image.ndim != 3:
            image = skimage.color.gray2rgb(image)
        return image

    def load_mask(self, image_id):
        info = self.get_info(image_id)
        original_id = info['original_id']
        root_path = info['root_path']
        width = info['width']
        height = info['height']
        return self.add_mask_data(original_id, root_path, width=width, height=height)[:]

    def get_info(self, image_id):
        return self.image_info[image_id]

    def add_mask_data(self, original_id, root_path, width=256, height=256):
        # Collect the mask files whose name starts with this image's id.
        mask_dir = os.path.join(root_path, 'masks')
        all_masks = [element for element in os.listdir(mask_dir)
                     if element.split('_')[0] == original_id]

        # Read each mask file once and keep only the non-empty ones.
        layers = []
        for mask_file in all_masks:
            data = imread(os.path.join(mask_dir, mask_file))
            if np.sum(data) != 0:
                layers.append(data != 0)

        # Stack into (height, width, N). The np.bool alias no longer exists
        # in recent NumPy releases, so the builtin bool is used instead.
        mask = np.zeros([height, width, len(layers)], dtype=bool)
        for k, layer in enumerate(layers):
            mask[:, :, k] = layer

        # One class id per mask; every instance is a nucleus (class 1).
        class_ids = np.ones(len(layers), dtype=np.int32)
        return mask, class_ids

I run the model with the following code:

# Training dataset
dataset_train = NucleiDataset()
dataset_train.load_nuclei(ROOT_DIR, mode='train',filter_ids = None )
dataset_train.prepare()

# Validation dataset
dataset_val = NucleiDataset()
dataset_val.load_nuclei(ROOT_DIR, mode = 'train', filter_ids = None )
dataset_val.prepare()

Yes, I still need to figure out how to use the ‘filter_ids’ argument to split the training set into validation set, but I figured it would run as it is on an AWS Tesla K80, but the code just hangs in the cell (the asterisk in the upper left in the jupyter notebook remains there forever).
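For the train/validation split mentioned above, one option is to compute two disjoint id lists and pass them through the existing 'filter_ids' argument. This is a sketch under assumptions: `split_ids` is a hypothetical helper, and the 10% validation fraction and the seed are arbitrary choices.

```python
import random


def split_ids(file_names, val_fraction=0.1, seed=0):
    """Split file names into (train_ids, val_ids), using the part before the first dot as id."""
    ids = sorted({name.split('.')[0] for name in file_names})
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_val = max(1, int(len(ids) * val_fraction))
    return ids[n_val:], ids[:n_val]


# Intended use with the dataset class from the post (not executed here):
#   names = os.listdir(os.path.join(ROOT_DIR, 'train_images'))
#   train_ids, val_ids = split_ids(names)
#   dataset_train.load_nuclei(ROOT_DIR, mode='train', filter_ids=train_ids)
#   dataset_val.load_nuclei(ROOT_DIR, mode='train', filter_ids=val_ids)
```

Loading the same files for both datasets, as in the code above, would make the validation loss uninformative, so a split of this kind is worth adding before judging results.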

hel0 · February 23, 2018, 3:50am

Yes I changed to tf and it’s working

I am not sure about the concept of input and mask dimensions. I trained with IMAGE_SHAPE [512 512 3] MINI_MASK_SHAPE (56, 56), for some reason the ETA on my output was stuck at 2 hours. So I retrained with config IMAGE_SHAPE [256 256 3] and MINI_MASK_SHAPE (56, 56). In this setting, when I got “Invalid bounding box with area of zero” the input image is relatively big like ‘width’: 603, ‘height’: 1272. Is there a ratio that has to be maintained between IMAGE_SHAPE and MINI_MASK_SHAPE? Would dropping mini mask help?

Also, I got an exception traceback after 92/100. Does that mean the training was incomplete?

Exception: Invalid bounding box with area of zero
92/100 [==========================>…] - ETA: 3:23 - loss: 2.8444 - rpn_class_loss: 0.4697 - rpn_bbox_loss: 1.0291 - mrcnn_class_loss: 0.3521 - mrcnn_bbox_loss: 0.5193 - mrcnn_mask_loss: 0.4742
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
in ()
      7 learning_rate=config.LEARNING_RATE,
      8 epochs=1,
----> 9 layers='heads')

~/anaconda2/envs/py3/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
    684 if value.__traceback__ is not tb:
    685 raise value.with_traceback(tb)
--> 686 raise value
    687
    688 else:

Exception: Invalid bounding box with area of zero

hel0 · February 24, 2018, 3:53pm

I tried to make my first submission:

a) The submission expects 4152 prediction rows, however my prediction is only 1870 rows.
b) The submission returns 7 exceptions:
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.
The same pixel may not be assigned to two different objects.

Could these be due to positive and negative ROI settings in the model or are there some config values that I’ve overlooked?

fabsta · February 26, 2018, 7:39pm

Hey Andrew,

I ran into the same problem as you. My code seems to hang in the training phase. Were you able to fix it?

To split into training and validation set, you could just move some training files into a validation directory
mv input_changed/train_images/0* input_changed_val/train_images/

Cheers,
Fabian

jamesrequa · February 26, 2018, 7:50pm

@hel0 The competition rules don’t allow for predicted instances to share any pixels, so each mask needs to have unique pixels from all other masks. Your current submission contains some overlapping masks (overlapping nuclei instances) so you’ll need to make sure to remove any overlapping pixels before re-submitting.

You don’t actually need to have 4152 prediction rows, so you can safely disregard this.
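One simple way to remove overlapping pixels is to hand each contested pixel to a single mask, for instance the one with the highest detection score. The sketch below assumes masks shaped (H, W, N); `remove_overlaps` is a hypothetical helper, and "highest score wins" is just one possible rule, not necessarily the best one.

```python
import numpy as np


def remove_overlaps(masks, scores=None):
    """Return a copy of masks (H, W, N) in which every pixel belongs to at most one mask.

    Masks are visited from highest to lowest score (or in index order when no
    scores are given); each mask keeps only pixels not already claimed.
    """
    masks = np.asarray(masks).astype(bool)
    n = masks.shape[-1]
    order = np.arange(n) if scores is None else np.argsort(scores)[::-1]
    taken = np.zeros(masks.shape[:2], dtype=bool)
    out = np.zeros_like(masks)
    for k in order:
        out[:, :, k] = masks[:, :, k] & ~taken
        taken |= out[:, :, k]
    return out


# Two 2x2 masks on a 3x3 image that share the centre pixel.
m = np.zeros((3, 3, 2), dtype=bool)
m[0:2, 0:2, 0] = True
m[1:3, 1:3, 1] = True
clean = remove_overlaps(m, scores=[0.2, 0.9])
print(clean.sum(axis=-1).max())  # number of masks covering the busiest pixel
```

A mask can become empty after this step if it was fully covered by higher-scoring masks, so empty masks should be skipped when writing the submission rows.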

krishnavishalv · February 28, 2018, 10:02pm

Hey, @alexandrecc can you mention some post-processing methods apart from dense CRFs that can improve performance of the segmentation ? I have searched for other methods but I could only find CRFs.

alexandrecc · March 1, 2018, 3:36am

Examples for this task : Bounding box non-max suppression, Mask non-max suppression, Morphological dilation, Management of overlapping masks and pixels.
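As an illustration of the first item, here is a plain NumPy sketch of greedy bounding box non-max suppression. It assumes boxes given as (y1, x1, y2, x2); the function name `box_nms` and the default threshold of 0.3 are choices made for this example, not taken from any particular library.

```python
import numpy as np


def box_nms(boxes, scores, iou_threshold=0.3):
    """Greedy NMS. boxes: (N, 4) as (y1, x1, y2, x2). Returns kept indices, best first."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]  # highest score first
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with all remaining boxes.
        y1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        x1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        y2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        x2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(y2 - y1, 0, None) * np.clip(x2 - x1, 0, None)
        union = np.maximum(areas[i] + areas[rest] - inter, 1e-9)  # avoid division by zero
        # Drop every remaining box that overlaps the current one too much.
        order = rest[inter / union <= iou_threshold]
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(box_nms(boxes, scores))
```

The same greedy loop can be applied to masks by replacing the box IoU with a pixel-wise IoU between mask pairs.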

fabsta · March 1, 2018, 9:56pm

Thanks for the reply, Alexandre, and thanks for bringing up the topic, krishnavishalv!

That’s really helpful. Just one more question: What is CRF?

sosssego · March 8, 2018, 4:08pm

Hi, I am also in the competition. I made some successful submissions using U-Net, based on an example kernel; my best score was 0.308. Then I tried the matterport Mask R-CNN: I made some changes to the code and it runs. I only tested it with 128x128 images, training on only 100 images. The results look good, but I still need to fix some bugs in the code that creates my submission file. I run it on my own machine with a GTX 1060 6 GB, so the batch size needs to be very small…
It would be great to see a Mask R-CNN implementation in the fastai library.

krishnavishalv · March 8, 2018, 8:12pm

CRF means conditional random field. It’s a post-processing technique used to improve the IoU score in segmentation tasks.

aadimator · March 9, 2018, 8:55am

I’ve been trying to understand the problem for the last 2 weeks but I couldn’t figure it out. Here’s the output I get:

ERROR:root:Error processing image {'id': '3934a094e8537841e973342c7f8880606f7a2712b14930340d6f6c2afe178c25', 'source': 'nuclei', 'path': 'E:\\workspace\\dl\\data-science-bowl-2018\\input/stage1_val/3934a094e8537841e973342c7f8880606f7a2712b14930340d6f6c2afe178c25/images/3934a094e8537841e973342c7f8880606f7a2712b14930340d6f6c2afe178c25.png', 'width': 320, 'height': 256, 'mask_shape': (256, 320, 65)}
Traceback (most recent call last):
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py", line 1632, in data_generator
    use_mini_mask=config.USE_MINI_MASK)
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py", line 1220, in load_image_gt
    mask = utils.minimize_mask(bbox, mask, config.MINI_MASK_SHAPE)
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\utils.py", line 462, in minimize_mask
    raise Exception("Invalid bounding box with area of zero")
Exception: Invalid bounding box with area of zero
ERROR:root:Error processing image {'id': '4327d27591871e9c8d317071a390d1b3dcedad05a9746175b005c41ea0d797b2', 'source': 'nuclei', 'path': 'E:\\workspace\\dl\\data-science-bowl-2018\\input/stage1_val/4327d27591871e9c8d317071a390d1b3dcedad05a9746175b005c41ea0d797b2/images/4327d27591871e9c8d317071a390d1b3dcedad05a9746175b005c41ea0d797b2.png', 'width': 360, 'height': 360, 'mask_shape': (360, 360, 31)}
Traceback (most recent call last):
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py", line 1632, in data_generator
    use_mini_mask=config.USE_MINI_MASK)
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py", line 1220, in load_image_gt
    mask = utils.minimize_mask(bbox, mask, config.MINI_MASK_SHAPE)
  File "E:\workspace\dl\data-science-bowl-2018\MaskRCNN\utils.py", line 462, in minimize_mask
    raise Exception("Invalid bounding box with area of zero")
Exception                                 Traceback (most recent call last)
<ipython-input-106-83fb3ae74319> in <module>()
      6             learning_rate=config.LEARNING_RATE,
      7             epochs=1,
----> 8             layers='heads')

E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers)
   2232             steps_per_epoch=self.config.STEPS_PER_EPOCH,
   2233             callbacks=callbacks,
-> 2234             validation_data=next(val_generator),
   2235             validation_steps=self.config.VALIDATION_STEPS,
   2236             max_queue_size=100,

E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py in data_generator(dataset, config, shuffle, augment, random_rois, batch_size, detection_targets)
   1630             image, image_meta, gt_class_ids, gt_boxes, gt_masks = \
   1631                 load_image_gt(dataset, config, image_id, augment=augment,
-> 1632                               use_mini_mask=config.USE_MINI_MASK)
   1633 
   1634             # Skip images that have no instances. This can happen in cases

E:\workspace\dl\data-science-bowl-2018\MaskRCNN\model.py in load_image_gt(dataset, config, image_id, augment, use_mini_mask)
   1218     # Resize masks to smaller size to reduce memory usage
   1219     if use_mini_mask:
-> 1220         mask = utils.minimize_mask(bbox, mask, config.MINI_MASK_SHAPE)
   1221 
   1222     # Image meta data

E:\workspace\dl\data-science-bowl-2018\MaskRCNN\utils.py in minimize_mask(bbox, mask, mini_shape)
    460         m = m[y1:y2, x1:x2]
    461         if m.size == 0:
--> 462             raise Exception("Invalid bounding box with area of zero")
    463         m = scipy.misc.imresize(m.astype(float), mini_shape, interp='bilinear')
    464         mini_mask[:, :, i] = np.where(m >= 128, 1, 0)

Exception: Invalid bounding box with area of zero

The mask shape refers to the shape of the mask when it’s initially read.

I’ve tried many configurations, changing the IMAGE_MIN_DIM and IMAGE_MAX_DIM sizes. This is the configuration I am currently using:

BACKBONE_SHAPES                [[128 128]
 [ 64  64]
 [ 32  32]
 [ 16  16]
 [  8   8]]
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     8
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
GPU_COUNT                      1
IMAGES_PER_GPU                 8
IMAGE_MAX_DIM                  512
IMAGE_MIN_DIM                  512
IMAGE_PADDING                  True
IMAGE_SHAPE                    [512 512   3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           nuclei
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                1000
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001

Any help would be much appreciated. I’ve been stuck on this problem for the last 2 weeks, trying many fixes by looking at the forums and GitHub issues but nothing seems to be working for me.

alexandrecc · March 9, 2018, 1:04pm

@aadimator : During training, mini masks of 56x56 are created from the resized masks. Some masks are very small in high resolution images. In these cases, some mini masks have no positive pixels after the double resizing. There are several possible solutions:

Don’t use mini masks, by setting USE_MINI_MASK to False
Exclude the non-working images from the training dataset
Use a larger image size (IMAGE_MIN_DIM and IMAGE_MAX_DIM)
Use larger mini masks (e.g. 112x112, via MINI_MASK_SHAPE)

Depending on the choice, it will likely alter training performance or results.
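A softer variant of the second option is to drop only the instances whose mask has become empty, rather than the whole image. The sketch below shows the idea on an (H, W, N) mask array. `drop_empty_instances` is a hypothetical helper, and where exactly to call it (after the masks have been resized) is an assumption about your own data pipeline.

```python
import numpy as np


def drop_empty_instances(masks, class_ids):
    """Remove mask layers with no positive pixels, along with their class ids.

    masks: (H, W, N) boolean array; class_ids: length-N integer array.
    """
    masks = np.asarray(masks).astype(bool)
    class_ids = np.asarray(class_ids)
    keep = masks.any(axis=(0, 1))  # True for layers with at least one pixel
    return masks[:, :, keep], class_ids[keep]


# Three instances; the middle one lost all of its pixels during resizing.
masks = np.zeros((4, 4, 3), dtype=bool)
masks[0, 0, 0] = True
masks[2:4, 2:4, 2] = True
class_ids = np.array([1, 1, 1], dtype=np.int32)

kept_masks, kept_ids = drop_empty_instances(masks, class_ids)
print(kept_masks.shape, kept_ids)
```

An image left with zero instances after this filter still needs to be skipped, and nuclei dropped this way are never learned, so this trades a crash for a small loss of training signal.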