Pixel-wise classification resources

I have a dataset of 360 medical images in 2 classes, each with 3 channels. I also have the pixel-wise labels for each image. Now I wish to perform pixel-wise classification on the images. Can anyone please point me to some useful resources?

The technical term for this is “segmentation”, which comes in two flavors: semantic segmentation, where we only care which class each pixel belongs to, and instance segmentation, where we also care which object each pixel belongs to.

For medical images, U-Net is a good place to start.

Can you point me towards a code sample for using my own dataset with U-Net? Also, what role do trained models like PASCAL VOC and MS COCO play in this regard?

Here is an example of U-Net (using Keras) on microscopy images: https://www.kaggle.com/keegil/keras-u-net-starter-lb-0-277

Pascal VOC and MS COCO are not trained models but datasets that can be used to train models, including segmentation models. But if your data is medical in nature, then VOC or COCO aren’t going to be very useful.

Thanks a lot for the help.

One quick question: the images do not have a fixed number of segments. Will it still be possible to run U-Net on them?

If your image is 224x224 pixels (or whatever) then U-Net outputs a 224x224 score map that contains the probability of each pixel being one of your classes. So if you have 5 possible classes, this score map is 224x224x5. You can then take the argmax for each pixel to find the actual class for that pixel.
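
For example, roughly like this in Keras/NumPy (here `model` and `image` are just placeholders for your own trained U-Net and a test image):

```python
import numpy as np

# Placeholder: `model` is a trained U-Net, `image` a (224, 224, 3) test image.
probs = model.predict(image[np.newaxis, ...])[0]  # per-pixel probabilities, shape (224, 224, 5)

# Most likely class per pixel, values in 0..4.
class_map = np.argmax(probs, axis=-1)             # shape (224, 224)
```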

It’s helpful to read the U-Net paper; I’d recommend it.

Lesson 14, Segmentation.

Hi there,

There is an existing open-source deep learning project that specifically deals with medical imaging. It’s called NiftyNet and is built on TensorFlow.

It’s currently used by several groups for research in deep learning, but is made to facilitate ease-of-use for beginners as well. Everything in an experiment can be controlled by a config.ini file.

There is an example of how to use U-Net at this address: https://github.com/NifTK/NiftyNet/blob/dev/demos/unet/U-Net_Demo.ipynb, which contains information both on NiftyNet and on the U-Net architecture.

Disclaimer: I am a developer for NiftyNet, but we’d love it if you gave it a try and were able to use it!

I don’t think that’s how U-Net works. In the paper the input shape is 572x572x1 and the output shape is 388x388x2. i.e. they are not the same.

I’m trying to train it using Keras, but I’m getting an error: “Error when checking target: expected class_output to have shape (324, 324, 4) but got array with shape (512, 512, 4)” when I set the input shape to 512x512 and the number of classes to 4.

You are referring to a particular implementation. In general you can design U-Net to output whatever shape you want. The original paper’s authors had their own reasons for the smaller output size: they used unpadded (valid) convolutions, which shrink the feature maps at every layer.

As for your Keras code: make sure you use appropriate padding. Switching between valid and same padding is what determines the output size.
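
A minimal sketch of the difference (using tensorflow.keras; the layer sizes are only illustrative):

```python
from tensorflow.keras import Input, layers

inp = Input(shape=(512, 512, 3))

# 'valid' padding: each 3x3 convolution shrinks the spatial size by 2,
# which is why the original U-Net's output (388x388) is smaller than its input (572x572).
x_valid = layers.Conv2D(16, 3, padding='valid')(inp)  # -> (510, 510, 16)

# 'same' padding: zero-padding keeps the spatial size unchanged,
# so the output mask can match the input image exactly.
x_same = layers.Conv2D(16, 3, padding='same')(inp)    # -> (512, 512, 16)
```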

Oh, that’s good to know, thanks. I switched to same padding to make my input and output the same size, which seemed to make Keras happy.

I’m having issues trying to train my U-Net implementation. The loss just bounces around the same value and never seems to learn anything. I’m not sure whether the way I prepare the data is right, or whether the loss function I’m using is appropriate. What’s a good way of debugging what’s going wrong?

Nice! You have everything in place, good work!
Things to try:

  1. Learning rate too high: try 0.001 or 0.0001. If everything is set up correctly, an initial learning rate that is too high won’t converge and gives a bumpy loss instead.

  2. Loss function: try categorical crossentropy first. Later try Dice loss, or combine both (see the sketch after this list).

  3. COCO: I confused COCO with Pascal, so I deleted my initial comment here.
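
For point 2, here is a rough sketch of what a combined categorical crossentropy + soft Dice loss could look like in Keras (untested on your data; it assumes one-hot labels of shape (batch, H, W, C), so adjust as needed):

```python
from tensorflow.keras import backend as K

def dice_loss(y_true, y_pred, smooth=1.0):
    # Soft Dice over all pixels and classes; expects one-hot y_true.
    intersection = K.sum(y_true * y_pred, axis=[1, 2, 3])
    union = K.sum(y_true, axis=[1, 2, 3]) + K.sum(y_pred, axis=[1, 2, 3])
    return 1.0 - (2.0 * intersection + smooth) / (union + smooth)

def combined_loss(y_true, y_pred):
    # Pixel-wise crossentropy averaged per image, plus the Dice term.
    cce = K.mean(K.categorical_crossentropy(y_true, y_pred), axis=[1, 2])
    return cce + dice_loss(y_true, y_pred)

# model.compile(optimizer='adam', loss=combined_loss, metrics=['categorical_accuracy'])
```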

Hmm, I tried lowering the learning rate to 0.0001 and the momentum to 0.9, changing the optimizer from SGD to Adam, changing the loss from Jaccard to categorical_crossentropy, and increasing the batch size to 4, but it just can’t seem to learn. I also tried normalizing the image data by dividing by 255.

After 10 epochs (100 batches each) loss dropped from 1.3534 to 1.0627, but categorical_accuracy stayed the same at 0.6556, which is about the percentage of background pixels. So it looks like it’s just guessing “background” for everything.
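
(I estimated that background fraction roughly like this, with `masks` standing in for my one-hot label array of shape (N, 512, 512, 4):)

```python
import numpy as np

# Fraction of pixels belonging to each class; the largest value is the accuracy
# a model gets by always predicting that class (here ~0.66 for background).
class_fractions = masks.mean(axis=(0, 1, 2))
print(class_fractions)
```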

I’m just using the defaults for weight initialization, so glorot_uniform for the weights and zeros for the biases.

Anything else you think I could try? Thanks again for your help.