Using VGG with greyscale images


(Angel) #1

I am on lesson 2 and trying to apply the VGG fine-tuning approach to other datasets. I have tried it on the State Farm driver competition and it works relatively well (0.92 validation accuracy).
But when I try it on a dataset with greyscale images (https://www.kaggle.com/c/datasciencebowl) I only reach 0.34 accuracy. Is there any preprocessing needed for a greyscale dataset?
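
For reference, what I am doing is basically the lesson 2 recipe, something like this (a rough sketch; the paths and batch size are placeholders, and it assumes the course's vgg16.py is importable):

from vgg16 import Vgg16

vgg = Vgg16()
batches = vgg.get_batches('data/train', batch_size=64)
val_batches = vgg.get_batches('data/valid', batch_size=64)
vgg.finetune(batches)  # swap the last Dense layer for one matching the new classes
vgg.fit(batches, val_batches, nb_epoch=1)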


(Jeremy Howard) #2

Yes, a color network isn’t necessarily optimal for a BW image. What accuracy did the leaders in the competition get? Have you tried fine-tuning more layers?

PS: This is a great article for this comp: http://benanne.github.io/2015/03/17/plankton.html
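
Fine-tuning more layers in Keras might look something like this (a minimal sketch; how many layers to unfreeze is a judgment call, and the VGG16 here is just a stand-in for your fine-tuned model):

from keras.applications.vgg16 import VGG16
from keras.optimizers import Adam

model = VGG16(weights='imagenet')  # stand-in for the model you are fine-tuning

# Freeze everything, then unfreeze the last few layers.
for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-6:]:
    layer.trainable = True

# Recompile so the trainable flags take effect, with a small learning rate.
model.compile(optimizer=Adam(lr=1e-5),
              loss='categorical_crossentropy', metrics=['accuracy'])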


#3

Hi, I am trying to apply the network to BW images also. Have you found a way to preprocess them? 🙂


(Sean Lanning) #4

In one of the videos you retrained VGG on ImageNet for the sake of batch normalization. Could I, in theory, convert all of the ImageNet photos to grayscale and then create a grayscale version of VGG16BN?


(Angel) #5

I have managed to make some progress. Look at the MNIST example in lesson 3 (and the notebook in the git repository) to see how to retrain a (simpler) network from scratch on greyscale images. My problem was that the loss increased a lot; I think it was a matter of choosing the correct learning rate (I’m now going through lesson 4, where this is explained).
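
A minimal sketch of the kind of network I mean (Keras; the layer sizes here are my own choice, not exactly the notebook's):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import Adam

# Note the single input channel for greyscale: (28, 28, 1) instead of (224, 224, 3).
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),
])
# A small learning rate helps when the loss blows up early in training.
model.compile(optimizer=Adam(lr=1e-4),
              loss='categorical_crossentropy', metrics=['accuracy'])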


(Nafiz Hamid) #6

Hi, I am also trying to work through the Data Science Bowl challenge. How are you dealing with the fact that each folder of images is a single data point rather than a single image?


(s.s.o) #7

I think you cannot use grey images directly. The input layer is something like (3, 224, 224): the 3 represents the R, G and B channels of the image, whereas for grey images it would be something like (1, 224, 224). Maybe converting the grey image to RGB helps, not sure though. If you have OpenCV installed you can try:

import cv2

# Replicate the single grey channel across R, G and B
color_img = cv2.cvtColor(gray_img, cv2.COLOR_GRAY2RGB)
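
If OpenCV is not available, the same replication can be done with NumPy alone (a sketch, assuming gray_img is a 2-D array):

import numpy as np

# Stack the grey channel three times along a new last axis: (H, W) -> (H, W, 3)
color_img = np.stack([gray_img, gray_img, gray_img], axis=-1)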


(Christine) #8

Has anyone gone ahead and tried training VGG (or another network, but with a 1-channel input layer) on a grayscale version of ImageNet? I’m curious whether this would then transfer to X-ray or CT images. I tried searching online, but I didn’t see any obvious leads (this old forum topic is at the top of the results).
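
One cheaper trick I’ve been wondering about: build a 1-channel copy of VGG16 and initialise its first convolution by summing the pretrained RGB kernels, since a grey pixel contributes identically through all three channels. A rough sketch (Keras, channels-last; my own adaptation, untested on X-ray data):

import numpy as np
from keras.applications.vgg16 import VGG16

# Pretrained colour network and an untrained 1-channel copy of the same architecture.
rgb_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
grey_model = VGG16(weights=None, include_top=False, input_shape=(224, 224, 1))

for rgb_layer, grey_layer in zip(rgb_model.layers, grey_model.layers):
    weights = rgb_layer.get_weights()
    if rgb_layer.name == 'block1_conv1':
        kernels, bias = weights
        # Collapse the RGB input dimension: (3, 3, 3, 64) -> (3, 3, 1, 64)
        weights = [kernels.sum(axis=2, keepdims=True), bias]
    grey_layer.set_weights(weights)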


(Sean Lanning) #9

I haven’t see anyone try this. The color channel seems important for differentiating between a lot of the images in Imagenet. I tried a variation of this in that I trained a network on MNIST (which is 60k images in grayscale) and then tried transferring over the pre-trained results on a different task. In my experience it was not worth the effort, but perhaps because I had sufficient data. I think what probably makes more sense is converting your grayscale images to RGB and then using the pretrained imagenet weights as normal. The farther your images look from imagenet the less the pretrained networks weights seem to have value though. An alternative approach would be to take a parallel task that has a lot of data that looks more similar to the features of your data. For example an old Kaggle Competition on X-Rays or CT images probably has pretrained weights and networks lying around through github and their forums.