Training ResNet with grayscale images

I have a question…
I want to train ResNet34 using grayscale images. How can I do that?
The way I can think of is:

  1. Modify the input channels from 3 to 1 in the head block of ResNet34

If this is the way, then I don't know which fastai file I should modify to do that, or is there any other way available to do so?
Please suggest the right way and approach…

You could do that, but until you have found the ideal net, it is easier to make your own open_image that replicates the gray channel to 3. Then you can try out more networks before you settle on the ideal net.

import numpy as np
import PIL
from fastai import vision
from fastai.vision import Image, pil2tensor

def open_image_16bit2rgb(fn):
    # Load the 16-bit single-channel image into a numpy array
    a = np.asarray(PIL.Image.open(fn))
    # Add a channel axis: (H, W) -> (H, W, 1)
    a = np.expand_dims(a, axis=2)
    # Replicate the gray channel three times to fake RGB: (H, W, 3)
    a = np.repeat(a, 3, axis=2)
    # Scale the 16-bit range down to [0, 1] and wrap in a fastai Image
    return Image(pil2tensor(a, np.float32).div(65535))

# Monkey-patch fastai's loader so every DataBunch uses it
vision.data.open_image = open_image_16bit2rgb
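
For example (a minimal sketch; the folder layout and size here are my assumptions), once the loader is patched you build a DataBunch as usual:

from fastai.vision import ImageDataBunch, imagenet_stats

# Hypothetical folder with train/ and valid/ subfolders
data = ImageDataBunch.from_folder('data/grayscale', size=224).normalize(imagenet_stats)

Since the loader fakes 3 channels, normalizing with imagenet_stats still broadcasts correctly.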


I adopted a similar fake-RGB method, but I observed that the model was not converging; the loss just circled around some value.
Instead I used the following method:

  1. Modified imagenet_stats from a list of 3 items to a list of 1 item, as broadcasting would fail while normalizing in the transform.

  2. Reshaped the image to a single-channel output, channel last. The channel-order transformation then rearranges it to channel first:
    x = np.rollaxis(x.reshape(x.shape[0], x.shape[1], 1), 2)
    We do cv2.resize in some transforms, which wipes out the channel axis, so we reshape first to complete the channel-first rearrangement.

  3. Created a custom ResNet, all the same except the head's input changed from 3 channels to 1 (see the sketch after this list).
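
To illustrate step 3, here is a minimal sketch of that channel change on torchvision's resnet34 (the pretrained-weight summing is my addition, not necessarily what was done above):

import torch
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)
old_conv = model.conv1  # Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Swap the stem conv so it accepts 1 input channel instead of 3
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Optionally reuse the pretrained RGB filters by summing them down to
# a single grayscale filter, keeping the learned structure
with torch.no_grad():
    model.conv1.weight.copy_(old_conv.weight.sum(dim=1, keepdim=True))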

This is in the old fastai. I have yet to look at the methods/libraries in the new one…

Not sure if the new one has a much simpler method.
Maybe Jeremy can throw some better ideas here :slight_smile:
Also curious about the usefulness of grayscale images in segmentation problems.

Can you share your notebook? Yes, I would like to see how this could be done in the new version as well. I am working with the Salt Segmentation dataset.

I've had success just saving arrays as 8-bit single-channel images.

Example of an image I am using:

That is a Dropbox link, so you should be able to download the raw image file.

Basically I take my numpy array of 16-bit unsigned ints and normalize it down to 8-bit unsigned ints:

# Standardize to zero mean and unit variance
array = (array - array.mean()) / array.std()
# Map roughly [-1, 1] into [0, 255]
array = (array + 1) / 2 * 255
# Clip outliers and convert to 8-bit
array = np.clip(array, 0, 255).astype(np.uint8)

I saved all my images like that and then just put them into a DataBunch just as if they were color images. I don't add 2 more channels or anything, just shove them in as-is…
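
Putting it together, a minimal sketch (the helper name and the use of PIL for saving are my assumptions, not the poster's exact code):

import numpy as np
import PIL.Image

def save_16bit_as_8bit_png(array, out_path):
    # Normalize the 16-bit array down to 8-bit as described above
    array = (array - array.mean()) / array.std()
    array = (array + 1) / 2 * 255
    array = np.clip(array, 0, 255).astype(np.uint8)
    # Write a single-channel ('L' mode) PNG
    PIL.Image.fromarray(array, mode='L').save(out_path)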

What is the dimension of your image, i.e. its shape, after doing the above transformations?

It's the same shape as my original array, which was 200 columns by 1081 rows.

Sorry, I didn't get it right…
The model accepts (3, x, y); grayscale without mimicking RGB would be (1, x, y).
So what was your final dim that was going into the ResNet head?

I don't know what the DataBunch is doing, but I am saving single-channel PNG files and creating a DataBunch with those images.

I am not creating the 2nd or 3rd channel. I believe the PNG is grayscale and only 1 channel.
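
A quick way to verify what is actually on disk (the file name is hypothetical). Note that, if I remember correctly, fastai v1's open_image converts images to RGB by default (convert_mode='RGB'), which would explain why single-channel PNGs work without any changes:

import numpy as np
import PIL.Image

img = PIL.Image.open('example.png')  # hypothetical path
print(img.mode)                      # 'L' means single-channel grayscale
print(np.asarray(img).shape)         # (rows, cols), no channel axis for 'L' mode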