Feeding a grayscale image to a ResNet model

I am trying to perform real-time emotion detection using OpenCV. I've trained the model and got an accuracy of 66%. The model was trained on grayscale images. Now I'm trying to integrate the model into a Python script that uses OpenCV.

My problem is that the model always predicts a neutral emotion, which is clearly incorrect. So I tried capturing a grayscale image from the webcam and feeding it to the trained CNN, and it got the prediction right. It seems, then, that I should be converting the frames to grayscale before passing them to learn.predict. I have tried to do that using OpenCV's gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY).
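A minimal sketch of that step (variable names are illustrative; the full script is in the pastebin link below):

```python
import cv2

cap = cv2.VideoCapture(0)                        # default webcam
ret, frame = cap.read()                          # frame is a BGR array of shape (H, W, 3)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # single channel, shape (H, W)
```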

This results in an error because the model expects an image with 3 channels:

RuntimeError: Given groups=1, weight of size 64 3 7 7, expected input[1, 1, 64, 64] to have 3 channels, but got 1 channels instead.

Is there a way to convert the grayscale image to 3 channels other than the way I am doing it?
Here is the code for reference: https://pastebin.com/DFp5Vxd8

You can normalize with ImageNet stats, which will automatically make the input three-channel, or you can change the first layer in the ResNet to accept one channel by setting its input filters to 1 instead of 3.
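For the second option, a rough sketch in plain PyTorch/torchvision (assuming a ResNet-34; the replaced layer loses its pretrained weights and needs fine-tuning):

```python
import torch.nn as nn
from torchvision import models

model = models.resnet34(pretrained=True)
# Swap the stem convolution so it takes 1 input channel instead of 3.
# The new layer is randomly initialised, so the model must be retrained or fine-tuned.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
```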


Managed to solve it using np.stack((roi_resize,)*3, axis=-1).
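In context, the preprocessing ends up looking roughly like this (the 64x64 size is an assumption based on the error message above):

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
ret, frame = cap.read()                          # BGR frame from the webcam

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # (H, W), single channel
roi_resize = cv2.resize(gray, (64, 64))          # match the model's 64x64 input
# Repeat the grayscale plane three times so the array has shape (64, 64, 3),
# which satisfies the 3-channel input the ResNet expects.
rgb_like = np.stack((roi_resize,) * 3, axis=-1)
```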
