I entered the Human Protein Atlas Image competition on Kaggle to practice what I have learned so far in Part 1. The competition seemed straightforward and similar to the Lesson 3 planet notebook: each image is multi-labeled from a set of 28 labels, each corresponding to the cellular location of a particular protein.
The problem is that each sample is split into 4 files, one for each channel of the image. In addition, the evaluation section says only the green channel is used for prediction. Based on this, I trained a model using only the files corresponding to the green channel and voilà: my F1 score was extremely low.
So my question is: if we are only going to use one channel for prediction, why can't we train a cnn_learner using only the green-channel files? I plotted the images from the other channels and they do show different properties, which makes me think I am probably not considering extra information they could provide to improve the model.
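One common workaround people discuss for this kind of per-channel file layout is to stack all four stains into a single multi-channel array and let the network learn which channels matter. Here is a minimal sketch of that idea (the `stack_channels` function and the channel ordering are my own illustration, assuming each stain can be read into a same-shaped 2-D array; this is not the competition's or fastai's official loader):

```python
import numpy as np

# One grayscale file per stain; order is an assumption for this sketch.
CHANNELS = ["red", "green", "blue", "yellow"]

def stack_channels(bands):
    """Stack same-shaped 2-D channel arrays into one (H, W, C) image.

    `bands` is a list of 2-D arrays, one per stain, in CHANNELS order.
    The result could be fed to a model whose first conv layer accepts
    4 input channels instead of the usual 3.
    """
    if len({b.shape for b in bands}) != 1:
        raise ValueError("all channels must share the same shape")
    return np.stack(bands, axis=-1)

# Example: four fake 512x512 stains -> one (512, 512, 4) image
fake = [np.zeros((512, 512), dtype=np.uint8) for _ in CHANNELS]
image = stack_channels(fake)
print(image.shape)  # (512, 512, 4)
```

The intuition is that the green channel marks the protein of interest, while the other stains give spatial context (e.g. where the cell structures are), so discarding them throws away signal even if the label is defined by the green stain.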
Link to competition: Human Protein Atlas Image Competition