I’m trying to use ResNet34 (pretrained) with a dataset containing 6-channel images. I modified the model to take these images as input, but I’m not really sure how to normalize them. Do I have to apply the ResNet stats to the first 3 channels and then again to the last 3?
What are the 6 channels representing?
To get the ResNet weights to work, I believe you are going to have to find a way to reduce the number of channels from 6 to 3. It is possible to make a neural net that works with 6 channels, but that is more of a part 2 topic. You would then apply the ResNet stats to the reduced 3-channel images as normal. Feel free to experiment with this to see what works best, as that is also a learning experience.
Interesting approach @marii - I usually recompute stats for all channels and normalize the training set as usual during databunch creation with .normalize(). I agree with you about the importance of the “nature” of images.
You can leverage the imagenet normalization only if you’re going to use “similar” images (same number of channels, similar luminance and a wide variety of subjects).
If you’re changing the number of channels, or the subject (e.g. training on charts), you need to normalize the data according to your own distribution.
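To make “normalize according to your distribution” concrete, here is a minimal sketch in plain PyTorch of how per-channel stats could be recomputed for 6-channel images. The helper names `channel_stats` and `normalize` are made up for this example (they are not fastai functions), and the random tensors only stand in for real batches of shape (N, C, H, W).

```python
import torch

def channel_stats(batches):
    """Per-channel mean and std over an iterable of (N, C, H, W) tensors.

    Hypothetical helper for illustration; not part of fastai.
    """
    n_pixels, total, total_sq = 0, None, None
    for x in batches:
        x = x.double()  # accumulate in float64 to limit rounding error
        s = x.sum(dim=(0, 2, 3))          # one sum per channel
        sq = (x ** 2).sum(dim=(0, 2, 3))  # one sum of squares per channel
        total = s if total is None else total + s
        total_sq = sq if total_sq is None else total_sq + sq
        n_pixels += x.shape[0] * x.shape[2] * x.shape[3]
    mean = total / n_pixels
    # Population std from E[x^2] - E[x]^2, clamped against tiny negatives
    std = (total_sq / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean.float(), std.float()

def normalize(x, mean, std):
    """Normalize a (N, C, H, W) batch with per-channel mean and std."""
    return (x - mean[None, :, None, None]) / std[None, :, None, None]

# Stand-in data: three random batches of 6-channel "images"
batches = [torch.rand(8, 6, 16, 16) for _ in range(3)]
mean, std = channel_stats(batches)
normed = normalize(batches[0], mean, std)
```

The result is six means and six stds, one per channel, so nothing is borrowed from the ImageNet stats. They could presumably be passed to .normalize() in the same (means, stds) form as the stats tuple used further down in this thread.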
You can think of pretrained weights as a very good initialization for your network; so, especially when you unfreeze the first layer, it’s important that you train your network with normalized images.
Moreover (on resnet architectures), input normalization matters most for the first layer, because after that there is a BatchNorm2d that rebalances the output of the first convolution:
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    ...
Train on MNIST with bad normalization:
bad_stats = ([.01,.3,.9], [.7,.01,.73])  # random values
data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_SAMPLE),bs=128).normalize(bad_stats)
actsh = partial(ActivationsHistogram,liveChart=False,modulesId=range(6),hMin=-10,hMax=10,nBins=200)
learn = cnn_learner(data, models.resnet18, callback_fns=actsh, metrics=[accuracy])
learn.unfreeze()
learn.fit_one_cycle(4)
learn.activations_histogram.plotActsHist(cols=6,figsize=(20,2),showEpochs=False)
Train on MNIST with auto normalization:
data = ImageDataBunch.from_folder(untar_data(URLs.MNIST_SAMPLE),bs=128).normalize()
As you can see, even though the output of the first convolution is so different, after BatchNorm2d the activations return to a normal range with or without proper input normalization.
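The same effect can be checked without training anything. The toy sketch below is not the MNIST experiment above: it assumes a randomly initialized conv + BatchNorm2d stem in training mode and random inputs, and only compares the spread of activations before and after the batch norm for a reasonably scaled input and a badly scaled one.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same shapes as the first two ResNet layers, but randomly initialized
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
)
stem.train()  # training mode: BatchNorm2d uses the statistics of the current batch

def stem_stats(x):
    """Return (std after conv, mean after batch norm, std after batch norm)."""
    with torch.no_grad():
        conv_out = stem[0](x)
        bn_out = stem[1](conv_out)
    return conv_out.std().item(), bn_out.mean().item(), bn_out.std().item()

x = torch.rand(32, 3, 64, 64)      # stand-in images with values in [0, 1)
good_input = (x - 0.5) / 0.29      # roughly zero mean, unit variance
bad_input = x * 50 + 10            # deliberately badly scaled

good = stem_stats(good_input)
bad = stem_stats(bad_input)
print("conv std: good=%.2f bad=%.2f" % (good[0], bad[0]))
print("bn std:   good=%.2f bad=%.2f" % (good[2], bad[2]))
```

The convolution outputs differ a lot in scale, while the outputs after BatchNorm2d end up with mean close to 0 and std close to 1 in both cases. This only holds while batch statistics are used; in eval mode the running stats take over.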
NOTE: When you change the model to accept a number of channels other than 3, you reset the weights for that layer (usually the first one), so it’s useful to initialize those weights with something meaningful (e.g. copying from the other channels, RGB -> RGBRGB) before unfreezing them.
See def adapt_first_layer(src_model, nChannels).
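For readers who just want the idea, here is a rough sketch of the RGB -> RGBRGB copy in plain PyTorch. It is not the adapt_first_layer function referenced above; the name `expand_conv_channels` is hypothetical, it assumes a plain convolution with groups=1, and the rescaling step is my own choice rather than something stated in this thread.

```python
import torch
import torch.nn as nn

def expand_conv_channels(conv, n_channels):
    """Build a conv with `n_channels` inputs whose weights are tiled from `conv`.

    Hypothetical helper for illustration; assumes conv.groups == 1.
    """
    new_conv = nn.Conv2d(
        n_channels, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        old_w = conv.weight           # shape: (out, in, kh, kw)
        in_old = old_w.shape[1]
        for c in range(n_channels):
            # New channel c reuses the filter of old channel c mod in_old
            new_conv.weight[:, c] = old_w[:, c % in_old]
        # Assumption: rescale so the summed response keeps a similar magnitude
        new_conv.weight *= in_old / n_channels
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv

# Stand-in for a pretrained first layer (randomly initialized here)
conv3 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
conv6 = expand_conv_channels(conv3, 6)
```

With a torchvision ResNet this would be something like model.conv1 = expand_conv_channels(model.conv1, 6); in the fastai model printed above, the first convolution sits at index [0][0]. With the rescaling, feeding the same RGB image twice (RGBRGB) gives the same output as the original layer on RGB, which seems like a sensible starting point before unfreezing.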
Would you explain how you modified the pretrained resnet to take 6 channels instead of 3? That seems to be a much harder issue than normalization. And it would influence how to best normalize.
@ste I was trying to find a way to get it to work while introducing as few new deep learning concepts as possible. The person who asked seems to currently be in part 1 of the course. My recommendation does hinge on the images being at least “similar to imagenet”.
That’s pretty clever!
I have a kernel over here that goes over how to work with 6-channel images in fastai: