This thread is to discuss the pull request as per the subject line. See the comments at Add check to '_add_norm' by marib00 · Pull Request #3820 · fastai/fastai · GitHub for background.
The question is: how to best normalize greyscale (1-channel) images before feeding them into an ImageNet pretrained model?
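For concreteness, one common approach (and the one I'm assuming in the experiments below) is to replicate the single grey channel three times and then apply the per-channel ImageNet stats. A minimal numpy sketch, using the standard ImageNet mean/std values (the same numbers fastai's `imagenet_stats` holds); the function name is just for illustration:

```python
import numpy as np

# Standard ImageNet per-channel stats (same values as fastai's imagenet_stats)
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize_grayscale(img, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Replicate a 1-channel (H, W) image to 3 channels, then apply
    per-channel normalization -> returns a (3, H, W) array."""
    img3 = np.stack([img] * 3)  # fake RGB by copying the grey channel
    return (img3 - mean[:, None, None]) / std[:, None, None]

gray = np.full((4, 4), 0.5)  # toy grayscale image with values in [0, 1]
out = normalize_grayscale(gray)
```

After this, each output channel is the same image shifted/scaled by a different mean/std pair, which is what a pretrained RGB model expects at its input.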
I have run some experiments to try to quantify the effect of normalization in transfer learning, using imagenet_stats (0 in the tables below), actual_stats (computed from the dataset in question; 1 in the tables) and no normalization (2 in the tables). Notebook available here if anyone wants to play: Google Colab
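By "actual_stats" I mean per-channel mean/std computed over the training images themselves. A rough sketch of how one might compute them, pooling pixels across the whole batch (helper name is mine, not a fastai API):

```python
import numpy as np

def dataset_stats(images):
    """Per-channel mean/std over a list of (C, H, W) images,
    pooling pixels across all images - one way to get 'actual_stats'."""
    batch = np.stack(images)  # (N, C, H, W)
    mean = batch.mean(axis=(0, 2, 3))  # average over images and pixels
    std = batch.std(axis=(0, 2, 3))
    return mean, std

# Toy example: 10 random 3-channel 8x8 "images" with values in [0, 1]
imgs = [np.random.rand(3, 8, 8) for _ in range(10)]
mean, std = dataset_stats(imgs)
```

In a real pipeline these stats would be computed once on the training set (ideally after any augmentation that shifts pixel distributions) and then passed to the normalization transform in place of imagenet_stats.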
For the PETS dataset it’s slightly better to normalize with imagenet_stats than with actual_stats for both models (I can’t really explain this - maybe actual_stats don’t reflect all the augmentations etc.?). Skipping normalization entirely doesn’t make much difference either (in fact, resnet34 is slightly better this way)!
For the SIIM dataset, the results are more intuitive for convnext: as the images are x-rays, their stats are quite different from ImageNet’s, and it is better to normalize using the actual stats. Yet it is even better not to normalize at all! For resnet34 the results are actually reversed! And here I was, hoping these experiments would actually clarify things…
Either way, the standard practice of normalizing using imagenet_stats seems to be the safest bet, although not necessarily always the best.
For segmentation the results are not much more conclusive either, apart from the TCGA (medical) dataset, where no normalization gives really bad results, while using the actual (rather than ImageNet) stats helps quite a bit!
I’m not really sure what the take-home message is here. Provided I don’t have any bugs in the code, it’s rather hard to formulate any strong conclusions.