One of the oldest conventions in deep learning, also used in the fastai repo, is to normalize each input image with `imagenet_stats`.
This convention dates back to the days before batchnorm, when input normalization was undeniably important.
Nowadays, nearly every layer is followed by a batchnorm, so `imagenet_stats` can practically affect only the first layer.
I find that when training a model from scratch on ImageNet, I can skip `imagenet_stats` normalization and still get top results (I am currently training an xresnet50-like network and reaching >80% top-1 accuracy).
Wouldn't it be simpler to generate pre-trained models without `imagenet_stats` normalization, so that all future fine-tuning could skip this step?
Has anyone in recent years encountered a case where `imagenet_stats` actually gave better results than no normalization (just dividing all pixels by 255.0)?
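For concreteness, here is a minimal NumPy sketch of the two options being compared; the mean/std values are the standard ImageNet statistics used by fastai and torchvision, and the function names are just for illustration:

```python
import numpy as np

# Standard ImageNet per-channel statistics (RGB order), as used in
# fastai's `imagenet_stats` and torchvision's pretrained-model docs.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_imagenet(img_uint8):
    """Conventional approach: scale to [0, 1], then standardize per channel."""
    x = img_uint8.astype(np.float32) / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD

def normalize_simple(img_uint8):
    """Simpler alternative discussed above: just scale pixels to [0, 1]."""
    return img_uint8.astype(np.float32) / 255.0
```

Since the per-channel shift and scale are fixed constants, the first conv layer (followed by batchnorm) can in principle absorb them, which is the intuition behind the question.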