Normalize for pretrained=False resnet

hwasiti · November 30, 2018, 7:51pm

Jeremy was kind enough to answer my question in class after it got 13 votes. I am copying the answer from @hiromi's lesson 3 notes:

For a dataset very different from ImageNet, like the satellite images or the genomic mutation-point images shown in lesson 2, we should use our own stats.

Jeremy once said in the forum:

If you’re using a pretrained model you need to use the same stats it was trained with.

Why is that? Wouldn’t a dataset normalized with its own stats have roughly the same distribution as ImageNet?
[1:46:53]

Jeremy:
Nope. As you can see, I’ve used pre-trained models for all of those things. Every time I’ve used an ImageNet pre-trained model, I’ve used ImageNet stats. Why is that? Because that model was trained with those stats.

For example, imagine you’re trying to classify different types of green frogs. If you were to use your own per-channel means from your dataset, you would end up converting them to a mean of zero, a standard deviation of one for each of your red, green, and blue channels. Which means they don’t look like green frogs anymore. They now look like grey frogs. But ImageNet expects frogs to be green. So you need to normalize with the same stats that the ImageNet training people normalized with. Otherwise the unique characteristics of your dataset won’t appear anymore; you’ve actually normalized them out in terms of the per-channel statistics. So you should always use the same stats that the model was trained with.
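To make the green-frog point concrete, here is a minimal sketch in plain Python (not code from the lesson). The `frog_pixels` values and the helper names are made up for illustration; the ImageNet mean and std are the per-channel values commonly used with ImageNet-pretrained models. It compares the per-channel averages after normalizing a mostly-green toy dataset with its own stats versus with ImageNet stats.

```python
from statistics import mean, pstdev

# Per-channel RGB stats commonly used with ImageNet-pretrained models.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

# Hypothetical toy "dataset": a few mostly-green pixels, RGB scaled to [0, 1].
frog_pixels = [
    (0.20, 0.70, 0.15),
    (0.25, 0.80, 0.20),
    (0.15, 0.60, 0.10),
    (0.30, 0.75, 0.25),
]


def channel_stats(pixels):
    """Return (means, stds) computed separately for the R, G and B channels."""
    channels = list(zip(*pixels))
    return (tuple(mean(c) for c in channels),
            tuple(pstdev(c) for c in channels))


def normalize(pixels, means, stds):
    """Apply (value - mean) / std to every channel of every pixel."""
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds))
            for p in pixels]


def channel_means(pixels):
    """Average value of each channel over all pixels."""
    return tuple(mean(c) for c in zip(*pixels))


own_mean, own_std = channel_stats(frog_pixels)
own_normed = channel_means(normalize(frog_pixels, own_mean, own_std))
imagenet_normed = channel_means(
    normalize(frog_pixels, IMAGENET_MEAN, IMAGENET_STD))

print("normalized with own stats:     ", own_normed)
print("normalized with ImageNet stats:", imagenet_normed)
```

With the dataset's own stats, every channel averages out to roughly zero, so the "these images are mostly green" signal is gone. With ImageNet stats, the green channel stays well above red and blue, which is the kind of input an ImageNet-pretrained network saw during training. For a model trained from scratch there are no earlier stats to match, so your own dataset's stats are a reasonable choice.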