Input dimensions for modern networks

mefmef · November 8, 2017, 4:07pm

VGG and AlexNet only accept a fixed input size (224*224, I believe). Has that changed in ResNet? If so, how?
What is the fundamental difference?

jeremy · November 8, 2017, 4:14pm

Yes, ResNet’s penultimate layer is a global average pooling layer that pools each feature map down to a 1x1 size. Pretty much all modern architectures do this, and fastai does a neat trick that converts all architectures (including VGG) to this approach!

mefmef · November 8, 2017, 4:22pm

Thank you. Just so I understand better:
You are saying that AlexNet cannot deal with different input sizes because the layer before the FC layers would end up with different spatial dimensions (width * height * filter_bank_size)? And they fix it by pooling to (1 * 1 * filter_bank_size)?

jeremy · November 8, 2017, 4:23pm

Exactly! And nicely described.
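
For anyone who wants to see the shape argument concretely, here is a minimal sketch in plain Python (no framework; this is not fastai or PyTorch code). The helper names `make_feature_map`, `flatten`, and `global_avg_pool`, and the sizes used, are made up for illustration. It shows that flattening the last conv output gives a vector whose length depends on the input's spatial size, while pooling each channel down to 1x1 always gives one value per filter.

```python
# Feature maps are represented as nested lists: channels x height x width.
# All names and sizes here are hypothetical, chosen only for illustration.

def make_feature_map(channels, height, width, fill=1.0):
    """Build a dummy feature map filled with a constant value."""
    return [[[fill for _ in range(width)] for _ in range(height)]
            for _ in range(channels)]

def flatten(fmap):
    """Flatten to one long vector, as done before a classic FC layer."""
    return [v for channel in fmap for row in channel for v in row]

def global_avg_pool(fmap):
    """Average every channel over its full spatial extent (pool to 1x1)."""
    pooled = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

# Same number of filters, different spatial sizes (e.g. from different image sizes).
small = make_feature_map(channels=4, height=7, width=7)
large = make_feature_map(channels=4, height=10, width=10)

# Flattened length changes with spatial size, so a fixed FC weight matrix cannot fit both.
flat_sizes = (len(flatten(small)), len(flatten(large)))

# Pooled length equals the number of filters, regardless of spatial size.
pooled_sizes = (len(global_avg_pool(small)), len(global_avg_pool(large)))

print(flat_sizes, pooled_sizes)
```

The FC layer that follows the pooling step only ever sees a vector of length filter_bank_size, which is why the input image size no longer has to be fixed.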