Using images of different sizes when training a ConvNet

(Shivam) #1

In the lecture 7 CIFAR-10 notebook’s ConvNet implementation, the convolutional layers in PyTorch take the number of input and output channels (and the kernel settings) as parameters, not the image size. Hence they impose no image size requirement. And to classify ‘N’ labels we just use an adaptive pooling layer to get the desired last-layer size.
However, in the get_data function we are still passing the image size (32 x 32). Is it possible to keep the original dimensions of the images while training the ConvNet and use tfms only for data augmentation?
Also, is it in general a good idea to train on original images of varying sizes, or is it better to resize them all to the same size first?
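
To illustrate the adaptive pooling point above, here is a minimal pure-Python sketch (not the notebook's code; the function name `adaptive_avg_pool2d_sketch` is made up for illustration). It assumes the usual adaptive-pooling rule where each output cell averages a bin of the input whose boundaries are computed from the input size, so the output size is fixed no matter how large the feature map is.

```python
def adaptive_avg_pool2d_sketch(fmap, out_h, out_w):
    """Average-pool a 2D list of floats down to out_h x out_w for any input size."""
    in_h, in_w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(out_h):
        # Row bin: start is floor(i * in_h / out_h), end is ceil((i + 1) * in_h / out_h).
        r0 = (i * in_h) // out_h
        r1 = ((i + 1) * in_h + out_h - 1) // out_h
        row = []
        for j in range(out_w):
            # Column bin, computed the same way.
            c0 = (j * in_w) // out_w
            c1 = ((j + 1) * in_w + out_w - 1) // out_w
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


# Two feature maps of different sizes end up with the same pooled shape.
small = [[1.0] * 32 for _ in range(32)]
large = [[1.0] * 75 for _ in range(50)]
pooled_small = adaptive_avg_pool2d_sketch(small, 2, 3)
pooled_large = adaptive_avg_pool2d_sketch(large, 2, 3)
print(len(pooled_small), len(pooled_small[0]))
print(len(pooled_large), len(pooled_large[0]))
```

So the layers after the pooling step always see the same shape. What this does not solve is batching: images stacked into one batch tensor still need to share a size.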

(Matthijs) #2

YOLO uses something called multi-scale training: every 10 batches it changes the input size of the model, randomly picking a size from 320 to 608 in steps of 32. This is done to make the model more robust to images of different input sizes.
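
A rough sketch of that size schedule in plain Python, just to make the idea concrete. This is not YOLO's actual code; the function name `multiscale_sizes` and its parameters are made up for illustration. It only produces the input size to use for each batch; resizing the images in each batch to that size is left to the data pipeline.

```python
import random


def multiscale_sizes(num_batches, every=10, low=320, high=608, step=32, seed=None):
    """Return one input size per batch, re-drawn at random every `every` batches."""
    rng = random.Random(seed)
    # Candidate sizes: 320, 352, ..., 608 (all multiples of 32).
    choices = list(range(low, high + 1, step))
    sizes = []
    size = None
    for batch_idx in range(num_batches):
        if batch_idx % every == 0:
            # Pick a new size at the start of each block of `every` batches.
            size = rng.choice(choices)
        sizes.append(size)
    return sizes


schedule = multiscale_sizes(30, seed=0)
for start in range(0, 30, 10):
    print("batches", start, "to", start + 9, "use size", schedule[start])
```

Note that within a batch all images still share one size; only the size used from one block of batches to the next changes.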

(Shivam) #3

@machinethink thanks. Can we use such functionality with fastai?