When we did multilabel classification on the planet dataset, we decreased the image size from 256 to 64. After augmenting the data, we ran:
data = data.resize(int(sz*1.3), 'tmp')
Jeremy says this is for a computational advantage that results in a speed-up, but I don't really understand how exactly it speeds things up. Do we ignore all the images with a size greater than or equal to sz*1.3? That would be faster, but it would leave out a lot of images and limit what the model can learn.
It resizes the images and copies the resized versions into the 'tmp' folder. I don't see why he multiplies the size by 1.3.
Maybe the speed-up comes from the fact that he saves the resized images in the tmp folder.
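    def resize_imgs(fnames, targ, path, new_path):
        # Skip the work if the first file already exists at this target size.
        if not os.path.exists(os.path.join(path,new_path,str(targ),fnames[0])):
            # Resize in parallel on 8 threads.
            with ThreadPoolExecutor(8) as e:
                ims = e.map(lambda x: resize_img(x, targ, path, 'tmp'), fnames)
                # Iterate over the results to wait for them and update the progress bar.
                for x in tqdm(ims, total=len(fnames), leave=False): pass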
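To make the "save once, skip next time" idea concrete, here is a minimal sketch using only the standard library. It is not fastai's code: fake_resize and resize_all are hypothetical names, and instead of resizing a real image the stand-in just writes a small marker file to the destination path. It only shows the pattern of checking whether the output already exists and doing the work in a thread pool.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fake_resize(fname, targ, path, new_path):
    """Hypothetical stand-in for an image resize: writes a marker file.

    Returns True if it did the work, False if the output already existed.
    """
    dest = os.path.join(path, new_path, str(targ), fname)
    if os.path.exists(dest):
        return False  # already produced on an earlier run
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(dest, "w") as f:
        f.write(f"{fname} resized to {targ}")
    return True


def resize_all(fnames, targ, path, new_path, workers=8):
    """Run fake_resize over all files in a thread pool; count the work done."""
    with ThreadPoolExecutor(workers) as e:
        results = e.map(lambda x: fake_resize(x, targ, path, new_path), fnames)
        return sum(results)  # consuming the iterator waits for every task


root = tempfile.mkdtemp()
names = [f"img_{i}.jpg" for i in range(20)]

first = resize_all(names, 83, root, "tmp")   # does the work for all 20 files
second = resize_all(names, 83, root, "tmp")  # finds the files and skips them
print(first, second)  # 20 0
```

The second call does nothing because every output file is already in tmp/83, which is the same reason a second call to data.resize() with the same size should be cheap.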
After the assignment, tfms is passed to the data loader, ImageClassifierData.from_csv(...). One of the things the data loader does is resize the images on the fly. In other words, the images go into the data loader at full size, and it ONLY resizes them when it outputs them to the model via a Python generator.
Notice that calling data.resize(int(sz * 1.3), 'tmp') allows us to resize the images in advance. Here, data.resize() takes a maximum size (e.g. sz=64, so the maximum size is int(sz * 1.3) = 83), so each image with a dimension bigger than sz * 1.3 will be resized. Since images can be rectangular, data.resize() makes sure the smallest dimension becomes sz * 1.3 and then center-crops the image.
Importantly, it saves a resized copy of the entire dataset in a temporary folder, tmp.
This technique will save you a lot of time because, according to Jeremy, if you have large images, say 1000 by 1000, reading the images and resizing them to 64 by 64 before doing SGD during training will take more time than training the convolutional nets alone! By calling data.resize() in advance, we can resize the images once and then do SGD multiple times efficiently.
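A rough back-of-the-envelope sketch of why this helps, in plain Python. The function names are made up for illustration, and pixel count is only a crude proxy for decode and resize cost; actual timings depend on the image format, disk, and library.

```python
def pixel_ratio(full_side, sz, factor=1.3):
    """Pixels in a full_side x full_side image per pixel in the pre-resized copy."""
    targ = int(sz * factor)  # 83 for sz=64
    return (full_side * full_side) / (targ * targ)


def full_size_reads(n_images, epochs, presized):
    """How many times a full-size image has to be read and resized."""
    # Pre-resizing touches each full-size image once; resizing on the fly
    # touches each one again in every epoch.
    return n_images if presized else n_images * epochs


ratio = pixel_ratio(1000, 64)
print(round(ratio, 1))  # 145.2

print(full_size_reads(10_000, 10, presized=False))  # 100000
print(full_size_reads(10_000, 10, presized=True))   # 10000
```

So with 1000 by 1000 originals, each full-size read handles roughly 145 times as many pixels as reading the 83 by 83 copy, and without the cached copies that cost is paid again in every epoch.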
AFAIK, if the dimensions of the image are both smaller than sz * 1.3, the image size remains the same. Otherwise, if one of the dimensions is larger than sz * 1.3, the smallest dimension will become sz * 1.3.
For example, if we have a 100 by 200 image and sz=64, using the 1.3 factor, we want the maximum image dimension to be int(sz * 1.3) = 83. Since the smallest dimension of the image is 100, we resize that dimension from 100 to 83. Assuming the aspect ratio is preserved, the other dimension scales from 200 to 166. Next, we center-crop the resized 83 by 166 image, so we'll end up with an 83 by 83 image.
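Here is the same arithmetic as a small Python sketch. It follows the rule as described in this thread, not fastai's actual source, and it assumes the resize preserves the aspect ratio. The names presized_shape and center_crop_box are hypothetical.

```python
def presized_shape(width, height, sz, factor=1.3):
    """Shape after the pre-resize step, following the rule described above."""
    targ = int(sz * factor)
    if max(width, height) <= targ:
        return width, height  # neither dimension exceeds targ: keep as-is
    smallest = min(width, height)
    # Scale both sides by targ / smallest so the smallest side becomes targ.
    return round(width * targ / smallest), round(height * targ / smallest)


def center_crop_box(width, height):
    """(left, top, right, bottom) of the centered square with side min(w, h)."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


shape = presized_shape(100, 200, 64)
print(shape)                    # (83, 166)
print(center_crop_box(*shape))  # (0, 41, 83, 124)
print(presized_shape(60, 70, 64))  # (60, 70)
```

The crop box is 83 wide and 83 tall, which matches the 83 by 83 result in the worked example.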