Understanding get_data lesson 2v2

Here is what I believe the get_data function looks like:

    def get_data(sz, bs):
        tfms = tfms_from_model(arch, sz,
                               aug_tfms=transforms_side_on,
                               max_zoom=1.1)
        data = ImageClassifierData.from_csv(PATH,
                                            suffix='.jpg',
                                            tfms=tfms,
                                            bs=bs)
        return data if sz > 300 else data.resize(340, 'tmp')

The thing I do not understand is the return statement, which seems to be saying: if sz is greater than 300, just return data; otherwise return data.resize(340, 'tmp').
All the sz values so far have been below 300, so it is the second part we are interested in, and that is the bit I do not understand.
I can tell you that the practical effect is to create a directory dogbreeds/tmp/340
which has test and train subdirectories in it.
Inside test and train are the images, which seem to have been resized so that their size is (n, 340),
where n varies, probably whatever it was originally.
Can anybody explain what is happening?

I recall Jeremy mentioning that he uses the resize to make the data loading go faster.

It's possible that your confusion stems from the order of the function calls:

tf = tfms_from_model(...blah...)
data = ImageClassifierData.from_path(...blah..., tf, ...blah...)

At this point, neither transformation nor any resizing has been applied to the images. However, if you choose to speed things up, you can call data.resize which does the following:

  1. resizes the images (hopefully to something smaller!)
  2. stores them in a folder (tmp in this case) AND
  3. returns a different (but similar) data object that is pointing to the tmp folder from (2).

Afterwards, your learner can go on using data along with any transformation you specified in the tf step, but on the smaller images instead.
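
To make the three steps concrete, here is a rough sketch in plain Python. It is not the fastai implementation: ImageSet and target_size are made-up names for illustration, the actual image reading and writing is left out, and the assumption that the shorter side is scaled to the target while keeping the aspect ratio is only a guess that fits the (n, 340) sizes reported above.

```python
from dataclasses import dataclass, replace
from pathlib import Path


def target_size(width, height, targ):
    """Scale (width, height) so the shorter side equals targ (assumed behaviour)."""
    short = min(width, height)
    # Integer arithmetic avoids floating-point rounding surprises.
    return (width * targ // short, height * targ // short)


@dataclass(frozen=True)
class ImageSet:
    """Hypothetical stand-in for the data object; it only remembers where its images live."""
    path: Path

    def resize(self, targ, new_path):
        # Steps 1 and 2 would resize every image and save it under this folder;
        # the image I/O itself is omitted from this sketch.
        cache_dir = self.path / new_path / str(targ)
        # Step 3: hand back a new object that points at the cached copies.
        return replace(self, path=cache_dir)


data = ImageSet(Path("dogbreeds"))
small = data.resize(340, "tmp")
print(small.path.as_posix())        # dogbreeds/tmp/340
print(target_size(500, 375, 340))   # (453, 340)
```

The original object is left untouched, so nothing stops you from going back to the full-size images later.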

EDIT: and for a transformation size sz that is already over 300, I guess he decided to go ahead with using the original/bigger images.
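
The return line in get_data is just a Python conditional expression, `A if condition else B`. A tiny stand-alone illustration, where pick_source and the strings are made up for the example and the threshold of 300 comes from the function above:

```python
def pick_source(sz, threshold=300):
    # Same shape as: return data if sz > 300 else data.resize(340, 'tmp')
    return "original images" if sz > threshold else "resized copies in tmp/340"


for sz in (224, 299, 300, 350):
    print(sz, "->", pick_source(sz))
```

Note that sz equal to 300 still takes the else branch, because the test is strictly greater than.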

At least, that's what I understood :slight_smile:


Asif, I think you are correct, having now listened to the 3rd lesson, where Jeremy explains it.