Another doubt – in the code, @jeremy sets the size (sz=224) before passing it to the function tfms_from_model.
Why was 224 picked? I saw that reducing the size from 224 to, say, 24 also dropped the accuracy, to 75%. But I'm curious why 224 was picked by default.
The images don’t start out at 224 x 224 (the size ResNet34 was trained on), so a few calls down, this function uses OpenCV to scale the images appropriately when preparing the data.
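Very roughly, the resize step boils down to something like this (just an illustrative sketch with made-up names, not the actual fastai internals):

import cv2  # OpenCV

def scale_image(img, sz=224):
    # img is an H x W x 3 array; cv2.resize takes the target size as (width, height)
    return cv2.resize(img, (sz, sz), interpolation=cv2.INTER_AREA)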
I’m surprised you did as well as 75%, since the images you created by setting sz = 24 would have been roughly 1% of the size the model “expected”, ignoring color depth.
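Back-of-the-envelope:

expected = 224 * 224          # 50,176 pixels per channel
actual = 24 * 24              # 576 pixels per channel
print(actual / expected)      # ~0.0115, i.e. roughly 1%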
As for your original question, I’m not too sure; something to do with loading the pretrained model?
360 is the number of batches it takes to precompute the activations for the train set (~23,000 images at a batch size of 64).
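You can sanity-check the arithmetic (the image count is approximate):

import math
n_images = 23000                 # roughly the size of the train set
bs = 64                          # batch size
print(math.ceil(n_images / bs))  # 360 – the last batch is only partially full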
As @A_TF57 says, resnet34 was originally trained on 224x224 images IIRC, so it’s a reasonable starting point. We’ll get better results still with larger images. We’ll see that on Monday. Frankly, there’s nothing that special about 224x224, and I guess it’s kinda habit (it used to be that we had to use the exact same dimensions as the original model used, but that’s not the case any more).
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(resnet34, sz))
learn = ConvLearner.pretrained(resnet34, data, precompute=True)
I see that there are multiple GPU processes created and I wonder why. The first process, which takes up a larger amount of GPU memory, is the actual training process, I presume. What about the other, smaller processes?
The other processes are for the preprocessing. They don’t actually need to use the GPU, so it would be nice to find a way to stop them taking up GPU memory…
Exactly what I thought. My assumption was that the DataLoaders use multiple workers to read the data and apply the transformations on the CPU. Correct me if I’m wrong.
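That’s how plain PyTorch’s DataLoader behaves, at least; a minimal sketch (the toy dataset and worker count here are just placeholders, not fastai’s defaults):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for the image data
ds = TensorDataset(torch.randn(1000, 3, 224, 224), torch.randint(0, 2, (1000,)))

# num_workers > 0 spawns extra processes that load and transform batches on the CPU;
# nothing ends up on the GPU until you move it there yourself.
dl = DataLoader(ds, batch_size=64, shuffle=True, num_workers=4)

for xb, yb in dl:
    xb = xb.cuda()  # the batch is moved to the GPU here, in the main process
    break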