Just a little correction: your timeline (thanks for this!) seems to indicate I wrote the 1cycle paper, which isn't true. It was Leslie Smith who wrote it; I just wrote a blog post about it.
Hi Shubhaijt. Please don’t tag Jeremy and Rachel unless other people in the forums cannot help you, see Etiquette for Posting to Forums.
The ‘highmem’ in your instance name means you have a lot of RAM, but the error you are getting refers to GPU memory; the two are not related. Have you tried decreasing the batch size to 32?
Thanks for the information, @lesscomfortable .
Okay, I will try decreasing the batch size to 32.
But since this GCP instance is the recommended one, I didn't expect this problem to occur!
In the 1st video of an earlier version of the fast.ai course (I could be wrong, but I do remember this), Jeremy said that for images of real-world objects like those used in ImageNet, the model would do well on any set of images chosen by the participants, as long as they were of day-to-day real-world objects.
Also, in the Galaxy Zoo link you mentioned, the author specifically says: "Transfer learning by pre-training a deep neural network on another dataset (say, ImageNet), chopping off the top layer and then training a new classifier, a popular approach for the recently finished Dogs vs. Cats competition, is not really viable either".
It is good practice to adjust the batch size according to your GPU's capacity. Together with lr_finder, it is something we need to do every time we approach a new problem, and it is also an opportunity to get familiar with the ImageDataBunch class.
I guess it makes sense, but I’m just trying to confirm I haven’t done anything wrong and to get a better intuition of why the results are worse.
Is it because ImageNet doesn’t have the right nodes to reliably activate for features that make a difference in categorising images of things it hasn’t seen before (e.g. bands of stars) vs things it has (e.g. facial features, geometric shapes etc)? Would it be better to start with a different model? A larger sample/ data set?
I would suggest that you wait until Jeremy replies about the notebook you shared with him.
Is it because ImageNet doesn’t have the right nodes to reliably activate for features that make a difference in categorising images of things it hasn’t seen before (e.g. bands of stars) vs things it has (e.g. facial features, geometric shapes etc)?
Perhaps. Maybe in the initial, lower layers, some of the learned features could be common between galaxies and real-world objects.
But in the higher, more composed layers, the galaxy-specific features may not have been learned.
Would it be better to start with a different model?
Maybe, yes.
A larger sample/ data set?
Maybe not.
But I would still suggest waiting for Jeremy's reply.
Also, in the 2nd video of the previous course, at around 43:49, Jeremy stated the following:
* Images like satellite images, CT scans, etc. have totally different kinds of features altogether (compared to ImageNet images), so you want to re-train many layers.
* For dogs and cats, images are similar to what the model was pre-trained with, but we may still find it helpful to slightly tune some of the later layers.
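The distinction above can be sketched schematically. This is a plain-Python illustration of the freezing idea, not the fastai API; the layer names and the `trainable` flag are hypothetical stand-ins for what a framework would track:

```python
# Schematic of layer freezing for transfer learning.
# A model is represented as an ordered list of layer records;
# "trainable" marks whether that layer would be updated during fine-tuning.

def freeze_all_but_last(layers, n_unfrozen):
    """Mark only the last n_unfrozen layers as trainable."""
    for i, layer in enumerate(layers):
        layer["trainable"] = i >= len(layers) - n_unfrozen
    return layers

# Hypothetical pretrained network: early layers learn generic features
# (edges, textures), later layers learn task-specific compositions.
model = [{"name": f"layer{i}", "trainable": True} for i in range(5)]

# Dogs-vs-cats style problem: data resembles ImageNet,
# so tune only the final layer(s).
freeze_all_but_last(model, 1)
print([l["name"] for l in model if l["trainable"]])  # ['layer4']

# Satellite/galaxy images: features differ a lot, so unfreeze more layers.
freeze_all_but_last(model, 4)
print([l["name"] for l in model if l["trainable"]])  # ['layer1', 'layer2', 'layer3', 'layer4']
```

In fastai terms this corresponds to calling freeze/unfreeze before fitting, but the snippet is only meant to show why "how many layers to re-train" depends on how far your data is from ImageNet.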
Check this cell in the Training: resnet50 section:

```python
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=299, bs=48)
```
You can reduce the batch size there, e.g. `bs=32`.
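If `bs=32` still runs out of GPU memory, a common pattern is to keep halving the batch size until a training step fits. A minimal sketch of that loop (plain Python; `try_batch` and `fake_step` are hypothetical stand-ins for a real training step that raises on out-of-memory):

```python
def find_workable_batch_size(try_batch, bs=48, min_bs=4):
    """Halve bs until try_batch(bs) succeeds or we drop below min_bs."""
    while bs >= min_bs:
        try:
            try_batch(bs)   # run one step at this batch size
            return bs
        except MemoryError:  # stand-in for a CUDA out-of-memory error
            bs //= 2
    raise RuntimeError("No batch size fits in GPU memory")

# Pretend the GPU can only hold batches of 24 or fewer.
def fake_step(bs):
    if bs > 24:
        raise MemoryError

print(find_workable_batch_size(fake_step))  # 24
```

In practice you would just edit `bs` in the `ImageDataBunch` call and restart the kernel, since GPU memory from the failed run isn't always released.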