Lesson 2 In-Class Discussion

Hi I am getting this error from running:

from fastai.transforms import *

File "fastai/torch_imports.py", line 26
if pre: load_model(m, f'{path}/weights/{fn}.pth')
^
SyntaxError: invalid syntax

Is there something I am missing here?

It is because VGG has fully connected layers, which have a fixed number of weights, and that number depends on the required input size.
These links are helpful:


https://www.quora.com/How-is-Fully-Convolutional-Network-FCN-different-from-the-original-Convolutional-Neural-Network-CNN

3 Likes
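The arithmetic behind this can be sketched quickly (assuming standard VGG16: five 2×2 max-pools and 512 final channels; the function name here is just illustrative, not fastai code):

```python
def fc_input_features(image_size, n_pools=5, channels=512):
    """Number of features flattened into VGG's first FC layer.

    Each 2x2 max-pool halves the spatial dims, so the final feature
    map is (image_size / 2**n_pools) on a side, times 512 channels.
    """
    spatial = image_size // (2 ** n_pools)
    return channels * spatial * spatial

print(fc_input_features(224))  # 25088 -- what VGG16's fc1 weights expect
print(fc_input_features(448))  # 100352 -- incompatible with those weights
```

A fully convolutional network avoids this because it has no layer whose weight count is tied to the spatial size.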

That makes sense! I did try printing out what zip returned, and it just told me that it was a zip object :sweat_smile:

Thank you for a great lesson yesterday!
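For reference, `zip()` in Python 3 returns a lazy iterator, which is why printing it only shows the object; wrapping it in `list()` materializes the pairs:

```python
# zip() is lazy in Python 3: printing it shows "<zip object at 0x...>".
pairs = zip([1, 2, 3], ['a', 'b', 'c'])
print(pairs)

# Materialize with list() to actually see the pairs.
print(list(zip([1, 2, 3], ['a', 'b', 'c'])))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

Note that an iterator can only be consumed once, so build a fresh `zip` (or convert to a list up front) if you need the pairs more than once.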

@jeremy will the Dog Breeds Walkthrough also be set up on AWS so one can step through it?

When center cropping an image (see red border), we may lose important details (the head and paws in this case). Should we use image resizing from rectangular to square instead?

8 Likes

We’ve had this question a few times already - please do a ‘search’ on the forum before posting. This error means that you’re not using python 3.6.

1 Like

Excellent question. This type of resizing is what keras does by default. I’ve found that it seems to generally work less well, since it has to learn how images look different depending on how they’re squeezed. But we do have the ability to use this squeezing approach in fastai - maybe @yinterian could show an example?

8 Likes
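For anyone who wants to see the geometric difference, here is a minimal sketch (plain arithmetic, not the actual fastai transforms): center-cropping keeps a square window and discards the edges, while squeezing keeps every pixel but distorts the aspect ratio.

```python
def center_crop_box(w, h):
    """Square center-crop box (left, top, right, bottom): edges are discarded."""
    side = min(w, h)
    left = (w - side) // 2
    top = (h - side) // 2
    return (left, top, left + side, top + side)

# A 400x300 photo: cropping keeps a central 300x300 window and throws away
# 50px on each side -- which is exactly where a head or paws might sit.
print(center_crop_box(400, 300))  # (50, 0, 350, 300)
```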

No, because it’s an active competition, so I’m not allowed to under kaggle rules. But replicating it yourself would be a great exercise.

1 Like

This is going to depend on the problem. For many problems center cropping will be fine; for others you may want to resize. You can do both with the fast.ai library.

1 Like

Curious if anyone has tried ResNet style architecture for language models. Would be interesting to see what kind of features initial layers would capture and how starting with smaller sequences and retraining on larger ones (like how we did in class with starting with smaller size images and changing to larger ones) would affect the model.

1 Like

A recent architecture called ‘Transformer’ uses a ResNet style block for NLP. It worked really well.

5 Likes
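For intuition, a residual (ResNet-style) block just adds a sublayer's output back onto its input. A toy sketch on plain lists rather than real tensors:

```python
def residual_block(x, sublayer):
    """Toy residual connection: output = x + sublayer(x), elementwise."""
    return [xi + yi for xi, yi in zip(x, sublayer(x))]

# Because the identity path is always present, the sublayer only needs to
# learn a correction, and gradients flow straight through the addition.
double = lambda v: [2.0 * vi for vi in v]
print(residual_block([1.0, 2.0], double))  # [3.0, 6.0]
```

In the Transformer this pattern appears around each attention and feed-forward sublayer (with layer normalization added, omitted here for simplicity).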

In the lecture, Jeremy says that we are using Adam and the fastai library is trying to find an optimal learning rate given this setting.

Questions

  1. When Jeremy says this approach is something new, does that mean finding an optimal ‘lr’ on top of the ‘Adam’ optimizer?
  2. If point 1 is correct, does that mean ‘Adam’ in itself finds an optimal lr, but we are trying to help it explicitly by providing an optimal ‘lr’?
1 Like
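Not fastai's actual implementation, but the idea behind the learning rate finder can be sketched like this: grow the lr geometrically over a series of mini-batch steps, record the loss at each, and pick an lr from the region where the loss is still falling steeply. Adam adapts per-parameter step sizes, but it still takes a global lr, and that is what this search tunes:

```python
def lr_range(lr_min=1e-5, lr_max=1.0, steps=100):
    """Geometric schedule of candidate learning rates for a range test."""
    ratio = (lr_max / lr_min) ** (1 / (steps - 1))
    return [lr_min * ratio ** i for i in range(steps)]

lrs = lr_range()
# In a real range test you would run one mini-batch per lr, log the loss,
# and choose an lr somewhat below the point where the loss is lowest.
print(len(lrs), lrs[0], round(lrs[-1], 6))
```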

Would it be possible to support image sizes that aren’t squares? For now it looks like sz only accepts one integer value which it then turns into a tuple for resizing.

1 Like

Statement: In the lecture, Jeremy says that precompute=True will take time the first time and then reuse these precomputed activations in the future (assuming AWS).

Question

  1. Since we are using a predefined architecture, is the time factor because it's downloading the required activations from the internet, or is there some sort of computation happening for these ‘precomputed activations’?
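On that first question: nothing is downloaded. Precomputing means running every training image through the frozen pretrained layers once, locally, and caching the resulting activations so that later epochs only train the new head. A rough sketch of the caching idea (function and file names are illustrative, not fastai's internals):

```python
import os, pickle, tempfile

def get_activations(cache_path, compute_fn):
    """Compute activations once, then serve them from a disk cache."""
    if os.path.exists(cache_path):             # fast path on later runs
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    acts = compute_fn()                        # slow first-time forward pass
    with open(cache_path, 'wb') as f:
        pickle.dump(acts, f)
    return acts

path = os.path.join(tempfile.gettempdir(), 'acts_demo.pkl')
if os.path.exists(path):
    os.remove(path)
first = get_activations(path, lambda: [0.1, 0.2, 0.3])
second = get_activations(path, lambda: None)   # cache hit: compute_fn unused
print(first == second)  # True
```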

In part 1 (v1), we used one-hot encoding for transforming the RGB values. In this version (v2) we haven't. Is this related to the ResNet architecture?

Statement: In the lecture, Jeremy says that the SGDR technique helps us come out of local minima.

Questions:

  1. Please confirm if the statement is true.
  2. If true, how does the above statement fit with the statement from session 1 that, because we have so many features in deep learning, we never see local minima?
1 Like
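A sketch of the SGDR schedule (cosine annealing with warm restarts; the lr values are illustrative, not fastai defaults):

```python
import math

def sgdr_lr(step, cycle_len, lr_max=0.1, lr_min=0.0):
    """Within each cycle, lr decays from lr_max to lr_min along a
    half-cosine, then jumps back to lr_max (a 'restart')."""
    t = (step % cycle_len) / cycle_len          # position within the cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

# The sudden jumps can kick the weights out of sharp, narrow regions of
# the loss surface toward flatter ones that tend to generalize better --
# which is arguably less about classic local minima and more about
# escaping sharp minima and saddle-like regions.
print(round(sgdr_lr(0, 10), 3), round(sgdr_lr(5, 10), 3), round(sgdr_lr(10, 10), 3))
```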

If I run learn.lr_find() after training for a bit, does it affect my trained weights, or is it run independently?

1 Like

What does ps do?

learn = ConvLearner.pretrained(arch, data, precompute=True, ps=0.5)

I checked the code and it says ps is the dropout parameter, but I have no idea what that means.

That's right. Dropout is a hyperparameter that controls overfitting by dropping a random percentage of nodes in a layer.

1 Like
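A toy sketch of what that parameter controls (inverted dropout, as commonly implemented; not fastai's actual code): with ps=0.5, each activation in the layer is zeroed with probability 0.5 during training, and the survivors are scaled up so the expected activation is unchanged. At test time the layer does nothing.

```python
import random

def dropout(xs, p, training=True, rng=random.random):
    """Inverted dropout: zero each value with probability p, scale survivors."""
    if not training or p == 0:
        return list(xs)
    return [0.0 if rng() < p else x / (1 - p) for x in xs]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))                   # roughly half zeroed
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5, training=False))   # unchanged at test time
```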

Ok, thanks… And how do I notice overfitting, so that I know when to use the ‘ps’ hyperparameter?
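In practice you spot it in the per-epoch output: training loss keeps falling while validation loss stalls or rises. A hypothetical helper that captures the pattern (the fit loop prints both losses each epoch; this function is just an illustration, not part of any library):

```python
def looks_overfit(train_losses, val_losses, patience=2):
    """True if val loss rose for `patience` epochs while train loss fell."""
    if len(val_losses) <= patience or len(train_losses) <= patience:
        return False
    rising = all(val_losses[-i] > val_losses[-i - 1] for i in range(1, patience + 1))
    falling = all(train_losses[-i] < train_losses[-i - 1] for i in range(1, patience + 1))
    return rising and falling

# Train loss keeps improving while val loss climbs: classic overfitting.
print(looks_overfit([0.9, 0.6, 0.4, 0.3], [0.8, 0.7, 0.75, 0.8]))  # True
```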