Lesson 2 In-Class Discussion

Hi I am getting this error from running:

from fastai.transforms import *

File "fastai/torch_imports.py", line 26
if pre: load_model(m, f'{path}/weights/{fn}.pth')
^
SyntaxError: invalid syntax

Is there something I am missing here?

It is because VGG has fully connected layers, which have a fixed number of weights, and that number depends on the required input size.
These links are helpful:


https://www.quora.com/How-is-Fully-Convolutional-Network-FCN-different-from-the-original-Convolutional-Neural-Network-CNN

3 Likes
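The arithmetic behind this can be sketched quickly (assuming standard VGG16: five 2×2 max-pools and 512 final channels; the function name here is just illustrative, not fastai code):

```python
def fc_input_features(image_size, n_pools=5, channels=512):
    """Number of features flattened into VGG's first FC layer.

    Each 2x2 max-pool halves the spatial dims, so the final feature
    map is (image_size / 2**n_pools) on a side, times 512 channels.
    """
    spatial = image_size // (2 ** n_pools)
    return channels * spatial * spatial

print(fc_input_features(224))  # 25088 -- what VGG16's fc1 weights expect
print(fc_input_features(448))  # 100352 -- incompatible with those weights
```

A fully convolutional network avoids this because it has no layer whose weight count is tied to the spatial size.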

That makes sense! I did try printing out what zip returned, and it just told me that it was a zip object :sweat_smile:

Thank you for a great lesson yesterday!
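For reference, `zip()` in Python 3 returns a lazy iterator, which is why printing it only shows the object; wrapping it in `list()` materializes the pairs:

```python
# zip() is lazy in Python 3: printing it shows "<zip object at 0x...>".
pairs = zip([1, 2, 3], ['a', 'b', 'c'])
print(pairs)

# Materialize with list() to actually see the pairs.
print(list(zip([1, 2, 3], ['a', 'b', 'c'])))  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

Note that an iterator can only be consumed once, so build a fresh `zip` (or convert to a list up front) if you need the pairs more than once.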

@jeremy will the Dog Breeds Walkthrough also be set up on AWS so one can step through it?

When center cropping an image (see red border), we may lose important details (the head and paws in this case). Should we use image resizing from rectangular to square instead?

8 Likes

We’ve had this question a few times already - please do a ‘search’ on the forum before posting. This error means that you’re not using python 3.6.

1 Like

Excellent question. This type of resizing is what keras does by default. I’ve found that it seems to generally work less well, since it has to learn how images look different depending on how they’re squeezed. But we do have the ability to use this squeezing approach in fastai - maybe @yinterian could show an example?

8 Likes
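For anyone who wants to see the geometric difference, here is a minimal sketch (plain arithmetic, not the actual fastai transforms): center-cropping keeps a square window and discards the edges, while squeezing keeps every pixel but distorts the aspect ratio.

```python
def center_crop_box(w, h):
    """Square center-crop box (left, top, right, bottom): edges are discarded."""
    side = min(w, h)
    left = (w - side) // 2
    top = (h - side) // 2
    return (left, top, left + side, top + side)

# A 400x300 photo: cropping keeps a central 300x300 window and throws away
# 50px on each side -- which is exactly where a head or paws might sit.
print(center_crop_box(400, 300))  # (50, 0, 350, 300)
```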

No, because it’s an active competition, so I’m not allowed to under kaggle rules. But replicating it yourself would be a great exercise.

1 Like

This is going to depend on the problem. For many problems center cropping will be fine; for others you may want to resize. You can do both with the fast.ai library.

1 Like

Curious if anyone has tried ResNet style architecture for language models. Would be interesting to see what kind of features initial layers would capture and how starting with smaller sequences and retraining on larger ones (like how we did in class with starting with smaller size images and changing to larger ones) would affect the model.

1 Like

A recent architecture called ‘Transformer’ uses a ResNet style block for NLP. It worked really well.

5 Likes
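For intuition, a residual (ResNet-style) block just adds a sublayer's output back onto its input. A toy sketch on plain lists rather than real tensors:

```python
def residual_block(x, sublayer):
    """Toy residual connection: output = x + sublayer(x), elementwise."""
    return [xi + yi for xi, yi in zip(x, sublayer(x))]

# Because the identity path is always present, the sublayer only needs to
# learn a correction, and gradients flow straight through the addition.
double = lambda v: [2.0 * vi for vi in v]
print(residual_block([1.0, 2.0], double))  # [3.0, 6.0]
```

In the Transformer this pattern appears around each attention and feed-forward sublayer (with layer normalization added, omitted here for simplicity).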

In the lecture, Jeremy says that we are using Adam and the fastai library is trying to find an optimal learning rate given this setting.

Questions

  1. When Jeremy says this approach is something new, does that mean finding an optimal ‘lr’ on top of the ‘Adam’ optimizer?
  2. If point 1 is correct, does that mean ‘Adam’ in itself finds an optimal lr, but we are trying to help it explicitly by providing an optimal ‘lr’?
1 Like
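Not fastai's actual implementation, but the idea behind the learning rate finder can be sketched like this: grow the lr geometrically over a series of mini-batch steps, record the loss at each, and pick an lr from the region where the loss is still falling steeply. Adam adapts per-parameter step sizes, but it still takes a global lr, and that is what this search tunes:

```python
def lr_range(lr_min=1e-5, lr_max=1.0, steps=100):
    """Geometric schedule of candidate learning rates for a range test."""
    ratio = (lr_max / lr_min) ** (1 / (steps - 1))
    return [lr_min * ratio ** i for i in range(steps)]

lrs = lr_range()
# In a real range test you would run one mini-batch per lr, log the loss,
# and choose an lr somewhat below the point where the loss is lowest.
print(len(lrs), lrs[0], round(lrs[-1], 6))
```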

Would it be possible to support image sizes that aren’t squares? For now it looks like sz only accepts one integer value which it then turns into a tuple for resizing.

1 Like

Statement: In the lecture, Jeremy says that precompute=True will take time the first time and then reuse these precomputed activations in the future (assuming AWS).

Question

  1. Since we are using a predefined architecture, is the time factor because it's downloading the required activations from the internet, or is there some sort of computation happening for these ‘precomputed activations’?
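On that first question: nothing is downloaded. Precomputing means running every training image through the frozen pretrained layers once, locally, and caching the resulting activations so that later epochs only train the new head. A rough sketch of the caching idea (function and file names are illustrative, not fastai's internals):

```python
import os, pickle, tempfile

def get_activations(cache_path, compute_fn):
    """Compute activations once, then serve them from a disk cache."""
    if os.path.exists(cache_path):             # fast path on later runs
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    acts = compute_fn()                        # slow first-time forward pass
    with open(cache_path, 'wb') as f:
        pickle.dump(acts, f)
    return acts

path = os.path.join(tempfile.gettempdir(), 'acts_demo.pkl')
if os.path.exists(path):
    os.remove(path)
first = get_activations(path, lambda: [0.1, 0.2, 0.3])
second = get_activations(path, lambda: None)   # cache hit: compute_fn unused
print(first == second)  # True
```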

In part 1 (v1), we used one-hot encoding for transforming the RGB values. In this version (v2) we haven't. Is this related to the ResNet architecture?

Statement: In the lecture, Jeremy says that the SGDR technique helps us come out of local minima.

Questions:

  1. Please confirm if the statement is true.
  2. If true, how does the above statement fit with the statement from session 1 that, because we have so many features in deep learning, we never see local minima?
1 Like
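A sketch of the SGDR schedule (cosine annealing with warm restarts; the lr values are illustrative, not fastai defaults):

```python
import math

def sgdr_lr(step, cycle_len, lr_max=0.1, lr_min=0.0):
    """Within each cycle, lr decays from lr_max to lr_min along a
    half-cosine, then jumps back to lr_max (a 'restart')."""
    t = (step % cycle_len) / cycle_len          # position within the cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t))

# The sudden jumps can kick the weights out of sharp, narrow regions of
# the loss surface toward flatter ones that tend to generalize better --
# which is arguably less about classic local minima and more about
# escaping sharp minima and saddle-like regions.
print(round(sgdr_lr(0, 10), 3), round(sgdr_lr(5, 10), 3), round(sgdr_lr(10, 10), 3))
```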

If I run learn.lr_find() after training for a bit, does it affect my trained weights, or is it run independently?

1 Like

What does ps do?

learn = ConvLearner.pretrained(arch, data, precompute=True, ps=0.5)

I checked the code and it says ps is the dropout parameter, but I have no idea what that means.

That's right. Dropout is a hyperparameter that controls overfitting by dropping a random percentage of nodes in a layer.

1 Like
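A toy sketch of what that parameter controls (inverted dropout, as commonly implemented; not fastai's actual code): with ps=0.5, each activation in the layer is zeroed with probability 0.5 during training, and the survivors are scaled up so the expected activation is unchanged. At test time the layer does nothing.

```python
import random

def dropout(xs, p, training=True, rng=random.random):
    """Inverted dropout: zero each value with probability p, scale survivors."""
    if not training or p == 0:
        return list(xs)
    return [0.0 if rng() < p else x / (1 - p) for x in xs]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))                   # roughly half zeroed
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5, training=False))   # unchanged at test time
```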

Ok, thanks… And how do I notice overfitting, so that I know when to use the ‘ps’ hyperparameter?
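In practice you spot it in the per-epoch output: training loss keeps falling while validation loss stalls or rises. A hypothetical helper that captures the pattern (the fit loop prints both losses each epoch; this function is just an illustration, not part of any library):

```python
def looks_overfit(train_losses, val_losses, patience=2):
    """True if val loss rose for `patience` epochs while train loss fell."""
    if len(val_losses) <= patience or len(train_losses) <= patience:
        return False
    rising = all(val_losses[-i] > val_losses[-i - 1] for i in range(1, patience + 1))
    falling = all(train_losses[-i] < train_losses[-i - 1] for i in range(1, patience + 1))
    return rising and falling

# Train loss keeps improving while val loss climbs: classic overfitting.
print(looks_overfit([0.9, 0.6, 0.4, 0.3], [0.8, 0.7, 0.75, 0.8]))  # True
```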