Lesson 1 In-Class Discussion ✅

In the pretrained conv model, how many layers are added at the end and how can we change that?

To draw this graph, a training run is launched with a very low learning rate that is increased at every batch.
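
A minimal sketch of what launches that run, assuming the fastai 0.7 API used in this course (`PATH` is a hypothetical data directory; `arch` and `sz` stand in for your own architecture and image size):

```python
from fastai.conv_learner import *

PATH = 'data/dogscats/'  # hypothetical data directory
arch = resnet34
sz = 224

data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data)

learn.lr_find()     # trains while increasing the learning rate every batch
learn.sched.plot()  # plots loss against learning rate
```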

What happened to bn_unfreeze()? Still needed?

Should we always use as large a batch size as possible on our card? Or is there a limit beyond which model performance decreases?

Yup, the intuition behind this is that we’re only computing correlations between neighbouring pixels. With 3x3 or 5x5 filters, even though we slide them across the whole image, the network doesn’t preserve the orientational and relative spatial relationships between complex features (eyes, nose, face boundary). Capsule networks address this issue.
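
A tiny PyTorch sketch of that locality (my own illustration, not from the lesson): a single 3x3 conv only produces nonzero output where its receptive field touches a feature, so one layer on its own says nothing about how distant features are arranged:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, bias=False)

x = torch.zeros(1, 1, 8, 8)
x[0, 0, 0, 0] = 1.0  # a lone "feature" in the top-left corner

y = conv(x)
# True only where a 3x3 receptive field overlaps (0, 0); everywhere else is zero
print(y[0, 0].abs() > 1e-6)
```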

It is the prediction of the model. You shouldn’t interpret it as a real probability :wink:

Yes, use as big a batch size as you can without running out of GPU RAM.
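
For reference, a sketch of where the batch size is set, again assuming the fastai 0.7 `ImageClassifierData` API and reusing `PATH`, `arch`, and `sz` from the sketch above (the value of `bs` is just illustrative):

```python
bs = 128  # halve this if the GPU runs out of memory
data = ImageClassifierData.from_paths(PATH, bs=bs, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data)
```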

@paul I’m interested too.

Nope.

This is false; it does preserve the locations in the activation maps.

With regard to the awesome Fast.ai alumni building cool products with DL, is it okay to use transfer learning from an ImageNet model, or public research datasets, to build models for commercial products?

Why does the ConvLearner use a size of 224 for ResNet34 and 299 for ResNet50?

@rachel could you please ask this question?

So I’ve been running the notebook alongside, and I’ve noticed that my error is a bit higher (~0.07ish).
Is this difference the ‘resilience’ that was mentioned earlier? Or is it a symptom of something else?

So the last number is the model’s prediction for the actual class, capped at 1.00, but it isn’t a probability? Then how does the model choose another class?
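
In the course’s fastai 0.7 setup, `learn.predict()` returns log-probabilities; a minimal sketch (continuing from the sketches above) of turning them into per-class probabilities and a predicted class:

```python
import numpy as np

log_preds = learn.predict()           # log-probabilities, shape (n_images, n_classes)
probs = np.exp(log_preds)             # back to values in [0, 1]
preds = np.argmax(log_preds, axis=1)  # the class with the highest score wins
```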

Validation loss seemed to be higher than the training loss. Does that not mean we were overfitting a bit?

Are the train error and validation error shown in percentage form, or do we need to multiply them by 100, like we have been doing for the error rate (e.g. a 0.044 error rate is 4.4%)?

Yes, thanks for highlighting it.

Train loss is 0.09 and validation loss is 0.13. Doesn’t this mean overfitting is occurring?

Imagine trying to walk to the highest point in a landscape. The learning rate is kind of like how big your stride is. If your stride is too big, you’ll get to the top of the mountain faster, but it will be hard to pinpoint the very highest point once you’re close because you’ll keep stepping past it.
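
A toy numeric illustration of that analogy (my own sketch, not from the lesson): gradient ascent on f(x) = -(x - 3)², whose peak is at x = 3, comparing a small stride with an oversized one:

```python
def grad(x):
    return -2 * (x - 3)  # derivative of f(x) = -(x - 3)**2

for lr in (0.1, 1.1):    # a careful stride vs. one that oversteps
    x = 0.0
    for _ in range(20):
        x += lr * grad(x)  # step uphill
    print(f"lr={lr}: ended at x={x:.2f}")

# lr=0.1 settles near the peak at x=3;
# lr=1.1 keeps stepping past the peak and ends up far away
```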
