Why is the animation slowing down as it gets closer to the correct line? Like smaller steps?
It’s because your gradients become smaller and smaller, so the steps are smaller and smaller.
Okay. That's convenient.
what is the best learning rate value to start with?
You can understand why the gradient gets smaller by looking at the curve Jeremy showed earlier (the U shaped one). The gradient is basically the slope of that curve, and the closer you get to the minimum, the closer to 0 the slope becomes.
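A minimal sketch of that point (plain Python, no PyTorch needed): gradient descent on the U-shaped curve J(w) = w², whose gradient is 2w. Because the gradient shrinks as w approaches the minimum at 0, each step is smaller than the last even though the learning rate never changes.

```python
def gradient(w):
    return 2 * w  # slope of J(w) = w**2

w = 4.0
lr = 0.1
steps = []
for _ in range(5):
    grad = gradient(w)
    step = lr * grad  # step size is proportional to the gradient
    w -= step
    steps.append(step)

print(steps)  # each step is 0.8x the previous one: 0.8, 0.64, 0.512, ...
```

With a fixed learning rate of 0.1, every step here is exactly (1 - 2·lr) = 0.8 times the previous one, which is the slowdown you see in the animation.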
I was asked the same question.
does it make a difference (final result) if I choose mini-batches randomly vs sequentially in some order, say 0…31, 32…63, etc?
It does: during training it is best to use random indices to build your mini-batches; this avoids the model learning the answers in order.
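A sketch of how shuffled mini-batches are typically built (this is what PyTorch's `DataLoader` does for you when you pass `shuffle=True`): permute the indices once per epoch, then slice the permutation into batches.

```python
import random

n, batch_size = 64, 32
indices = list(range(n))
random.shuffle(indices)  # the "shuffle=True" part; drop this for 0..31, 32..63 order
batches = [indices[i:i + batch_size] for i in range(0, n, batch_size)]

# Every index appears exactly once across the epoch (sampling without
# replacement), but the order differs each epoch.
```

Note the two batches together still cover every sample exactly once; shuffling only changes which samples land together, not how often each is seen.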
How many is “too many” epochs?
initially 3e-3
There is no way to know beforehand. That's why it is good to train for some epochs and see if the validation loss is still decreasing. If the loss starts to improve very little, you should stop training.
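A hedged sketch of that "stop when the validation loss stalls" idea, sometimes called early stopping (the function and the `patience` parameter are my own illustration, not from the lesson): keep training while the validation loss keeps improving, and stop once it has failed to improve for a few epochs in a row.

```python
def epochs_until_stop(val_losses, patience=2):
    """val_losses: validation loss recorded after each epoch.
    Returns the epoch index at which training would stop."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad_epochs = loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch  # no improvement for `patience` epochs
    return len(val_losses) - 1

# Example: the loss flattens out after epoch 3, so we stop at epoch 5.
stop = epochs_until_stop([1.0, 0.6, 0.4, 0.39, 0.4, 0.41])
```

Frameworks offer this as a callback (fastai and PyTorch Lightning both have an `EarlyStopping`-style callback), so in practice you rarely write this loop yourself.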
Jeremy gave some elements to answer that earlier, but sadly there’s no good answer to that question.
How is learning answers in order possible?
It's more a figure of speech than literal memorization. But the model will learn better when shuffled images are fed to it; this has been shown experimentally.
When we update the parameters in the `with torch.no_grad()` step, does that also zero the gradients before the next loss calculation?
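It does not: PyTorch accumulates gradients, so the update inside `torch.no_grad()` leaves `.grad` untouched and you must zero it yourself (with `p.grad.zero_()`, or `opt.zero_grad()` when using an optimizer). A small sketch of the manual training-step pattern:

```python
import torch

w = torch.tensor([3.0], requires_grad=True)

loss = (w ** 2).sum()
loss.backward()          # w.grad is now 2*w = 6
with torch.no_grad():
    w -= 0.1 * w.grad    # parameter update; w.grad is STILL 6 here
    w.grad.zero_()       # without this, the next backward() would add to 6
```

If you skip the `zero_()` call, the next `backward()` adds the new gradient on top of the old one, so every step after the first uses the wrong direction.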
anyone else experiencing “! sudo add-apt-repository ppa:mc3man/trusty-media” failing?
Are all batches within an epoch drawn without replacement? Meaning, will each batch always be completely different from the last?
Within one epoch, yes: samples are drawn without replacement, so no sample appears in two batches.