Lesson 2 In-Class Discussion ✅

Why is the animation slowing down when it gets closer to the correct line? Like smaller steps?

1 Like

It’s because your gradients become smaller and smaller, so the steps are smaller and smaller.

2 Likes

Okay. That’s convenient.

what is the best learning rate value to start with?

You can understand why the gradient gets smaller by looking at the curve Jeremy showed earlier (the U-shaped one). The gradient is basically the slope of that curve, and the closer you get to the minimum, the closer to 0 the slope becomes.
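
To make that concrete, here is a minimal sketch (plain PyTorch on a made-up quadratic loss, not the notebook’s code) showing the step shrinking as the slope flattens near the minimum:

```python
import torch

# A simple U-shaped loss: L(w) = (w - 3)^2, minimised at w = 3
w = torch.tensor(10.0, requires_grad=True)
lr = 0.1

for i in range(10):
    loss = (w - 3) ** 2
    loss.backward()                      # the gradient is the slope: 2 * (w - 3)
    with torch.no_grad():
        step = lr * w.grad
        w -= step                        # the step shrinks as w approaches the minimum
        print(f"step {i}: w={w.item():.3f}, step size={step.item():.3f}")
        w.grad.zero_()
```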

1 Like

I was asked the same question.

1 Like

does it make a difference (final result) if I choose mini-batches randomly vs sequentially in some order, say 0…31, 32…63, etc?

It does: during training it is best to use random indexes to build your mini-batches; this avoids the model learning the answers in order.
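
As an illustration, a small sketch (plain PyTorch with dummy data, not the course notebook) of sequential vs shuffled mini-batches:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Dummy dataset of 64 numbered examples
x = torch.arange(64).float().unsqueeze(1)
y = torch.arange(64).float()
ds = TensorDataset(x, y)

# Sequential batches: 0…31, 32…63 in the same order every epoch
seq_dl = DataLoader(ds, batch_size=32, shuffle=False)

# Shuffled batches: the indexes are randomly permuted each epoch,
# which is what you normally want during training
shuf_dl = DataLoader(ds, batch_size=32, shuffle=True)
```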

5 Likes

How many is “too many” epochs?

1 Like

initially 3e-3

There is no way to know beforehand. That’s why it is good to train for some epochs and see if the validation loss is still decreasing. If the loss is barely improving any more, you should stop training.
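
As a rough sketch of that idea (a toy model and made-up data so it runs on its own; the improvement threshold is an arbitrary choice, not a recommendation):

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

# Toy regression data, just to make the loop self-contained
x = torch.randn(200, 3)
y = x @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(200)
train_dl = DataLoader(TensorDataset(x[:160], y[:160]), batch_size=32, shuffle=True)
valid_x, valid_y = x[160:], y[160:]

model = nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=3e-3)   # the 3e-3 starting point mentioned above
loss_fn = nn.MSELoss()

best_loss = float("inf")
min_improvement = 1e-4   # the "improving very little" threshold, chosen arbitrarily

for epoch in range(100):
    for xb, yb in train_dl:
        loss = loss_fn(model(xb).squeeze(1), yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
    with torch.no_grad():
        valid_loss = loss_fn(model(valid_x).squeeze(1), valid_y).item()
    if best_loss - valid_loss < min_improvement:
        break                          # validation loss has nearly stopped improving
    best_loss = valid_loss
```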

1 Like

Jeremy gave some elements to answer that earlier, but sadly there’s no good answer to that question.

How is learning answers in order possible?

It’s more of an image than a literal description. But the model will learn better with shuffled images fed to it; this has been shown experimentally.

When we update the parameters in the with torch.no_grad() step, does that also zero the gradients before the next loss calculation?
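
For context, a minimal sketch of the kind of update step being asked about (made-up parameters, not the notebook’s code). As far as I know, torch.no_grad() only keeps the update out of the autograd graph; it does not clear .grad, which is why the explicit zero_() call is there:

```python
import torch

# Made-up parameter and a dummy loss, just to show the update step
params = [torch.randn(3, requires_grad=True)]
lr = 0.1

loss = (params[0] ** 2).sum()
loss.backward()

with torch.no_grad():
    for p in params:
        p -= lr * p.grad   # no_grad only keeps this update out of the autograd graph
        p.grad.zero_()     # gradients must be zeroed explicitly before the next backward pass
```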

1 Like

anyone else experiencing “! sudo add-apt-repository ppa:mc3man/trusty-media” failing?

Are all batch grabs within an epoch done without replacement? Meaning, in all cases, will each batch be completely different from the last?

Rachel’s video referenced during the class: “There’s no such thing as ‘not a math person’”.

10 Likes

Within one epoch, yes: no batches are repeated.
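
A quick way to check this yourself (dummy dataset, not course code): with shuffle=True the DataLoader’s sampler draws a permutation of the indexes, so every example appears exactly once per epoch.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

ds = TensorDataset(torch.arange(10))
dl = DataLoader(ds, batch_size=3, shuffle=True)

seen = torch.cat([xb for (xb,) in dl])
print(sorted(seen.tolist()))   # every index 0…9 shows up exactly once in the epoch
```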

2 Likes