Lesson 2 In-Class Discussion ✅

With torch.no_grad() to be precise, and it just stops tracking the history of gradients, since after the calculation it isn't needed in those lines of code. And then we are randomizing the gradients with grad.zero_

Short answer: yes. And to be precise, randomize the gradients before the next loss calculation.

Ok. I’ll go back to it. Thanks.

1 Like

grad.zero_ doesn’t ‘randomize’ gradients. It just sets them to zero in place.
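For what it's worth, here is a minimal sketch of that manual update loop in plain PyTorch (the toy data, learning rate, and number of steps are made up for illustration):

```python
import torch

# toy linear regression: y ≈ 3x + 2
x = torch.randn(100, 1)
y = 3 * x + 2 + 0.1 * torch.randn(100, 1)
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for _ in range(20):
    loss = ((x * w + b - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():       # don't track the parameter update itself
        w -= lr * w.grad
        b -= lr * b.grad
    w.grad.zero_()              # sets the gradients to zero in place (not random values)
    b.grad.zero_()
```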

2 Likes

Using conda install -c conda-forge ffmpeg solved the problem for me.

1 Like

I’m talking about the imperfection of the mini-batch as a representation of the whole dataset. One of the reasons it may not be a perfect representation of the dataset is because it’s… too perfect :wink:

1 Like

When the learner is training over several epochs, does each epoch grab different batches of the whole data?

Did you solve that? I'm having the same issue :wink:

The recorded live stream is missing the initial 20-30 minutes.

@sgugger Can you please help me with this?

An epoch is basically one complete training pass over all of the training data. So, each epoch grabs all of the training data once, but may have different mini-batches within the epoch.
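A rough sketch of the arithmetic, assuming a made-up training set of 1000 samples and a batch size of 64:

```python
import math

n_samples, batch_size = 1000, 64

# one epoch = one full pass over all 1000 samples,
# split into ceil(1000 / 64) = 16 mini-batches
batches_per_epoch = math.ceil(n_samples / batch_size)
print(batches_per_epoch)  # 16

# over 3 epochs every sample is seen 3 times,
# but (with shuffling) it can land in different mini-batches each time
```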

2 Likes

Just to clarify, it means that, for example, epoch 1 and epoch 2 have different mini-batches of the same data, right?

Depends on how the mini-batches were determined…if it is random (like here), yes!

1 Like

Maybe this has already been answered previously, but how can I be sure that every epoch grabs the mini-batches in a random way?

What’s the difference between that and starlette?

I think that’s taken care of for you in fastai. I don’t know exactly how it’s split though… I’ve yet to dig into the source code/docs.

1 Like

It is already done in fastai. Under the hood it uses shuffle=True for training data and shuffle=False for validation and test data in torch.utils.data.DataLoader.
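For example, a minimal sketch of that shuffle behaviour in plain PyTorch (the toy dataset and batch size are just for illustration, not what fastai uses internally):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# toy dataset of 8 samples, just to show the shuffling behaviour
ds = TensorDataset(torch.arange(8).float().unsqueeze(1), torch.arange(8))

train_dl = DataLoader(ds, batch_size=4, shuffle=True)   # reshuffled at the start of every epoch
valid_dl = DataLoader(ds, batch_size=4, shuffle=False)  # fixed order for evaluation

for epoch in range(2):
    print([yb.tolist() for _, yb in train_dl])  # typically different mini-batches each epoch
    print([yb.tolist() for _, yb in valid_dl])  # always the same order
```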

6 Likes

Nope, I didn’t.

Is the stream down, or is it down just for me?

The stream has ended but the video is still available at the same link, and it’s working for me.

It did for me as well. I’ve got a post up in the gradient thread about it.