with torch.no_grad() to be precise, and it just stops tracking the gradient history, since it isn't needed for the parameter update in those lines of code. And then we reset the gradients to zero with grad.zero_().
Short answer: yes. And to be precise, you zero the gradients before the next loss calculation, so the next backward pass doesn't accumulate on top of the old gradients.
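A minimal sketch of what that step looks like in plain PyTorch (the tensor names, toy data, and learning rate here are my own, not from the notebook):

```python
import torch

# Toy data and parameters, purely illustrative
x = torch.randn(100, 3)
y = torch.randn(100, 1)
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for step in range(10):
    pred = x @ w + b                 # forward pass
    loss = ((pred - y) ** 2).mean()  # MSE loss
    loss.backward()                  # fills w.grad and b.grad

    with torch.no_grad():            # don't track history while updating
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()               # zero (not randomize) the gradients
        b.grad.zero_()               # before the next backward pass
```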
I’m talking about the imperfection of the mini-batch as a representation of the whole dataset. One of the reasons it may not be a perfect representation of the dataset is that it’s… too perfect
An epoch is basically one complete training pass over all of the training data. So each epoch goes through all of the training data once, but the mini-batches it is split into can differ from epoch to epoch.
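Roughly, in code the two nested loops look like this (the model, optimizer, and dataset are just placeholders, not the notebook's actual ones):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset and model, only to show the loop structure
ds = TensorDataset(torch.randn(256, 3), torch.randn(256, 1))
dl = DataLoader(ds, batch_size=32, shuffle=True)  # reshuffled each epoch

model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for epoch in range(3):    # one epoch = one full pass over the dataset
    for xb, yb in dl:     # mini-batches differ between epochs because of shuffle=True
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
```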
It is already done in fastai. Under the hood it uses shuffle=True for training data and shuffle=False for validation and test data in torch.utils.data.DataLoader.
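If you were building the loaders by hand instead, the two flags would look something like this (the datasets here are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets
train_ds = TensorDataset(torch.randn(256, 3), torch.randn(256, 1))
valid_ds = TensorDataset(torch.randn(64, 3), torch.randn(64, 1))

train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)   # reshuffle training data every epoch
valid_dl = DataLoader(valid_ds, batch_size=64, shuffle=False)  # keep validation order fixed
```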