with torch.no_grad() to be precise, and it just stops tracking the gradient history, since it isn't needed for the parameter update in those lines of code. And then we reset the gradients to zero with grad.zero_().
Short answer: yes. And to be precise, you zero the gradients before the next loss calculation, so the next backward pass doesn't accumulate on top of the old gradients.
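A minimal sketch of what that step looks like in plain PyTorch (the tensor names, toy data, and learning rate here are my own, not from the notebook):

```python
import torch

# Toy data and parameters, purely illustrative
x = torch.randn(100, 3)
y = torch.randn(100, 1)
w = torch.randn(3, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr = 0.1

for step in range(10):
    pred = x @ w + b                 # forward pass
    loss = ((pred - y) ** 2).mean()  # MSE loss
    loss.backward()                  # fills w.grad and b.grad

    with torch.no_grad():            # don't track history while updating
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()               # zero (not randomize) the gradients
        b.grad.zero_()               # before the next backward pass
```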
I’m talking about the imperfection of the mini-batch as a representation of the whole dataset. One of the reasons it may not be a perfect representation of the dataset is that it’s… too perfect
An epoch is basically one complete training pass over all of the training data. So each epoch goes through all of the training data once, but the mini-batches it is split into can differ from epoch to epoch.
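Roughly, in code the two nested loops look like this (the model, optimizer, and dataset are just placeholders, not the notebook's actual ones):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset and model, only to show the loop structure
ds = TensorDataset(torch.randn(256, 3), torch.randn(256, 1))
dl = DataLoader(ds, batch_size=32, shuffle=True)  # reshuffled each epoch

model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

for epoch in range(3):    # one epoch = one full pass over the dataset
    for xb, yb in dl:     # mini-batches differ between epochs because of shuffle=True
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
```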
It is already done in fastai. Under the hood it uses shuffle=True for training data and shuffle=False for validation and test data in torch.utils.data.DataLoader.
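If you were building the loaders by hand instead, the two flags would look something like this (the datasets here are made up for illustration):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets
train_ds = TensorDataset(torch.randn(256, 3), torch.randn(256, 1))
valid_ds = TensorDataset(torch.randn(64, 3), torch.randn(64, 1))

train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)   # reshuffle training data every epoch
valid_dl = DataLoader(valid_ds, batch_size=64, shuffle=False)  # keep validation order fixed
```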