Just finished lesson 2 and I was able to grasp everything except this. Jeremy said an epoch is “looking at all the data once”,
but what does it mean to have 2 epochs? If the loss after SGD is, let’s say, 2 after 1 epoch, how will the 2nd epoch borrow info from the first?
What will the 2nd epoch do? Start where the first one left off, or just run a separate SGD with different starting parameters?
Can’t wrap my head around it!!!
You can find a lot of explanations on the internet.
As an example, from that link:
Finally, let’s make this concrete with a small example.
Assume you have a dataset with 200 samples (rows of data) and you choose a batch size of 5 and 1,000 epochs.
This means that the dataset will be divided into 40 batches, each with five samples. The model weights will be updated after each batch of five samples.
This also means that one epoch will involve 40 batches or 40 updates to the model.
With 1,000 epochs, the model will be exposed to or pass through the whole dataset 1,000 times. That is a total of 40,000 batches during the entire training process.
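To make the “continuation” point concrete, here is a minimal sketch in plain Python (toy numbers, with a made-up one-parameter “model”; 3 epochs instead of 1,000 to keep it quick). The key thing to notice is that the weight `w` is initialised once and never reset, so epoch 2 picks up exactly where epoch 1 left off:

```python
# Toy illustration: 200 samples, batch size 5 -> 40 batches per epoch.
# The single "weight" w is created once and carried across all epochs,
# so each epoch continues SGD from where the previous one stopped.
n_samples, batch_size, n_epochs = 200, 5, 3
n_batches = n_samples // batch_size  # 40 batches per epoch

w = 10.0    # initialised once, never reset between epochs
lr = 0.01   # learning rate

updates = 0
for epoch in range(n_epochs):
    for batch in range(n_batches):
        grad = w - 3.0      # toy gradient pulling w toward 3.0
        w -= lr * grad      # one SGD update per batch
        updates += 1

print(n_batches)    # 40 updates per epoch
print(updates)      # 120 total updates over 3 epochs
print(w)            # w has moved from 10.0 toward 3.0 across all epochs
```

With 1,000 epochs instead of 3 you would get 40,000 updates in total, matching the arithmetic above; the only thing that changes between epochs is the (shared, ever-improving) weights.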
Thank you, that cleared it up! So it is essentially just a continuation of the process.
I am also slightly confused about the concept of an epoch. Say the first time I train my network with 20 epochs, and I find that I get the lowest error at the 8th epoch: should I retrain my network with 8 epochs? And should that be done after unfreezing or without unfreezing?
Please advise, as I am slightly confused about the steps.
Hey, if your error gets consistently worse after the 8th epoch, then yes, you are overfitting. So you should stop somewhere around the 8th epoch.
When you freeze your network and train, only the last two layers get trained. When you unfreeze, all of them get trained. So just train a few epochs while frozen, then unfreeze and train some more, as long as your loss keeps decreasing.
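The freeze-then-unfreeze schedule above can be sketched in pure Python (no deep-learning library; the layer names, the 4-layer model, and the toy update rule are all made up for illustration). “Training” a layer here just means its weight gets updated; frozen layers keep their weights unchanged:

```python
# Hypothetical 4-layer "model": each layer has one weight and a trainable flag.
layers = {f"layer{i}": {"w": 1.0, "trainable": False} for i in range(1, 5)}

def freeze_all_but_last(n_last):
    # Stage 1: only the last n_last layers are trainable.
    names = list(layers)
    for i, name in enumerate(names):
        layers[name]["trainable"] = i >= len(names) - n_last

def unfreeze():
    # Stage 2: every layer becomes trainable.
    for layer in layers.values():
        layer["trainable"] = True

def train(epochs, lr=0.1):
    for _ in range(epochs):
        for layer in layers.values():
            if layer["trainable"]:
                layer["w"] -= lr * layer["w"]  # toy SGD step (shrinks w)

freeze_all_but_last(2)   # frozen phase: only the last two layers train
train(epochs=3)
unfreeze()               # unfrozen phase: all layers train
train(epochs=2)

print(layers["layer1"]["w"])  # early layer: only updated in the unfrozen phase
print(layers["layer4"]["w"])  # late layer: updated in both phases
```

After running this, the early layers have only been updated during the 2 unfrozen epochs, while the last two layers have been updated during all 5 epochs, which is exactly the two-stage fine-tuning pattern described above.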