I’ve noticed that whenever you train for a single epoch, several things happen underneath the cell.
- Training runs for a single epoch to fit our new, randomly initialised head to the body of the pre-trained model, for whatever task we’re doing.
- Then it trains for the number of epochs we specified, updating the whole model on our training data.
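To check I’m describing this right, here’s my mental model of that two-stage schedule written out as plain Python. The names are my own invention, not anything from the actual library:

```python
def two_stage_schedule(epochs):
    """My guess at the sequence of training stages I'm seeing."""
    stages = []
    # Stage 1: the pre-trained body is held fixed; only the new
    # random head is updated so it learns to use those features.
    stages.append("frozen: fit new head for 1 epoch")
    # Stage 2: the number of epochs we asked for, updating everything.
    for e in range(1, epochs + 1):
        stages.append(f"unfrozen: train epoch {e}")
    return stages

print(two_stage_schedule(1))
```

So even asking for a single epoch produces two passes of training, which matches what I see in the notebook output.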
But within each epoch, there seem to be two separate stages. One is slow (what I think of in my head as the ‘real training’), where the progress bar completes one cycle from 0–100%; then the progress bar runs through 0–100% again, quite a bit faster.
What are those two separate processes going on underneath? (It’s possible we’ll find out at a later stage, in which case feel free to tell me just to wait until a later lesson.) Should I think of them as two separate processes? Is it some kind of consolidation or calculation that’s being represented there?