I’ve watched the first lesson and started on some homework. I decided to try training a model on the CIFAR10 dataset. I pretty much followed the same lines of code from the first exercise, and while Jeremy mentioned during the lesson that we should get worse results after unfreezing the model, I noticed the opposite: the error rate was actually lower after unfreezing. Here’s a screenshot of the epochs (after cleaning up the learning-rate plotting and all) -
I was wondering - do you guys know why?
Is it possible this happens because the ‘CIFAR10’ dataset has more images than the ‘Oxford-IIIT Pet’ one, so it makes more sense to retrain all the layers?
You are training with the full network unfrozen, so the error rate goes down. There is an overfitting risk here, but otherwise it makes sense that you get better results.
Also note that you are training with the whole network unfrozen for 3 full epochs: for 2 of those epochs the full network is unfrozen with no slice information (i.e. no discriminative learning rates). See the sketch below.
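For reference, here’s roughly what that workflow looks like in code — a minimal sketch assuming the fastai v1 API from the 2019 course; the epoch counts and the learning-rate slice are illustrative placeholders, not the values from your screenshot:

```python
from fastai.vision import *

# Sketch of the lesson-1 fine-tuning workflow (assuming fastai v1;
# epoch counts and learning rates are illustrative placeholders).
path = untar_data(URLs.CIFAR)  # CIFAR10 via fastai's bundled dataset URLs
data = ImageDataBunch.from_folder(path, valid='test',
                                  size=32, bs=256).normalize(cifar_stats)

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

# Stage 1: the pretrained body is frozen, only the new head trains.
learn.fit_one_cycle(4)

# Stage 2: unfreeze everything and retrain with discriminative
# learning rates. slice(lo, hi) gives the earliest layers the
# smallest steps so the pretrained features aren't destroyed.
learn.unfreeze()
learn.lr_find()  # inspect the plot to pick the slice bounds
learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-3))
```

Running stage 2 without the `max_lr=slice(...)` argument trains every layer at the same default learning rate, which is exactly the situation where unfreezing can make results worse.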
After learn.unfreeze() it’s only 1 epoch, and it gets better. Why does it get worse in Jeremy’s video? Jeremy mentioned it’s because we are training the whole network instead of just the last layer.