I’ve watched the first lesson and started on some homework. I decided to try training a model on the CIFAR10 dataset. I pretty much followed the same lines of code from the first exercise, and while Jeremy mentioned during the lesson that we should get worse results after unfreezing the model, I noticed the opposite: the error rate was actually lower after unfreezing. Here’s a screenshot of the epochs (after cleaning up the learning-rate plotting and all) -
I was wondering - do you guys know why?
Is it possible this happens because the ‘CIFAR10’ dataset has more images than the ‘Oxford-IIIT Pet’ one, so it makes more sense to retrain all the layers?
You are training with the full network unfrozen, so the error rate goes down. There is an overfitting risk here, but otherwise it makes sense that you get better results.
Also note that you are training with the whole network unfrozen for 3 full epochs: for 2 of those epochs the full network is unfrozen with no slice information (i.e. no discriminative learning rates). See the sketch below.
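For reference, here’s roughly what that workflow looks like in code — a minimal sketch assuming the fastai v1 API from the 2019 course; the epoch counts and the learning-rate slice are illustrative placeholders, not the values from your screenshot:

```python
from fastai.vision import *

# Sketch of the lesson-1 fine-tuning workflow (assuming fastai v1;
# epoch counts and learning rates are illustrative placeholders).
path = untar_data(URLs.CIFAR)  # CIFAR10 via fastai's bundled dataset URLs
data = ImageDataBunch.from_folder(path, valid='test',
                                  size=32, bs=256).normalize(cifar_stats)

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

# Stage 1: the pretrained body is frozen, only the new head trains.
learn.fit_one_cycle(4)

# Stage 2: unfreeze everything and retrain with discriminative
# learning rates. slice(lo, hi) gives the earliest layers the
# smallest steps so the pretrained features aren't destroyed.
learn.unfreeze()
learn.lr_find()  # inspect the plot to pick the slice bounds
learn.fit_one_cycle(2, max_lr=slice(1e-5, 1e-3))
```

Running stage 2 without the `max_lr=slice(...)` argument trains every layer at the same default learning rate, which is exactly the situation where unfreezing can make results worse.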
After learn.unfreeze() it’s only 1 epoch, and it gets better. Why does it get worse in Jeremy’s video? Jeremy mentioned it’s because we are training the whole network instead of just the last layer.