Does saving a pre-trained frozen model also save the weights of the frozen layers?

I have a pre-trained resnet34 model. First I train it for a few epochs and save the model. The filesize is 48.2MB.
I then unfreeze the model and continue training for another few epochs. When I save it, the size is 218.3MB.
I then freeze it again and continue training. At the end I save the model, and the filesize is 48.2MB again.
Is it correct that when I save the frozen model, only the weights for the last (unfrozen) layer are saved?
If I load the frozen model in the future, are the weights of all layers except the last ones initialized with the "standard" resnet34 weights?

1 Like

I ran some experiments and came to the conclusion that the weights from both the frozen and unfrozen layers are saved. The difference in file size between a frozen model and unfrozen is because save() by default also saves the optimizer state.
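Here is a minimal plain-PyTorch sketch of that explanation (this is not fastai's actual save code, and the toy model below just stands in for resnet34): saving the optimizer state alongside the weights makes the checkpoint noticeably larger, because the momentum buffers roughly duplicate the parameter tensors.

```python
import io

import torch
import torch.nn as nn

# A toy model standing in for resnet34.
model = nn.Sequential(nn.Linear(100, 100), nn.Linear(100, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

# One training step so the optimizer accumulates momentum buffers.
model(torch.randn(8, 100)).sum().backward()
opt.step()

def saved_size(obj):
    """Serialize obj with torch.save and return its size in bytes."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = saved_size(model.state_dict())
with_opt = saved_size({'model': model.state_dict(), 'opt': opt.state_dict()})
print(weights_only, with_opt)  # the checkpoint with optimizer state is larger
```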

2 Likes

Hey, interesting experiment. Although I don't quite follow the explanation. Can you please elaborate on what it means to save the optimizer state, and how that causes the change in file size?
Thanks

That's right. A frozen model's parameters have no grad data, but an unfrozen model's do.
You can check with this code:
print(list(learn.model.named_parameters())[0][1].grad)
print(list(learn.model.named_parameters())[0][1].data)

2 Likes

But why do we need to save grad data with the model when the model is in the unfrozen state?

learn.save saves the optimizer state by default, which contains a momentum_buffer for each trainable layer. After unfreezing, more layers are trainable, so this state grows and so does the .pth file size. I think no grads are saved at all. Try this:

learn.freeze()
learn.fit_one_cycle(1, 1e-2)
learn.save('stage-1', with_opt=False)
learn.unfreeze()
learn.fit_one_cycle(1, slice(1e-5, 1e-3))
learn.save('stage-2', with_opt=False)

And check the .pth file sizes now.
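The momentum_buffer effect can be seen directly in plain PyTorch (a sketch, not fastai's internals; the body/head split below is a made-up stand-in for learn.freeze()/learn.unfreeze()): SGD only allocates optimizer state for parameters that actually receive gradients, so unfreezing grows the state.

```python
import torch
import torch.nn as nn

# Toy two-part model: a "body" we freeze and a "head" we keep trainable.
body = nn.Linear(100, 100)
head = nn.Linear(100, 10)
model = nn.Sequential(body, head)

# Freeze the body, as learn.freeze() would.
for p in body.parameters():
    p.requires_grad = False

# Optimizer over trainable params only; momentum creates per-param buffers.
opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad],
                      lr=1e-2, momentum=0.9)
model(torch.randn(8, 100)).sum().backward()
opt.step()
frozen_state = len(opt.state)    # buffers for the 2 head tensors only

# Unfreeze and rebuild the optimizer over all parameters.
for p in body.parameters():
    p.requires_grad = True
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
model(torch.randn(8, 100)).sum().backward()
opt.step()
unfrozen_state = len(opt.state)  # buffers for all 4 tensors now

print(frozen_state, unfrozen_state)
```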

4 Likes