Why are the wt-103 weights so small?

I’m looking at the pre-trained weights for WT-103, and they’re about 169M on disk. When I fine-tune and save my own weights, they end up closer to 500M.

Why the 3x difference in size? And is there a way to similarly reduce the file size of my saved weights?

When you save your weights, you also save the optimizer state (unless you pass with_opt=False). That’s where the difference comes from.
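You can see the effect directly in plain PyTorch. A minimal sketch (assuming Adam, which fastai uses by default: it keeps two extra buffers, `exp_avg` and `exp_avg_sq`, per weight tensor, so the full checkpoint is roughly 3x the weights alone):

```python
import io
import torch
import torch.nn as nn

# Toy model standing in for the language model; the architecture doesn't matter.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))
opt = torch.optim.Adam(model.parameters())

# One training step so the optimizer actually allocates its per-parameter state.
loss = model(torch.randn(8, 256)).sum()
loss.backward()
opt.step()

def saved_size(obj):
    """Size in bytes of the object as serialized by torch.save."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = saved_size(model.state_dict())
with_opt = saved_size({"model": model.state_dict(), "opt": opt.state_dict()})
print(weights_only, with_opt)  # the second is roughly 3x the first
```

So `learn.save('name', with_opt=False)` should get your fine-tuned checkpoint back down to roughly the size of the pre-trained one.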


Thank you - that is actually also the answer to a question I posted previously: File size of resnet18 model (pretrained vs. from scratch)