Why are the wt-103 weights so small?

I’m looking at the pre-trained weights for WT-103, and they’re about 169M on disk. When I fine-tune and save my own weights, they end up closer to 500M.

Why the 3x difference in size? And is there a way to similarly reduce the file size of my saved weights?

When you save your weights, you also save the optimizer state (unless you pass with_opt=False). That’s where the difference comes from.
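You can see the effect directly in plain PyTorch. A minimal sketch (assuming Adam, which fastai uses by default: it keeps two extra buffers, `exp_avg` and `exp_avg_sq`, per weight tensor, so the full checkpoint is roughly 3x the weights alone):

```python
import io
import torch
import torch.nn as nn

# Toy model standing in for the language model; the architecture doesn't matter.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))
opt = torch.optim.Adam(model.parameters())

# One training step so the optimizer actually allocates its per-parameter state.
loss = model(torch.randn(8, 256)).sum()
loss.backward()
opt.step()

def saved_size(obj):
    """Size in bytes of the object as serialized by torch.save."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.tell()

weights_only = saved_size(model.state_dict())
with_opt = saved_size({"model": model.state_dict(), "opt": opt.state_dict()})
print(weights_only, with_opt)  # the second is roughly 3x the first
```

So `learn.save('name', with_opt=False)` should get your fine-tuned checkpoint back down to roughly the size of the pre-trained one.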


Thank you - that is actually also the answer to a question I posted previously: File size of resnet18 model (pretrained vs. from scratch)