How are wt-103 weights so small?

When you save your weights, you also save the optimizer state (unless you pass with_opt=False). That’s where the size difference comes from.
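
To get a feel for how much the optimizer state can add, here is a rough sketch in plain Python. It is a toy model, not fastai's actual save format: `checkpoint_bytes` is a hypothetical helper, and it assumes an Adam-style optimizer that keeps two extra float buffers per parameter (the names `exp_avg` and `exp_avg_sq` are only illustrative here).

```python
import pickle
from array import array


def checkpoint_bytes(n_params, with_opt=True):
    """Pickled size of a toy checkpoint holding n_params float32 weights.

    Hypothetical helper for illustration only; it does not call fastai.
    """
    ckpt = {"model": array("f", [0.0] * n_params)}
    if with_opt:
        # Assumption: an Adam-style optimizer stores two running averages
        # per parameter, each the same size as the weights themselves.
        ckpt["opt"] = {
            "exp_avg": array("f", [0.0] * n_params),
            "exp_avg_sq": array("f", [0.0] * n_params),
        }
    return len(pickle.dumps(ckpt))


full = checkpoint_bytes(1_000_000, with_opt=True)
weights_only = checkpoint_bytes(1_000_000, with_opt=False)
print(f"with optimizer state: {full} bytes")
print(f"weights only:         {weights_only} bytes")
print(f"ratio:                {full / weights_only:.2f}")
```

Under these assumptions the file with optimizer state comes out at roughly three times the size of the weights-only one, which is the kind of gap you would expect between your own saved checkpoint and pretrained weights saved without the optimizer.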
