How to generate and load a pretrained model?

Hello there. There are already a few questions related to this, but I still cannot find a working solution for the current version (2.6.0) of fastai. The question is this: how can I use a classifier (resnet18) that I trained in the past on N classes as a pretrained model for all these classes plus a new one (N+1 classes)? How can I choose which pretrained weights are loaded instead of the PyTorch Hub ones? How can I save a model from a vision_learner in a format that is usable as pretrained weights? If I just create new dataloaders on the new class folders and try to load the previous model, I get this error:

RuntimeError: Error(s) in loading state_dict for Sequential: size mismatch for 1.8.weight: copying a param with shape torch.Size([85, 512]) from checkpoint, the shape in current model is torch.Size([33, 512]).

In this example, the model was pretrained on 85 classes, and I then wanted to use it as pretrained weights for a reduced 33-class problem whose classes partly overlap with the original 85. It is clear that the incompatibility is about the last linear layer (number of outputs). However, the pretrained models from PyTorch are trained on 1000 classes, right? So there must be a way to load models that were trained on a different number of classes.

So using vision_learner to load a model with different pretrained weights isn’t supported yet, but timm may add that feature soon, and it will then be supported in fastai.

In the meantime, the best approach is to create your own PyTorch model and then load it into fastai.

from torch import nn
from torchvision.models import resnet18

model = resnet18()
model.fc = nn.Linear(model.fc.in_features, 33)  # in_features of the original fc layer (512 for resnet18)
learn = Learner(dls, model, splitter=default_splitter, ...)

This won’t be exactly the same as vision_learner (which makes a special type of head instead but then finally passes everything back into Learner), but should be close enough for now. Hope this helps!
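
To reuse the old weights while skipping the mismatched final layer, one option is to filter the checkpoint by shape before loading it. The following is only a sketch: `filter_compatible` is a hypothetical helper (not part of fastai or PyTorch), and it assumes that the checkpoint and the new model use identical key names.

```python
def filter_compatible(checkpoint_sd, model_sd):
    """Split checkpoint entries into those that fit the model and those that do not.

    An entry fits when its key exists in the model's state dict and the
    shapes match. Only `.shape` is read, so any tensor-like value works.
    """
    kept, skipped = {}, []
    for key, value in checkpoint_sd.items():
        target = model_sd.get(key)
        if target is not None and tuple(value.shape) == tuple(target.shape):
            kept[key] = value
        else:
            skipped.append(key)
    return kept, skipped
```

With PyTorch this would be used roughly as `kept, skipped = filter_compatible(torch.load("old.pth"), model.state_dict())` followed by `model.load_state_dict(kept, strict=False)`, where `"old.pth"` is a placeholder file name. In the 85-to-33-class case above, only the final linear layer (for example `1.8.weight` from the error message) would be expected to end up in `skipped`, so it stays randomly initialised.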

Is there some write-up that explains why this is so? From a conceptual perspective, resnet18 and the 85-class model the OP mentioned should be the same as far as fastai’s vision_learner is concerned. From a naive/beginner (my) perspective, vision_learner takes a pretrained model, attaches a new head, and then does the training. In other words, why can’t or doesn’t it treat a custom-trained model the same as resnet18, resnet34, etc.?

You’re right, I think it’s possible you could create a function that takes in a pretrained argument and loads the desired model. Something like this I guess?

def resnet18_custom(pretrained=True):
    model = resnet18()
    # when pretrained is True, load your own previously trained weights into model here
    return model

Note that just like the previous method that I shared, this doesn’t have the same splitting (for discriminative LRs) as the regular ResNet18 model. I think this might resolve it:

model_meta[resnet18_custom] = model_meta[resnet18]  # model_meta is fastai's dictionary of per-architecture metadata (cut point, splitter, normalization stats)

I think this should work but the OP can try both methods and let us know what works.

Thank you, I will try this!

I tried a little notebook like this:

from fastai.vision.all import *
from pathlib import Path
import torch
import torchvision

p = untar_data(URLs.MNIST_TINY)
dls = ImageDataLoaders.from_folder(p)
learn = vision_learner(dls, resnet18)
learn.fine_tune(6)
torch.save(learn.model.state_dict(), "test.pth")  # state dict of the full fastai model (body + head)

def resnet18_custom_pretrained(pretrained=True):
    model = resnet18()
    model.load_state_dict(torch.load("test.pth"))  # this is the call that fails
    return model

model_meta[resnet18_custom_pretrained] = model_meta[resnet18]
learn2 = vision_learner(dls, resnet18_custom_pretrained)

But this doesn’t work. It seems that different formats are used for the keys in the state dicts. Here is an excerpt of the (huge) error message:

RuntimeError: Error(s) in loading state_dict for ResNet:
    Missing key(s) in state_dict: "conv1.weight" ...
    Unexpected key(s) in state_dict: "0.0.weight", ...

Any idea what’s going on here?
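
One way to read the two key styles in that error, sketched under some assumptions: the keys with numeric prefixes come from a model whose layers were wrapped in nested sequential containers (body at index 0, head at index 1), while torchvision’s ResNet names its top-level layers. The helpers below are hypothetical and only illustrate the renaming; they assume the body holds torchvision’s ResNet children in their usual order, up to and including layer4.

```python
# Top-level children of a torchvision ResNet, in definition order, excluding
# the final avgpool and fc (assumed here to be cut off and replaced by a new head).
RESNET_BODY_CHILDREN = ["conv1", "bn1", "relu", "maxpool",
                        "layer1", "layer2", "layer3", "layer4"]

def sequential_to_resnet_key(key):
    """Translate a key such as '0.4.0.conv1.weight' to 'layer1.0.conv1.weight'.

    Returns None for keys outside the body (index 0), e.g. head keys like '1.8.weight'.
    """
    parts = key.split(".")
    if len(parts) < 3 or parts[0] != "0":
        return None
    child = RESNET_BODY_CHILDREN[int(parts[1])]
    return ".".join([child] + parts[2:])

def remap_body_keys(state_dict):
    """Build a new dict holding only the body entries, renamed to ResNet-style keys."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = sequential_to_resnet_key(key)
        if new_key is not None:
            remapped[new_key] = value
    return remapped
```

The remapped dict would still lack `fc.weight` and `fc.bias`, so loading it into a plain ResNet would need `load_state_dict(..., strict=False)`.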

OK, after a bit of searching, it seems that when a vision model is created, the function create_body puts the model into another format by re-wrapping its layers in a new sequential container. This seems to change the key names of all the layers inside the network. I found a solution by only saving the body of a trained network:

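from fastai.vision.all import *
import torch

p1 = untar_data(URLs.MNIST)
p2 = untar_data(URLs.MNIST_SAMPLE)
dls1 = ImageDataLoaders.from_folder(p1 / "training", valid_pct=0.5, num_workers=0)
dls2 = ImageDataLoaders.from_folder(p2, num_workers=0)

# Train on full MNIST and save only the body (first child of the fastai model)
learn1 = vision_learner(dls1, resnet18, pretrained=True, metrics=accuracy)
lr = learn1.lr_find()
learn1.fine_tune(4, lr[0])
body_to_save = list(learn1.model.children())[0]
torch.save(body_to_save.state_dict(), "test.pth")

# Baseline: randomly initialised body
learn2 = vision_learner(dls2, resnet18, pretrained=False, metrics=accuracy)
learn2.fine_tune(1, lr[0])

# Same setup, but with the saved body weights loaded before training
learn3 = vision_learner(dls2, resnet18, pretrained=False, metrics=accuracy)
body_to_load = list(learn3.model.children())[0]
body_to_load.load_state_dict(torch.load("test.pth"))
learn3.fine_tune(1, lr[0])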

In this toy example, learn3 seems to consistently achieve better accuracy after the first epoch than learn2. This could be because learn3 loads the pretrained body (trained on full MNIST) from learn1. But I am not sure whether this is all correct. Maybe a more experienced fastai user can comment on this.