Can't load checkpoint saved by SaveModelCallback

Hi,

I’m trying to use torch.load to load a .pth model checkpoint saved via SaveModelCallback.

I’m getting the following error:

RuntimeError: Error(s) in loading state_dict for ResNet:
    Missing key(s) in state_dict: "conv1.weight", ...
    Unexpected key(s) in state_dict: "model", "opt".

I’ve made sure that the same versions of fastai and PyTorch are used to save the .pth checkpoint and to load it later:

fastai version: 1.0.51
numpy version: 1.16.2
pandas version: 0.24.2
torch version: 1.0.1.post2
torchvision version: 0.2.2

I’m using the following code:

def load_url(*args, **kwargs):
    model_dir = Path('models')
    if not (model_dir/MODEL_FILENAME).is_file(): raise FileNotFoundError
    return torch.load(model_dir/MODEL_FILENAME)
model_zoo.load_url = load_url

learn = cnn_learner(data, base_arch=models.resnet152, loss_func=FocalLoss(), metrics=metrics)

I’m using the code above to load both the initial ImageNet-pretrained weights and my .pth checkpoint.

The ImageNet-pretrained weights provided by PyTorch load fine, but my checkpoint gives this error.

What am I doing wrong?

TIA,

Virilo

P.S. I have read about other people with the same problem on this forum and on StackOverflow, but those cases were related to the fastai version or to an inconsistent base_arch parameter.

fastai usually saves both the model and the optimizer state, which is why you have those unexpected keys. Select the model entry and you should be good:

state = torch.load(bla)
model.load_state_dict(state['model'])
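
To make the layout concrete, here is a minimal sketch of that approach in plain PyTorch, assuming the checkpoint is a dict with "model" and "opt" entries (as the unexpected keys in the error suggest). The tiny network, the SGD optimizer and the in-memory buffer are stand-ins chosen for illustration; they are not what fastai or the original script uses.

```python
import io

import torch
from torch import nn

# Toy stand-in for the real network; any nn.Module behaves the same way here.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One training step so the optimizer has momentum buffers worth saving.
loss = model(torch.randn(16, 4)).sum()
loss.backward()
opt.step()

# Assumed checkpoint layout: one dict holding both state dicts.
buffer = io.BytesIO()  # in-memory stand-in for the .pth file
torch.save({"model": model.state_dict(), "opt": opt.state_dict()}, buffer)
buffer.seek(0)

state = torch.load(buffer, map_location="cpu")
print(sorted(state.keys()))  # ['model', 'opt']

# Restore into freshly built objects of the same architecture and optimizer type.
new_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
new_opt = torch.optim.SGD(new_model.parameters(), lr=0.1, momentum=0.9)
new_model.load_state_dict(state["model"])  # weights only
new_opt.load_state_dict(state["opt"])      # momentum buffers, lr, etc.
```

Restoring the "opt" entry is what brings the momentum buffers back when resuming training. It should only be expected to work when the new optimizer is built over the same parameters, in the same order, as the one that was saved.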

thanks a lot @sgugger

I tried it this way. Now it reads the ‘model’ entry inside the state dictionary.

But it seems that the layer names were modified during saving:

File "imet-fastai-starter.py", line 180, in load_checkpoint
    model.load_state_dict(state['model'])
  File "/opt/anaconda/anaconda3/envs/fp16/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNet:
    Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "layer1.0.conv1.weight", "layer1.0.bn1.weight", "layer1.0.bn1.bias", ...

...     Unexpected key(s) in state_dict: "0.0.weight", "0.1.weight", "0.1.bias", "0.1.running_mean", "0.1.running_var", "0.1.num_batches_tracked", ...

BTW, I’d also like to load the optimizer state. It must hold important values, such as the ones related to momentum, and I’d like to continue the training.

Thanks in advance

If the keys are different, it’s not the same model you are trying to load.
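
A quick way to check this is to compare the two key sets directly. The sketch below uses a hypothetical helper, diff_keys (not part of PyTorch or fastai), and a few key names copied from the traceback above rather than a real checkpoint.

```python
def diff_keys(expected, found):
    """Return (missing, unexpected) key lists, in the spirit of load_state_dict's report.

    `expected` are the keys the target model defines, `found` are the keys
    stored in the checkpoint. Both can be any iterable of strings.
    """
    expected, found = list(expected), list(found)
    expected_set, found_set = set(expected), set(found)
    missing = [k for k in expected if k not in found_set]
    unexpected = [k for k in found if k not in expected_set]
    return missing, unexpected


# Key names taken from the error message in this thread (truncated lists).
model_keys = ["conv1.weight", "bn1.weight", "bn1.bias"]
checkpoint_keys = ["0.0.weight", "0.1.weight", "0.1.bias"]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With real objects you would pass model.state_dict().keys() and state['model'].keys(). Numeric prefixes such as "0.0." and "0.1." usually indicate that the checkpoint came from a model whose layers sit inside nested containers like nn.Sequential, which names its children by index, whereas a plain ResNet names them conv1, bn1 and so on.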


I solved it by loading the torch pretrained models and the fastai saved checkpoints differently.

For the torch pretrained models, cnn_learner uses something very close to your code: just a constructor + torch.load + model.load_state_dict.

For the checkpoints, Learner.load handles both the ‘opt’ and the ‘model’ entries in the dictionary when calling load_state_dict.

learn = cnn_learner(data, base_arch=base_arch, loss_func=FocalLoss(), metrics=metrics, opt_func=optimizer,
                    pretrained=not MODEL_IS_SAVED_CHECKPOINT)
if MODEL_IS_SAVED_CHECKPOINT:
    learn.load(MODEL_FILENAME.replace('.pth',''))

Perhaps I forgot to give you some details or I should have attached my code.

Thanks a lot for your help @sgugger !

Hi, did you manage to load a saved fastai learner into a PyTorch model in the end? I’m trying to do the same, but I’m facing the same problem: the keys are different.

learner = cnn_learner(data, models.resnet34, metrics=accuracy, bn_final=True)
learner.fit(...)
learner.save('model_path')

model = models.resnet34()    #  this is a pytorch model object implemented in fastai
state_dict = torch.load('model_path')
model.load_state_dict(state_dict)

@Iron4dam

Sorry, I missed your post. Did you solve it?

I think you should first create the learner:

learner = cnn_learner(…)

and then:

learner.load(model_path)
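
For loading the weights outside a Learner, the sketch below illustrates why a plain ResNet rejects them. It assumes that cnn_learner wraps the architecture as a body plus a new head inside nn.Sequential, which is consistent with the "0.0.weight" style keys in the traceback earlier in the thread. TinyNet and wrap are hypothetical toy stand-ins, not fastai or torchvision code.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Toy stand-in for a torchvision ResNet, with named layers conv1 and bn1."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3, bias=False)
        self.bn1 = nn.BatchNorm2d(8)
        self.fc = nn.Linear(8, 2)


def wrap(net):
    # Assumption: mimics a body/head split, each part held in nn.Sequential.
    body = nn.Sequential(net.conv1, net.bn1)
    head = nn.Sequential(nn.Linear(8, 2))
    return nn.Sequential(body, head)


# "Saved" checkpoint from the wrapped model, using the 'model' key layout.
wrapped = wrap(TinyNet())
state = {"model": wrapped.state_dict()}
print(list(state["model"])[:3])  # ['0.0.weight', '0.1.weight', '0.1.bias']

# Loading into the plain network fails: its keys are conv1.weight, bn1.weight, ...
plain = TinyNet()
try:
    plain.load_state_dict(state["model"])
    plain_load_ok = True
except RuntimeError:
    plain_load_ok = False
print("plain model accepted the checkpoint:", plain_load_ok)

# Loading into a model rebuilt with the same wrapper structure succeeds.
rebuilt = wrap(TinyNet())
rebuilt.load_state_dict(state["model"])
```

In practice this means the target model has to be built the same way as the one that was saved. Since learner.model is already an nn.Module, reusing it directly after learner.load is likely simpler than recreating the structure by hand from models.resnet34().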