Create_learner from load_learner

I’m using an audio dataset from Kaggle with two sets: curated and noisy (I could simply combine these, but I want to do it properly). I managed to train a CNN on the curated set’s audio spectrograms, and I exported the results. A day later I start my Jupyter notebook and want to load this export — here’s how far I get:

data = ImageDataBunch.from_csv(...)

curated_learn = load_learner('.', 'export_1_freesound.pkl')

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

learn.load('curated_weights')   # dunno how to skip this

And this gives an error:

RuntimeError: Error(s) in loading state_dict for Sequential:
	size mismatch for 1.8.weight: copying a param with shape torch.Size([195, 512]) from checkpoint, the shape in current model is torch.Size([1030, 512]).
	size mismatch for 1.8.bias: copying a param with shape torch.Size([195]) from checkpoint, the shape in current model is torch.Size([1030]).

I tore this forum apart, and all I found was that I’m supposed to remove the last layer of the learned model, but I’m not at all sure how that works or why it’s necessary. In the course I’ve finished lesson 3.

# Name                    Version                   Build  Channel
fastai                    1.0.58                        1    fastai

So I managed to remove the last layer from the model:

curated_learn = load_learner('.', 'export_1_freesound.pkl')
curated_learn.model = curated_learn.model[:-1]
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.load('curated_weights')

but obviously I don’t know what I’m doing, since the last command gives an error:

RuntimeError: Error(s) in loading state_dict for Sequential:
	Missing key(s) in state_dict: "1.2.weight", "1.2.bias", 
"1.2.running_mean", "1.2.running_var", "1.4.weight", 
"1.4.bias", "1.6.weight", "1.6.bias", "1.6.running_mean", 
"1.6.running_var", "1.8.weight", "1.8.bias". 

I’d actually spent over a day on this before posting, and now I’ve accidentally gotten somewhere. I still have no clue whether this makes sense; all I know is that I can continue. Any tips on what the hell is going on would be appreciated.

curated_learn = load_learner('.', 'export_1_freesound.pkl')
curated_learn.model[-1][-1] = nn.Linear(in_features=512, out_features=1030, bias=True)
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.load('curated_weights')

To be honest, I did not read your posts in detail. But here’s a simple hint.

learn.load(file) must load back into exactly the same model as the one that created the file. You can’t load saved weights into a different model structure (without a lot of hacking).
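You can reproduce your exact error in plain PyTorch, with the sizes taken from your traceback (195 old classes vs. 1030 new ones — the layer shapes here are assumptions based on that traceback):

```python
import torch.nn as nn

# A state_dict saved from a 195-way final layer cannot be loaded into a
# 1030-way layer: load_state_dict() raises the same "size mismatch" error.
saved = nn.Linear(512, 195)
state = saved.state_dict()

new = nn.Linear(512, 1030)
try:
    new.load_state_dict(state)
except RuntimeError as e:
    mismatch_error = str(e)

print('size mismatch' in mismatch_error)  # True

# An identically shaped layer loads without complaint.
same = nn.Linear(512, 195)
same.load_state_dict(state)
```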

You will need to load curated_weights back into curated_learn, and then slice and dice these trained layers of curated_learn to make the next model.

HTH, and please excuse if this reply is off the mark.


It’s not off the mark, but how is this done? :slight_smile:

This load_diff_pretrained method by viraat for loading non-ImageNet pretrained weights might help.


I’m not sure what the “this” refers to. It looks like you’re trying to load the weights from curated_model into a different model structure created by cnn_learner. That will not work.

You have to load the trained, saved weights back into a Learner with the identical model that saved them. Now you have a model with its trained layers. You then construct the model you want from those layers.

A new model is constructed using PyTorch. Here’s an example:
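Something like this, in plain PyTorch — the body/head split mirrors what cnn_learner builds, and the stand-in layers and the 1030-class head are assumptions based on your traceback (in your case, `trained` would be curated_learn.model):

```python
import torch
import torch.nn as nn

# Stand-in for the trained model: cnn_learner produces a Sequential of
# [body, head]. The body here is a toy backbone, not a real resnet34.
trained = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 512, 3, padding=1), nn.AdaptiveAvgPool2d(1)),  # "body"
    nn.Sequential(nn.Flatten(), nn.Linear(512, 195)),                          # old 195-class head
)

# Keep the trained body, attach a fresh head sized for the new 1030-class data.
body = trained[0]
new_head = nn.Sequential(nn.Flatten(), nn.Linear(512, 1030))
new_model = nn.Sequential(body, new_head)

out = new_model(torch.randn(1, 3, 8, 8))
print(out.shape)  # torch.Size([1, 1030])
```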

(To design and construct a new model requires study and practice - at least I had to make a lot of mistakes!)

Now you can replace learn.model in an existing Learner, or create a new Learner from the DataBunch and the new model.

I am not sure what Learner export() and load_learner() are doing for you. I have never used them - only used Learner save and load for saving and restoring weights.

If you are not trying to construct a custom model, but only want to save/restore a Learner to continue training, use learn.save(); later, construct curated_learn exactly as it was made originally, call curated_learn.load(), and continue.
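The same round trip in plain PyTorch, for intuition (the file name is just an illustration; with fastai you would use learn.save()/learn.load() instead of torch.save/torch.load directly):

```python
import os
import tempfile
import torch
import torch.nn as nn

# Save weights, then in a "new session" rebuild the IDENTICAL architecture
# and restore them -- the analogue of learn.save() followed by learn.load().
def make_model():
    return nn.Sequential(nn.Linear(512, 195))

model = make_model()
path = os.path.join(tempfile.mkdtemp(), 'curated_weights.pth')
torch.save(model.state_dict(), path)

# Later: construct the model exactly as it was made originally, then load.
restored = make_model()
restored.load_state_dict(torch.load(path))
```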

Good luck!

Hmm, interesting, I will give this a try - thanks.

Thanks for the link. I did find that before, but I have no idea what it’s doing or why, so hard to reason myself through that.

It’s basically the same thing at the end of the day.

I’m not sure what you mean by a custom model. What I need is to train a whole bunch on one dataset, then train some more on another. Just as resnet34 is my first stage, dataset1 is the second and dataset2 the third — each stage gets more specific to the end problem.

I would like to help, but I do not understand what you are trying to do.