I’m using an audio dataset from Kaggle with two sets: curated and noisy (I could simply combine these, but I want to do it properly). I managed to train a CNN on the audio spectrograms of the curated set, and I exported the results. A day later I start my Jupyter notebook and want to load this export - here’s how far I get:
data = ImageDataBunch.from_csv(...)
curated_learn = load_learner('.', 'export_1_freesound.pkl')
curated_learn.save('curated_weights')  # dunno how to skip this step
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
And this gives the error:
RuntimeError: Error(s) in loading state_dict for Sequential:
size mismatch for 1.8.weight: copying a param with shape torch.Size([195, 512]) from checkpoint, the shape in current model is torch.Size([1030, 512]).
size mismatch for 1.8.bias: copying a param with shape torch.Size([195]) from checkpoint, the shape in current model is torch.Size([1030]).
I tore this forum apart, and the only suggestion I found was that I’m supposed to remove the last layer of the learned model, but I’m not at all sure how that works or why it’s necessary. In the course I’ve finished lesson 3.
# Name Version Build Channel
fastai 1.0.58 1 fastai
I’ve actually spent over a day on this before posting, and now I accidentally got somewhere. I still have no clue whether this makes sense; all I know is that I can continue. So any tips on what the hell is going on would be appreciated.
I’m not sure what the “this” refers to. It looks like you are trying to load the weights from curated_model into a different model structure created by cnn_learner. That will not work.
You have to load the trained, saved weights back into a Learner whose model is identical to the one that saved them. Now you have a model with its trained layers. You then construct the model you want from those layers.
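Under the hood, fastai’s save/load wraps PyTorch state_dicts, so the size mismatch and its fix can be shown with a plain PyTorch sketch. The layer sizes (512 in, 195 vs. 1030 out) are taken from your error message; everything else here is a hypothetical stand-in for the fastai head:

```python
import torch
import torch.nn as nn

# Stand-in for the saved head: 195 output classes in the checkpoint.
old_head = nn.Sequential(nn.Linear(512, 195))
state = old_head.state_dict()  # roughly what Learner.save() stores

# Loading into a mismatched architecture raises the same RuntimeError:
new_head = nn.Sequential(nn.Linear(512, 1030))
mismatch_raised = False
try:
    new_head.load_state_dict(state)
except RuntimeError:
    mismatch_raised = True  # "size mismatch for 0.weight ..."

# The fix: rebuild the *identical* architecture, load the weights,
# then swap the final layer for the new number of classes.
model = nn.Sequential(nn.Linear(512, 195))
model.load_state_dict(state)      # shapes match, loads cleanly
model[0] = nn.Linear(512, 1030)   # fresh final layer, randomly initialized
```

The swapped-in layer starts untrained, which is why the last layer has to be replaced rather than loaded: its shape depends on how many classes the DataBunch has.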
(Designing and constructing a new model takes study and practice - at least I had to make a lot of mistakes!)
Now you can replace learn.model in an existing Learner, or create a new Learner from the DataBunch and the new model.
I am not sure what Learner export() and load_learner() are doing for you. I have never used them - I have only used Learner save and load for saving and restoring weights.
If you are not trying to construct a custom model, but only want to save/restore a Learner to continue training, use curated_learn.save(); later, construct curated_learn exactly as it was made originally, call curated_learn.load(), and continue.
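Since Learner.save()/load() ultimately write and read a state_dict, the round-trip pattern can be sketched in plain PyTorch (hypothetical tiny model, shapes from your error message); the key point is that the model must be rebuilt identically before loading:

```python
import os
import tempfile
import torch
import torch.nn as nn

def make_model():
    # Stand-in for "construct curated_learn exactly as it was made originally"
    return nn.Sequential(nn.Linear(512, 195))

model = make_model()
path = os.path.join(tempfile.mkdtemp(), "curated_weights.pth")
torch.save(model.state_dict(), path)        # ~ curated_learn.save(...)

resumed = make_model()                      # identical construction
resumed.load_state_dict(torch.load(path))   # ~ curated_learn.load(...)
```

Because `make_model()` produces the same architecture both times, every parameter shape matches and the load succeeds with no size-mismatch errors.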
Hmm, interesting, I will give this a try - thanks.
Thanks for the link. I did find that before, but I have no idea what it’s doing or why, so hard to reason myself through that.
It’s basically the same as learn.save at the end of the day.
I’m not sure what you mean by a custom model. What I need is to train a lot on one dataset, then train some more on another. Just as resnet34 is my first stage, dataset1 is the second and dataset2 the third - each stage gets more specific to the end problem.
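That staged setup is ordinary transfer learning: keep the body’s trained weights between stages and only replace the classification head when the label set changes. A minimal pure-PyTorch sketch with a hypothetical tiny model and random data (the class counts 1030 and 195 echo the error message above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
body = nn.Linear(8, 4)        # stand-in for the pretrained resnet34 body
head = nn.Linear(4, 1030)     # stage-1 head (first dataset's classes)
model = nn.Sequential(body, head)

def fit(model, n_classes, steps=5):
    # Toy training loop on random data, just to show the mechanics.
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randn(16, 8)
        y = torch.randint(0, n_classes, (16,))
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

fit(model, 1030)              # stage 1: first dataset
model[1] = nn.Linear(4, 195)  # stage 2: new head for the second dataset
fit(model, 195)               # the body keeps what it learned in stage 1
```

In fastai terms, each stage is a Learner over the corresponding DataBunch, with the body’s weights carried over and the head rebuilt to match the new labels.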