Loading a saved model

FYI: Saving a model with precompute=True is unlikely to ever be what you want, since that stage completes so quickly.

@jeremy I did not understand this.

The first thing we do is:
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(resnet34, sz), bs=128, num_workers=1)
learn = ConvLearner.pretrained(resnet34, data, precompute=True)

If we save the model at this point, it will be with precompute=True, correct?

If my understanding is correct, we should save the model after we fine-tune it with precompute=False.

Thanks

Yes, you could save it then, with precompute=True. But it takes <10 secs to train in that case, so there’s no point saving it!
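To make the order concrete, here is a minimal sketch using the names from the lesson notebooks; the checkpoint name is illustrative:

learn = ConvLearner.pretrained(resnet34, data, precompute=True)
learn.fit(1e-2, 3)                # fast: trains only the head on precomputed activations
learn.precompute = False          # from here on, training is slow enough to be worth saving
learn.fit(1e-2, 3, cycle_len=1)
learn.save('resnet34_finetuned')  # illustrative checkpoint name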

Related question: loading doesn't work when you stop and restart an EC2 instance. Is that by design? If so, how can I save my work for the following day?

Thank you!

Just to check:

I restarted my kernel and loaded the model directly, but I noticed that the prediction outputs were different from what I got before restarting the kernel.

May I know if this is normal?

How do I load a pretrained model using the fastai implementation over PyTorch? In scikit-learn I can use pickle to dump a model to a file, then load and use it later. I've used the .load() method after declaring the learn instance, like below, to load previously saved weights:

arch=resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=False)
learn.load('resnet34_test')

Then to predict the class of an image:

trn_tfms, val_tfms = tfms_from_model(arch, 100)
img = open_image('circle/14.png')
im = val_tfms(img)
preds = learn.predict_array(im[None])
print(np.argmax(preds))

But it gives me this error:

ValueError: Expected more than 1 value per channel when training, got input size [1, 1024]

This code works if I use learn.fit(0.01, 3) instead of learn.load(). What I really want is to avoid the training step in my application.
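One likely cause, given the "when training" in the message: a BatchNorm layer in the head is still in training mode, and a single image gives it a batch of one. A hedged workaround is to put the model in evaluation mode before predicting:

learn.model.eval()   # switch BatchNorm/Dropout to inference mode
preds = learn.predict_array(im[None])
print(np.argmax(preds))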

Hi Steve,

This happened to me too. The accuracy was lower after I reloaded the model.

Have you figured out the reason behind it?

Were you able to figure it out?

I am currently following this notebook for the Planet Amazon Kaggle challenge. I save my weights after training the model on 64x64 images, but I keep getting an error when I load them.

How do I load my weights properly? My code is:
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)

lrf = learn.lr_find()
learn.sched.plot()

lr=0.2
data = get_data(64, 64) #data generator for batch size=64, image size=64x64
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
learn.fit(lr, 3, cycle_len=1, cycle_mult=2)

learn.sched.plot_lr()
learn.load('After_size_64x64')

Actually, you may just need this. Can you please try:

data = get_data(64, 64) #data generator for batch size=64, image size=64x64
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
learn.load('After_size_64x64')

You do not need to fit or run lr_find, since the weights have already been learned. If you need to refine the model further, then define the learning rate as:
lr = 0.2

Please try it and let me know.
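If you do want to refine further after loading, a minimal sketch of that path (learning rate as suggested above; tune it for your data):

learn.load('After_size_64x64')
learn.fit(lr, 1, cycle_len=1)   # optional further training from the loaded weights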

Thank you for your reply. I tried your advice and it definitely works for 64x64 images, but when I save my weights after completing all the training up to size 256x256, I get the same error as above.

Are you changing your data size like this?

data = get_data(256, 256) #data generator for image size=256x256, batch size=256
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
learn.load('After_size_256x256')

I used get_data(256, 64), i.e. image size 256 and batch size 64. Also, when I set my data to this and try to recreate the learn object, it starts to train for a few epochs and then the kernel dies.

I finally solved the problem. I had precompute=True set when it shouldn't have been. Thank you for the help!

Awesome!

You figured it out correctly: precompute=True makes the learner use precomputed activations cached on disk. Since you'll explicitly be loading your best weights, you do not want to precompute.
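In code, that means creating the learner with precompute=False (the default) before loading; a sketch reusing the names from the posts above:

learn = ConvLearner.pretrained(f_model, data, metrics=metrics, precompute=False)
learn.load('After_size_64x64')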

Is there any way to load the weights from a multi-label model into a single-label model?
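One approach, not specific to fastai: copy only the parameters whose names and shapes match, which keeps the shared backbone and skips the mismatched head. A hedged PyTorch sketch; the checkpoint path is illustrative:

import torch

sd = torch.load('multilabel_model.h5', map_location=lambda storage, loc: storage)
model_sd = learn.model.state_dict()
# keep only parameters present in the target model with identical shapes
compatible = {k: v for k, v in sd.items()
              if k in model_sd and v.size() == model_sd[k].size()}
model_sd.update(compatible)
learn.model.load_state_dict(model_sd)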

What image size should I use when loading a model that was trained on progressively larger sizes using learn.set_data(get_data(sz, bs))?

I am following lesson 4 for text classification, using a pretrained language model. When I load the encoder I get this error: @ramesh @jeremy @vikbehal

m3.load_encoder(f'adam1_enc')

     RuntimeError                              Traceback (most recent call last)
~/anaconda2/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    513                 try:
--> 514                     own_state[name].copy_(param)
    515                 except Exception:

RuntimeError: invalid argument 2: sizes do not match at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCTensorCopy.c:51

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
<ipython-input-17-f550326d29d6> in <module>
      1 # this notebook has a mess of some things going under 'all/' others not, so a little hack here
      2 #!ln -sf ../all/models/adam3_20_enc.h5 models/adam3_20_enc.h5
----> 3 m3.load_encoder(f'adam1_enc')
      4 m3.clip=25.
      5 lrs=np.array([1e-4,1e-3,1e-3,1e-2,3e-2])

~/Downloads/fastai/courses/dl1/fastai/nlp.py in load_encoder(self, name)
    164     def save_encoder(self, name): save_model(self.model[0], self.get_model_path(name))
    165 
--> 166     def load_encoder(self, name): load_model(self.model[0], self.get_model_path(name))
    167 
    168 

~/Downloads/fastai/courses/dl1/fastai/torch_imports.py in load_model(m, p)
     38             if n+'_raw' not in sd: sd[n+'_raw'] = sd[n]
     39             del sd[n]
---> 40     m.load_state_dict(sd)
     41 
     42 def load_pre(pre, f, fn):

~/anaconda2/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    517                                        'whose dimensions in the model are {} and '
    518                                        'whose dimensions in the checkpoint are {}.'
--> 519                                        .format(name, own_state[name].size(), param.size()))
    520             elif strict:
    521                 raise KeyError('unexpected key "{}" in state_dict'

RuntimeError: While copying the parameter named encoder.weight, whose dimensions in the model are torch.Size([67979, 300]) and whose dimensions in the checkpoint are torch.Size([21821, 300]).
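The two sizes in the last line (67979 and 21821 rows of encoder.weight) are vocabulary sizes: the classifier's data was built with a different vocabulary than the one the encoder was trained with. A hedged sketch of the usual fix, following the lesson 4 (imdb) notebook's convention of pickling the torchtext field so both stages share one vocab; paths and names are illustrative:

import pickle

# after training the language model, persist the field (and its vocab)
pickle.dump(TEXT, open(f'{PATH}models/TEXT.pkl', 'wb'))

# when building the classifier, reload the same field instead of rebuilding it
TEXT = pickle.load(open(f'{PATH}models/TEXT.pkl', 'rb'))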

You need to call get_data with size 64 first. The image size in the data object when loading the model should be the same as when the model was saved.
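A minimal sketch of that point, reusing names from the posts above; the checkpoint name is whatever was passed to learn.save():

sz_at_save = 64   # the image size in use when the weights were saved
data = get_data(sz_at_save, 64)
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
learn.load('After_size_64x64')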

Does a saved model also contain the loss and metrics recorded during training? I mean, can I use learn.recorder.plot_losses() after loading the model? I ask because I am using Colab, and disconnecting and reloading causes issues.
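As far as I know, learn.save() stores only the model weights, so the recorder history does not survive a reload. If that matters on Colab, persist the losses yourself before the runtime disconnects; a hedged sketch assuming fastai v1's Recorder, with illustrative filenames:

import numpy as np

np.save('train_losses.npy', np.array([float(l) for l in learn.recorder.losses]))
np.save('val_losses.npy', np.array(learn.recorder.val_losses))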