How to load trained model as pytorch model and predict

(Azarudeen) #1

I used the code below to load a fastai model as a PyTorch model.
But I recently upgraded my fastai version, and it now throws an error that I didn't see before.

import os
import numpy as np
import torch
from fastai.text import get_rnn_classifier  # fastai v1 text API

# these parameters aren't used for training here, but this is the easiest way to get a model
bptt, em_sz, nh, nl = 70, 400, 1150, 3
drop_out = np.array([0.4, 0.5, 0.05, 0.3, 0.4]) * 0.5
drop_mult = 1.
dps = drop_out * drop_mult
ps = [0.1]
ps = [dps[4]] + ps
num_classes = 3  # this is the number of classes we want to predict

lin_ftrs = [50]
layer = [em_sz * 3] + lin_ftrs + [num_classes]

vs = len(self.tokenizer)

self.model = get_rnn_classifier(bptt, 20 * 70, num_classes, vs, emb_sz=em_sz, n_hid=nh, n_layers=nl, pad_token=1,
                                layers=layer, drops=ps, weight_p=dps[1], embed_p=dps[2], hidden_p=dps[3])

self.model.load_state_dict(torch.load(os.path.join(dir_path, "model.pth"),
                                      map_location=lambda storage, loc: storage))

The error is:

RuntimeError: Error(s) in loading state_dict for SequentialRNN:
size mismatch for 0.encoder.weight: copying a param of torch.Size([5999, 400]) from checkpoint, where the shape is torch.Size([3699, 400]) in current model.
size mismatch for 0.encoder_dp.emb.weight: copying a param of torch.Size([5999, 400]) from checkpoint, where the shape is torch.Size([3699, 400]) in current model.

I can make sense of the error, but I don't know how to tackle it.
Help is appreciated, thanks.

0 Likes

#2

You're not using the same vocabulary as before (your vocabulary has size 5999 now, but according to the error message your model was trained with a vocabulary of size 3699).
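This mismatch is easy to reproduce in isolation; here is a minimal sketch using a dummy embedding layer in place of the full SequentialRNN, with the two vocabulary sizes from the error message:

```python
import torch
import torch.nn as nn

# checkpoint saved from a model whose vocab had 3699 tokens
trained = nn.Embedding(3699, 400)
torch.save(trained.state_dict(), "enc.pth")

# model rebuilt with a different vocab (5999 tokens): shapes no longer agree
current = nn.Embedding(5999, 400)
try:
    current.load_state_dict(torch.load("enc.pth"))
except RuntimeError as e:
    print("size mismatch" in str(e))  # → True
```

The fix is to rebuild the model with the same itos/vocab that was used at training time, so the embedding shapes line up before calling `load_state_dict`.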

0 Likes

(Azarudeen) #3

Thanks for the reply, @sgugger.

It was a mistake on my side: I used a different itos.pkl file, which caused the error.

I have another query: previously fast.ai used a classes.txt file to map the softmax output probabilities to class names.
Now that the class names go in texts.csv, how can we map the softmax output probabilities to class names?

Example:

Previously my classes file looked like this:
negative
positive
neutral

If my softmax output is [0.9, 0.05, 0.05], I map it to the negative class.
Now I don't know how the mapping is done in fast.ai; any help is appreciated.

0 Likes

#4

When you call learn.predict() on a text, it returns the class, the index, and the probabilities.

0 Likes

(Azarudeen) #5

I want to run prediction with the PyTorch model, not through the learner.
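For raw PyTorch inference, the mapping is just an argmax into the class list, kept in the same order fastai stored it. A minimal sketch, where the class order is an assumption taken from the classes.txt example above:

```python
import torch
import torch.nn.functional as F

# class order must match what fastai saved (classes.txt / the labels of data_clas)
classes = ["negative", "positive", "neutral"]

logits = torch.tensor([[4.0, 1.0, 1.0]])  # raw model output for one text
probs = F.softmax(logits, dim=1)
pred_idx = probs.argmax(dim=1).item()
print(classes[pred_idx])  # → negative
```

Since softmax is monotonic, taking argmax over the logits directly gives the same class; the softmax is only needed if you also want the probabilities.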

0 Likes

(Bilal) #6

I have a similar error. I want to create an empty DataBunch for inference and then load my model:

data_bunch = (TextList.from_csv(path, csv_name='blank.csv')
    .random_split_by_pct()
    .label_for_lm() # this does the tokenization and numericalization
    .databunch(bs=10))

learn = language_model_learner(data_bunch, pretrained_model=None)

I just made a few dummy rows in that csv; there is probably a better way to do this. Then I get this error when trying to load the trained model with learn.load:

RuntimeError: Error(s) in loading state_dict for SequentialRNN:
size mismatch for 0.encoder.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 0.encoder_dp.emb.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 1.decoder.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 1.decoder.bias: copying a param of torch.Size([325]) from checkpoint, where the shape is torch.Size([4]) in current model.

1 Like

(Bilal) #7

OK, I think, like you said, passing in the vocab from the original language model DataBunch worked:

data_bunch = (TextList.from_csv(path, csv_name='blank.csv', vocab=data_lm.vocab)
    .random_split_by_pct()
    .label_for_lm() # this does the tokenization and numericalization
    .databunch(bs=10))

1 Like

(Azarudeen) #8

Got my answer: the order of the classes is in the classes.txt file inside the tmp folder.

This folder is created when we call

data_lm.save()
data_clas.save()
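Assuming that file keeps one class name per line, reading the order back for raw PyTorch inference could look like this (the file contents below simulate what fastai writes; the tmp path is an assumption based on where data_clas.save() puts it):

```python
import os
import tempfile

# simulate the tmp/classes.txt that fastai writes: one class name per line
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "classes.txt")
with open(path, "w") as f:
    f.write("negative\npositive\nneutral\n")

# read the class order back; index i maps softmax output i to a class name
with open(path) as f:
    classes = [line.strip() for line in f if line.strip()]
print(classes)  # → ['negative', 'positive', 'neutral']
```

With `classes` loaded in this order, the argmax index of the model's softmax output indexes straight into it.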

0 Likes