I used the code below to load a fastai model as a PyTorch model.
Recently I upgraded my fastai version, and the same code now throws an error that I didn’t see before.
# these parameters aren’t used, but this is the easiest way to get a model
bptt, em_sz, nh, nl = 70, 400, 1150, 3
drop_out = np.array([0.4, 0.5, 0.05, 0.3, 0.4]) * 0.5
drop_mult = 1.
dps = drop_out * drop_mult
ps = [0.1]
ps = [dps[4]] + ps
num_classes = 3 # this is the number of classes we want to predict
RuntimeError: Error(s) in loading state_dict for SequentialRNN:
size mismatch for 0.encoder.weight: copying a param of torch.Size([5999, 400]) from checkpoint, where the shape is torch.Size([3699, 400]) in current model.
size mismatch for 0.encoder_dp.emb.weight: copying a param of torch.Size([5999, 400]) from checkpoint, where the shape is torch.Size([3699, 400]) in current model.
I understand what the error is saying, but I don’t know how to fix it.
Any help is appreciated, thanks.
You’re not using the same vocabulary as before: according to your error message, the checkpoint was saved with a vocabulary of size 5999, while your current model is built with a vocabulary of size 3699.
That was my mistake. I used a different itos.pkl file, which caused the error.
I have another question. Previously, fastai used a classes.txt file to map the softmax output probabilities to class names.
Now that the class names are given in texts.csv, how are the softmax output probabilities mapped to class names?
Example:
Previously my classes file looked like this:
negative
positive
neutral
If my softmax output is [0.9, 0.05, 0.05], I map it to the negative class.
I don’t know how this mapping is done in fastai now. Please help me with this.
Any help is appreciated.
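The mapping itself is just an argmax over the probabilities, indexed into an ordered list of class names. Here is a minimal sketch in plain Python; `label_for` is a hypothetical helper, and the class order is copied from the example above. As far as I know, fastai v1 exposes the label order it inferred from the data as `data.classes` on the databunch, and that order is chosen by the library rather than by the old classes.txt, so it is worth printing it on your version before relying on any particular index.

```python
from typing import Sequence


def label_for(probs: Sequence[float], classes: Sequence[str]) -> str:
    """Return the class name whose probability is highest."""
    if len(probs) != len(classes):
        raise ValueError("need exactly one probability per class")
    # Index of the largest probability (argmax).
    best = max(range(len(probs)), key=lambda i: probs[i])
    return classes[best]


# Order copied from the classes file in the post above (an assumption about
# what the model was trained with; verify it against your own databunch).
classes = ["negative", "positive", "neutral"]

print(label_for([0.9, 0.05, 0.05], classes))
```

The important part is that the list passed as `classes` has the same order that was used at training time.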
I have a similar error. I want to create an empty databunch for inference and then load in my model:
data_bunch = (TextList.from_csv(path, csv_name='blank.csv')
.random_split_by_pct()
.label_for_lm() # this does the tokenization and numericalization
.databunch(bs=10))
learn = language_model_learner(data_bunch, pretrained_model=None)
I just made a few dummy rows in that csv. There probably is a better way to do this. Then I get this error after trying to load the trained model using learn.load.
RuntimeError: Error(s) in loading state_dict for SequentialRNN:
size mismatch for 0.encoder.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 0.encoder_dp.emb.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 1.decoder.weight: copying a param of torch.Size([325, 400]) from checkpoint, where the shape is torch.Size([4, 400]) in current model.
size mismatch for 1.decoder.bias: copying a param of torch.Size([325]) from checkpoint, where the shape is torch.Size([4]) in current model.
OK I think, like you said, passing in the vocab from the original language model databunch worked.
data_bunch = (TextList.from_csv(path, csv_name='blank.csv', vocab=data_lm.vocab)
.random_split_by_pct()
.label_for_lm() # this does the tokenization and numericalization
.databunch(bs=10))
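To illustrate why reusing the original vocab fixes the size mismatch: the embedding has one row per vocabulary entry, so new text has to be mapped to ids through the training vocabulary rather than a fresh one built from the dummy csv. This is a plain-Python sketch, not fastai's implementation; `build_stoi`, `numericalize` and the tiny `train_itos` list are made up for the example, and sending unknown tokens to id 0 is an assumption.

```python
from typing import Dict, List, Sequence


def build_stoi(itos: Sequence[str]) -> Dict[str, int]:
    """Invert an id-to-token list into a token-to-id lookup."""
    return {tok: i for i, tok in enumerate(itos)}


def numericalize(tokens: Sequence[str], stoi: Dict[str, int],
                 unk_id: int = 0) -> List[int]:
    """Map tokens to ids, sending anything outside the vocab to `unk_id`."""
    return [stoi.get(tok, unk_id) for tok in tokens]


# Hypothetical training vocabulary; a real one would come from the saved itos.
train_itos = ["xxunk", "xxpad", "the", "movie", "was", "great"]
stoi = build_stoi(train_itos)

# "awful" was never seen in training, so it falls back to the unknown id.
ids = numericalize(["the", "movie", "was", "awful"], stoi)
print(ids)

# The embedding needs exactly len(train_itos) rows, whatever the new csv contains.
print(len(train_itos))
```

Because the ids always come from the training vocabulary, the model built on the blank databunch ends up with the same embedding shape as the checkpoint.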