How to solve size mismatch error in NLP classifier?

I’m really new to Python and ML in general. I figured the best way to learn would be to practice analyzing datasets myself instead of just pressing Shift+Enter through all the fast.ai lessons, so I’m following the documentation at docs.fast.ai/text to create a classifier for Amazon musical instrument reviews. I’m hitting an error when I run the last line of code, learn.load_encoder('ft_enc').

The error is this:

I searched the forum for similar issues and found that I could manually change the size of the network with this line of code, replacing the 10616 with whatever number the error message spits out:

config = awd_lstm_clas_config.copy()
config['n_hid'] = 10616

But that workflow means running the code, reading the error message, and then patching the number in by hand. How do I set this from a variable so I can skip the whole run-fail-fix cycle?
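One idea (a sketch, not tested against fastai; awd_lstm_clas_config, data_clas.vocab.itos, and text_classifier_learner are names assumed from docs.fast.ai/text): since the mismatched dimension appears to be the first axis of encoder.weight, i.e. the vocabulary size, the number could be computed from the data object instead of copied out of the traceback.

```python
# Minimal sketch: compute the size once and feed it into the config instead
# of hardcoding 10616. The fastai-specific lines are assumptions based on
# docs.fast.ai/text and would look something like:
#
#   vocab_size = len(data_clas.vocab.itos)
#   config = awd_lstm_clas_config.copy()
#   config['n_hid'] = vocab_size

def sized_config(base_config, size):
    """Return a copy of base_config with 'n_hid' set, leaving base untouched."""
    config = dict(base_config)   # shallow copy, like awd_lstm_clas_config.copy()
    config['n_hid'] = size
    return config

base = {'n_hid': 1152, 'emb_sz': 400}   # stand-in for awd_lstm_clas_config
cfg = sized_config(base, 10616)
print(cfg['n_hid'])    # 10616
print(base['n_hid'])   # 1152 -- the original config is unchanged
```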

Anyone? I toyed around with the code a little bit more and found that while this

learn.save_encoder('ft_enc')
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn.load_encoder('ft_enc')

produces the same error message:


RuntimeError                              Traceback (most recent call last)
<ipython-input-206-d621e553d9a0> in <module>
      1 learn.save_encoder('ft_enc')
      2 learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
----> 3 learn.load_encoder('ft_enc')
      4 learn.save_encoder('ft_enc')
      5 learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/text/learner.py in load_encoder(self, name, device)
     69         if hasattr(encoder, 'module'): encoder = encoder.module
     70         distrib_barrier()
---> 71         encoder.load_state_dict(torch.load(self.path/self.model_dir/f'{name}.pth', map_location=device))
     72         self.freeze()
     73         return self

/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    828         if len(error_msgs) > 0:
    829             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 830                                self.__class__.__name__, "\n\t".join(error_msgs)))
    831         return _IncompatibleKeys(missing_keys, unexpected_keys)
    832 

RuntimeError: Error(s) in loading state_dict for AWD_LSTM:
	size mismatch for encoder.weight: copying a param with shape torch.Size([10584, 400]) from checkpoint, the shape in current model is torch.Size([10520, 400]).
	size mismatch for encoder_dp.emb.weight: copying a param with shape torch.Size([10584, 400]) from checkpoint, the shape in current model is torch.Size([10520, 400]).

If I just run the same section of code again, it ends up working. I’m really confused about why that is; can anyone shed some light on this?
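My current guess at the mechanics (an assumption, not verified against fastai): the cell saves the encoder before loading it, so on the second run 'ft_enc' has already been overwritten with the new classifier’s own encoder, whose shapes trivially match. The toy sketch below only mimics the strict shape check; load_encoder_weights is a made-up stand-in, not the real PyTorch load_state_dict.

```python
# Toy stand-in for the strict shape comparison inside load_state_dict;
# the real PyTorch call has far more machinery, this models only the
# size check that produces the "size mismatch" RuntimeError.
def load_encoder_weights(model_shape, checkpoint_shape):
    if model_shape != checkpoint_shape:
        raise RuntimeError(
            f"size mismatch: checkpoint {checkpoint_shape}, "
            f"current model {model_shape}")
    return "loaded"

# First run: ft_enc was saved from an earlier learner built on a
# 10584-word vocab, but the new classifier uses a 10520-word vocab,
# so the load fails.
try:
    load_encoder_weights((10520, 400), (10584, 400))
except RuntimeError as err:
    print(err)

# Second run: save_encoder has already overwritten ft_enc with the
# classifier's own encoder, so the shapes match and the load "works",
# even though nothing was actually transferred from the earlier model.
print(load_encoder_weights((10520, 400), (10520, 400)))  # loaded
```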

I’m having a similar problem and will check to see if this works for me. Thanks for asking this!