I am having a similar issue despite training both the encoder and the classifier on the same dataset. I do not understand what is missing. My input CSV has about 12k rows.
trace:

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>()
      1 learn = text_classifier_learner(data_clas, drop_mult=0.5)
----> 2 learn.load_encoder('fine_tuned_enc')
      3 learn.freeze()
      4 learn.fit_one_cycle(1, slice(5e-3/2., 5e-3))

/usr/local/lib/python3.6/dist-packages/fastai/text/learner.py in load_encoder(self, name)
     61     def load_encoder(self, name:str):
     62         "Load the encoder `name` from the model directory."
---> 63         get_model(self.model)[0].load_state_dict(torch.load(self.path/self.model_dir/f'{name}.pth'))
     64         self.freeze()
     65

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
    767         if len(error_msgs) > 0:
    768             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 769                 self.__class__.__name__, "\n\t".join(error_msgs)))
    770
    771     def _named_members(self, get_members_fn, prefix='', recurse=True):

RuntimeError: Error(s) in loading state_dict for MultiBatchRNNCore:
    size mismatch for encoder.weight: copying a param with shape torch.Size([7122, 400]) from checkpoint, the shape in current model is torch.Size([7140, 400]).
    size mismatch for encoder_dp.emb.weight: copying a param with shape torch.Size([7122, 400]) from checkpoint, the shape in current model is torch.Size([7140, 400]).
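The encoder checkpoint was saved with a 7122-row embedding but the classifier model was built with 7140, which looks like the two databunches ended up with different vocabularies. A quick way to check, assuming `data_lm` and `data_clas` are the bunches built below:

# Compare the vocab sizes of the two databunches.
# The embedding rows in the error (7122 vs 7140) should match these numbers.
print(len(data_lm.train_ds.vocab.itos))    # vocab the encoder was trained with
print(len(data_clas.train_ds.vocab.itos))  # vocab the classifier was built with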
Reading data for the classifier:
data_clas = (TextList.from_df(file, '', cols='original_doc')
             # Where are the inputs? Column 'original_doc' of this csv
             .random_split_by_pct()
             # How to split it? Randomly with the default 20%
             .label_from_df(cols='groundtruth')
             # Label it from the 'groundtruth' column of the dataframe
             .databunch())
Reading data for the LM:
data_lm = (TextList.from_df(file, '', cols='original_doc')
           # Where are the inputs? Column 'original_doc' of this csv
           .random_split_by_pct()
           # How to split it? Randomly with the default 20%
           .label_for_lm(cols='groundtruth')
           # Label it for a language model
           .databunch())
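From the fastai docs on the data block API, the classifier databunch is supposed to reuse the language model's vocab so that both models index tokens identically; could the missing piece be the `vocab=` argument? A sketch of what I mean, building `data_lm` first and then passing its vocab into `data_clas`:

data_lm = (TextList.from_df(file, '', cols='original_doc')
           .random_split_by_pct()
           .label_for_lm()
           .databunch())

data_clas = (TextList.from_df(file, '', cols='original_doc',
                              vocab=data_lm.train_ds.vocab)  # reuse the LM vocab
             .random_split_by_pct()
             .label_from_df(cols='groundtruth')
             .databunch())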