I am working on the Kaggle Amazon mobile phone reviews dataset. I loaded the CSV file into a pandas DataFrame and built a `TextList` for the language model:
```python
data_lm = (TextList.from_df(df, path, cols=4)
           .random_split_by_pct(0.1)
           .label_for_lm()
           .databunch())
```
I then checked the vocab size, which was 40405 (the exact number seems random, though — it changes between runs?).
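For context on why the number might vary: if the vocab is built only from the training documents after a random split, different splits can produce different vocab sizes. A minimal pure-Python sketch of that effect (not fastai code — the whitespace tokenization and the fixed "splits" here are just for illustration):

```python
# Illustration (not fastai): the vocab built from the training split
# depends on which documents happen to land in that split.
docs = [
    "great phone excellent battery life",
    "terrible screen awful touch",
    "decent camera nice price",
    "superb display vivid",
]

def build_vocab(train_docs):
    """Collect the set of unique tokens seen in the training documents."""
    vocab = set()
    for doc in train_docs:
        vocab.update(doc.split())
    return vocab

# Two different "random" splits, each holding out one document.
vocab_a = build_vocab(docs[1:])   # holds out docs[0]
vocab_b = build_vocab(docs[:-1])  # holds out docs[3]

print(len(vocab_a), len(vocab_b))  # → 11 13: sizes differ across splits
```

If fastai rebuilds the vocab from a fresh random split each time, the same mechanism would explain why 40405 is not reproducible.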
I created a language model learner, fine-tuned it, and saved the encoder.
I then created another learner (for classification), and when I tried to load the encoder I received a size mismatch error. The vocab size was also different this time:
```
RuntimeError: Error(s) in loading state_dict for MultiBatchRNNCore:
    size mismatch for encoder.weight: ...
```
So am I unable to load the encoder because of the different vocab size? If so, how can I solve the issue? I also checked the learners in the IMDB notebook, and it seems that in both cases (language model and classification) the vocab size is greater than 60 thousand.
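My understanding (please correct me if wrong) is that the encoder's embedding matrix has one row per vocab item, so a state dict saved for a 40405-word vocab cannot be loaded into a model built for a different vocab size. A small sketch that mimics PyTorch's strict shape check — the `load_state_dict`-style helper here is hypothetical, purely for illustration, not the real implementation:

```python
# Hypothetical sketch mimicking PyTorch's shape check during state-dict
# loading: loading fails when a saved tensor's shape does not match the
# current model's parameter shape.

def shape_of(matrix):
    """Shape of a nested-list 'tensor': (rows, cols)."""
    return (len(matrix), len(matrix[0]))

def load_state_dict(model_params, saved_state):
    """Mimic strict loading: every parameter shape must match exactly."""
    for name, saved in saved_state.items():
        if shape_of(model_params[name]) != shape_of(saved):
            raise RuntimeError(
                f"size mismatch for {name}: copying a param with shape "
                f"{shape_of(saved)}, current model shape is "
                f"{shape_of(model_params[name])}"
            )
        model_params[name] = saved

emb_dim = 4  # toy embedding width
saved = {"encoder.weight": [[0.0] * emb_dim for _ in range(40405)]}   # LM vocab
current = {"encoder.weight": [[0.0] * emb_dim for _ in range(40112)]} # new vocab

try:
    load_state_dict(current, saved)
except RuntimeError as e:
    print(e)  # size mismatch for encoder.weight: ...
```

That would mean the classifier's databunch needs to end up with exactly the same vocab as the language model for the encoder weights to fit.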
You can find the notebooks in this repo.