I need to fine-tune the AWD_LSTM language model sequentially on two data sets: first I fine-tune on the first data set and save the model, then I continue fine-tuning on the second. I can't mix them together, because the second resides within a secure environment and I can't bring new data into it. The data sets are quite similar (from the same domain), but their vocabularies are different.
First, I trained as usual:
data_lm = (TextList.from_df(df_pretrain_data)
           .split_by_rand_pct(0.1)
           .label_for_lm()
           .databunch(bs=48))
len(data_lm.vocab.itos)  # -> 60000
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5)
learn.fit_one_cycle(1, 1e-2, moms=(0.8,0.7))
learn.unfreeze()
learn.fit_one_cycle(10, 1e-3, moms=(0.8,0.7))
learn.save('fine_tuned')
learn.save_encoder('fine_tuned_enc')
learn.export()
When I moved the saved model to the secure environment and tried to continue fine-tuning, I received the following error:
data_lm_new = (TextList.from_df(df_pretrain_data_new)
               .split_by_rand_pct(0.1)
               .label_for_lm()
               .databunch(bs=48))
len(data_lm_new.vocab.itos)  # -> 4224
learn_new = language_model_learner(data_lm_new, AWD_LSTM, drop_mult=0.5, pretrained=False).load('path/to/fine_tuned')
Error in loading state_dict for Sequential RNN:
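Obviously, the problem is that the two data sets have different vocabularies (60000 vs. 4224 tokens), so the saved weights no longer match the shape of the new model. What is the general solution to such a problem? I couldn't find one on the forum, I'm afraid. Thanks!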
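To make the mismatch concrete, here is a minimal sketch of one common way around it: remapping the saved embedding rows onto the new vocabulary. It is plain Python rather than fastai code, and `remap_embedding`, the toy vocabularies, and the 2-dimensional vectors are all hypothetical names and values chosen for illustration. Tokens shared by both vocabularies keep their trained vector; tokens that only exist in the new vocabulary start from the mean of the old vectors (an assumption, other initialisations are possible).

```python
from typing import Dict, List


def remap_embedding(old_rows: List[List[float]],
                    old_itos: List[str],
                    new_itos: List[str]) -> List[List[float]]:
    """Build an embedding table for new_itos from one trained on old_itos.

    Hypothetical helper for illustration only, not part of fastai.
    """
    # Map each old token to its row index (last occurrence wins on duplicates).
    old_stoi: Dict[str, int] = {tok: i for i, tok in enumerate(old_itos)}
    dim = len(old_rows[0])
    # Mean of all old vectors, used for tokens unseen in the old vocabulary.
    mean_row = [sum(row[d] for row in old_rows) / len(old_rows) for d in range(dim)]

    new_rows = []
    for tok in new_itos:
        idx = old_stoi.get(tok)
        # Copy the trained vector if the token is known, else fall back to the mean.
        new_rows.append(list(old_rows[idx]) if idx is not None else list(mean_row))
    return new_rows


# Toy data: 4 old tokens, 3 new tokens, 2-dimensional embeddings.
old_itos = ["xxunk", "the", "patient", "scan"]
old_rows = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [4.0, 2.0]]
new_itos = ["xxunk", "patient", "biopsy"]

new_rows = remap_embedding(old_rows, old_itos, new_itos)
print(new_rows)
```

In a real model the same remapping would have to be applied to every weight whose first dimension is the vocabulary size (the encoder embedding and the decoder weight and bias), and the old `itos` list would need to be saved alongside the weights. fastai v1 appears to include a helper along these lines, `convert_weights` in `fastai.text.learner`, but please check it against your installed version before relying on it.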