Hi,
I'm using fastai version 1.0.61.
I have an issue implementing ULMFiT for text classification.
I trained a language model (LM) and saved the encoder.
Then, when loading the encoder into the classifier (created from the same df_trn and df_val, with the vocab of the LM data), I can load the encoder with the default config, but not when `bidir` is changed to True.
I did not train the LM with config['bidir']=True, as that would be incorrect: the LM predicts the next word, so it should not read the sequence in reverse and see future tokens, if my understanding is correct.
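For reference, both default configs are unidirectional; a quick check (assuming the usual fastai.text imports):

```python
from fastai.text import awd_lstm_lm_config, awd_lstm_clas_config

# Both the LM and classifier configs default to a unidirectional model
print(awd_lstm_lm_config['bidir'], awd_lstm_clas_config['bidir'])  # False False
```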
Here’s my code:
```python
from fastai.text import *

# The LM was previously trained on this data with the default LM config,
# and its encoder was saved as 'my_encoder'
data_lm = TextLMDataBunch.from_df(train_df=df_trn, valid_df=df_val, path="")

data_clf = TextClasDataBunch.from_df(path="", train_df=df_trn, valid_df=df_val,
                                     vocab=data_lm.train_ds.vocab, bs=32)

config = awd_lstm_clas_config.copy()
config['bidir'] = True

learn_clf = text_classifier_learner(data_clf, AWD_LSTM, drop_mult=1.0, config=config)
learn_clf.load_encoder('my_encoder')
```
I get this error on the last line:

```
RuntimeError: Error(s) in loading state_dict for AWD_LSTM:
	Missing key(s) in state_dict: "rnns.0.module.weight_ih_l0_reverse", "rnns.0.module.weight_hh_l0_reverse", "rnns.0.module.bias_ih_l0_reverse", "rnns.0.module.bias_hh_l0_reverse", "rnns.1.module.weight_ih_l0_reverse", "rnns.1.module.weight_hh_l0_reverse", "rnns.1.module.bias_ih_l0_reverse", "rnns.1.module.bias_hh_l0_reverse", "rnns.2.module.weight_ih_l0_reverse", "rnns.2.module.weight_hh_l0_reverse", "rnns.2.module.bias_ih_l0_reverse", "rnns.2.module.bias_hh_l0_reverse".
```

followed by size mismatch errors for the keys that do exist.
With config['bidir']=False I don't get the error.
I think I understand the logic: config['bidir']=True changes the architecture, so the saved encoder no longer matches. Each LSTM layer now expects extra *_reverse weights, and (if I read the fastai source correctly) the hidden size is split between the two directions, which would also explain the size mismatches. I don't think the tokenization itself changes.
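To check that, I compared the state dicts of a unidirectional and a bidirectional AWD_LSTM directly. A minimal sketch (the tiny vocab size is arbitrary, just for inspection):

```python
from fastai.text import AWD_LSTM

# Same sizes as the default config, tiny vocab just for inspection
uni = AWD_LSTM(vocab_sz=100, emb_sz=400, n_hid=1152, n_layers=3, bidir=False)
bi  = AWD_LSTM(vocab_sz=100, emb_sz=400, n_hid=1152, n_layers=3, bidir=True)

# The keys the error complains about exist only in the bidirectional model
print(sorted(set(bi.state_dict()) - set(uni.state_dict())))

# The shared keys also differ in shape: the hidden size is split across
# the two directions, so none of the saved encoder weights fit
print(uni.state_dict()['rnns.0.module.weight_hh_l0'].shape)
print(bi.state_dict()['rnns.0.module.weight_hh_l0'].shape)
```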
What I don't understand is why this worked for me in the past but doesn't anymore.
Is there a way to train a bidirectional LSTM classifier using my encoder, which was trained on the same data?
What am I missing?
I tried everything in this thread, but nothing seems to work. At the very end of it I saw someone ask the same question, with no reply.
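The only alternative I can think of is to drop bidir=True and instead train a separate backward model, since the ULMFiT paper ensembles a forward and a backward model rather than using a bidirectional encoder. A rough, untested sketch of what I mean ('my_encoder_bwd' and the learner names are just placeholders, and I'm assuming the backwards flag is passed through from_df):

```python
# Backward LM: backwards=True feeds the tokens in reverse order
data_lm_bwd = TextLMDataBunch.from_df(train_df=df_trn, valid_df=df_val, path="",
                                      backwards=True)
learn_lm_bwd = language_model_learner(data_lm_bwd, AWD_LSTM, drop_mult=1.0)
learn_lm_bwd.fit_one_cycle(1)
learn_lm_bwd.save_encoder('my_encoder_bwd')

# Backward classifier with the default (unidirectional) config,
# so the backward encoder's shapes line up
data_clf_bwd = TextClasDataBunch.from_df(path="", train_df=df_trn, valid_df=df_val,
                                         vocab=data_lm_bwd.train_ds.vocab, bs=32,
                                         backwards=True)
learn_clf_bwd = text_classifier_learner(data_clf_bwd, AWD_LSTM, drop_mult=1.0)
learn_clf_bwd.load_encoder('my_encoder_bwd')

# After training both classifiers, average their predictions at inference
preds_fwd, y = learn_clf.get_preds(ordered=True)
preds_bwd, _ = learn_clf_bwd.get_preds(ordered=True)
preds = (preds_fwd + preds_bwd) / 2
```

But I'd still prefer to understand whether bidir=True can work at all with a saved encoder.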
Thanks for your help!