Hi,
I'm using fastai version 1.0.61.
I have an issue implementing ULMFiT for text classification.
I trained a language model (LM) and saved the encoder.
Then, when loading the encoder into the classifier (created from the same df_trn and df_val, with the vocab of the LM data), I can load the encoder with the default config, but not when `bidir` is changed to True.
I did not train the LM with config['bidir']=True, as that would be incorrect: the LM predicts the next word, so it should not read the sequence in reverse and see future tokens, if my understanding is correct.
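For reference, both default configs are unidirectional; a quick check (assuming the usual fastai.text imports):

```python
from fastai.text import awd_lstm_lm_config, awd_lstm_clas_config

# Both the LM and classifier configs default to a unidirectional model
print(awd_lstm_lm_config['bidir'], awd_lstm_clas_config['bidir'])  # False False
```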
Here’s my code:
```python
from fastai.text import *

# The LM was previously trained on this data with the default LM config,
# and its encoder was saved as 'my_encoder'
data_lm = TextLMDataBunch.from_df(train_df=df_trn, valid_df=df_val, path="")

data_clf = TextClasDataBunch.from_df(path="", train_df=df_trn, valid_df=df_val,
                                     vocab=data_lm.train_ds.vocab, bs=32)

config = awd_lstm_clas_config.copy()
config['bidir'] = True

learn_clf = text_classifier_learner(data_clf, AWD_LSTM, drop_mult=1.0, config=config)
learn_clf.load_encoder('my_encoder')
```
I get this error on the last line:

```
RuntimeError: Error(s) in loading state_dict for AWD_LSTM:
	Missing key(s) in state_dict: "rnns.0.module.weight_ih_l0_reverse", "rnns.0.module.weight_hh_l0_reverse", "rnns.0.module.bias_ih_l0_reverse", "rnns.0.module.bias_hh_l0_reverse", "rnns.1.module.weight_ih_l0_reverse", "rnns.1.module.weight_hh_l0_reverse", "rnns.1.module.bias_ih_l0_reverse", "rnns.1.module.bias_hh_l0_reverse", "rnns.2.module.weight_ih_l0_reverse", "rnns.2.module.weight_hh_l0_reverse", "rnns.2.module.bias_ih_l0_reverse", "rnns.2.module.bias_hh_l0_reverse".
```

followed by size mismatch errors for the keys that do exist.
With config['bidir']=False I don't get the error.
I think I understand the logic: config['bidir']=True changes the architecture, so the saved encoder no longer matches. Each LSTM layer now expects extra *_reverse weights, and (if I read the fastai source correctly) the hidden size is split between the two directions, which would also explain the size mismatches. I don't think the tokenization itself changes.
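To check that, I compared the state dicts of a unidirectional and a bidirectional AWD_LSTM directly. A minimal sketch (the tiny vocab size is arbitrary, just for inspection):

```python
from fastai.text import AWD_LSTM

# Same sizes as the default config, tiny vocab just for inspection
uni = AWD_LSTM(vocab_sz=100, emb_sz=400, n_hid=1152, n_layers=3, bidir=False)
bi  = AWD_LSTM(vocab_sz=100, emb_sz=400, n_hid=1152, n_layers=3, bidir=True)

# The keys the error complains about exist only in the bidirectional model
print(sorted(set(bi.state_dict()) - set(uni.state_dict())))

# The shared keys also differ in shape: the hidden size is split across
# the two directions, so none of the saved encoder weights fit
print(uni.state_dict()['rnns.0.module.weight_hh_l0'].shape)
print(bi.state_dict()['rnns.0.module.weight_hh_l0'].shape)
```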
What I don't understand is why this worked for me in the past but doesn't anymore.
Is there a way to train a bidirectional LSTM classifier using my encoder, which was trained on the same data?
What am I missing?
I tried everything in this thread, but nothing seems to work. At the very end of it I saw someone ask the same question, with no reply.
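The only alternative I can think of is to drop bidir=True and instead train a separate backward model, since the ULMFiT paper ensembles a forward and a backward model rather than using a bidirectional encoder. A rough, untested sketch of what I mean ('my_encoder_bwd' and the learner names are just placeholders, and I'm assuming the backwards flag is passed through from_df):

```python
# Backward LM: backwards=True feeds the tokens in reverse order
data_lm_bwd = TextLMDataBunch.from_df(train_df=df_trn, valid_df=df_val, path="",
                                      backwards=True)
learn_lm_bwd = language_model_learner(data_lm_bwd, AWD_LSTM, drop_mult=1.0)
learn_lm_bwd.fit_one_cycle(1)
learn_lm_bwd.save_encoder('my_encoder_bwd')

# Backward classifier with the default (unidirectional) config,
# so the backward encoder's shapes line up
data_clf_bwd = TextClasDataBunch.from_df(path="", train_df=df_trn, valid_df=df_val,
                                         vocab=data_lm_bwd.train_ds.vocab, bs=32,
                                         backwards=True)
learn_clf_bwd = text_classifier_learner(data_clf_bwd, AWD_LSTM, drop_mult=1.0)
learn_clf_bwd.load_encoder('my_encoder_bwd')

# After training both classifiers, average their predictions at inference
preds_fwd, y = learn_clf.get_preds(ordered=True)
preds_bwd, _ = learn_clf_bwd.get_preds(ordered=True)
preds = (preds_fwd + preds_bwd) / 2
```

But I'd still prefer to understand whether bidir=True can work at all with a saved encoder.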
Thanks for your help!