ULMFiT without fine-tuning

cudawarped · December 6, 2018, 5:30pm

Hi,

I am trying to use a language model which I have trained on wiki-103 to perform classification.
Everything appears to be working well if I follow the imdb example, fine tune the language model and load its encoder into the classification model. I get an f1 score of ~84%.

I then wanted to see what the result was if I didn’t fine tune the language model, that is if I use the wiki-103 language model directly. I tried loading it with

learn = text_classifier_learner(data_class, drop_mult=0.5)
learn.load_pretrained('best_model.pth', 'dict.pkl')

with the wiki-103 model trained using data_lm and saved as

learn_lm.save('best_model')

and the dict saved as

# Save the language model vocabulary (index-to-string list)
with open(path_lang_model/'models/dict.pkl', 'wb') as pickle_out:
    pickle.dump(data_lm.vocab.itos, pickle_out)

but I get the following error, so I assume this is not the correct approach:

RuntimeError: Error(s) in loading state_dict for SequentialRNN:
	Missing key(s) in state_dict: "1.layers.0.weight", "1.layers.0.bias", "1.layers.0.running_mean", "1.layers.0.running_var", "1.layers.2.weight", "1.layers.2.bias", "1.layers.4.weight", "1.layers.4.bias", "1.layers.4.running_mean", "1.layers.4.running_var", "1.layers.6.weight", "1.layers.6.bias". 
	Unexpected key(s) in state_dict: "1.decoder.weight", "1.decoder.bias".
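
For readers hitting the same message, here is a minimal sketch of what a strict state_dict load is complaining about: the classifier head expects `1.layers.*` parameters, while the language model checkpoint carries `1.decoder.*` instead. The helper `diff_keys` is hypothetical, only a few keys from the error are shown, and `0.encoder.weight` is an assumed example of a key the two models share.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way a strict load would report them."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model wants them, checkpoint lacks them
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has them, model does not
    return missing, unexpected

# Subset of keys taken from the error above; "0.encoder.weight" is an assumption.
classifier_keys = ["0.encoder.weight", "1.layers.0.weight", "1.layers.0.bias"]
lm_checkpoint_keys = ["0.encoder.weight", "1.decoder.weight", "1.decoder.bias"]

missing, unexpected = diff_keys(classifier_keys, lm_checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

Only the encoder part overlaps between the two models, which is why loading just the encoder is the usual route.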

I then tried loading just the encoder, but this fails because the vocab sizes differ: 30,004 for the language model and only 900 for my classification task.
What is the correct way to load a non-fine-tuned language model into a text_classifier_learner()?

If I don’t load anything then I still get an f1 score of ~80%. I am confused as to what is happening if the encoder is not loaded. If the model just uses default initialization then it appears to do pretty well. If the encoder is not loaded into the text_classifier_learner() I am assuming I am not using transfer learning, is this correct?

Serbulent · March 1, 2019, 6:52am

Were you able to solve this issue? I get the same error when loading the wikitext pretrained model.

cudawarped · March 1, 2019, 11:32am

Are you trying to load the fastai wiki103 English model directly into a text_classifier_learner without fine-tuning, or are you getting this error in the imdb notebook?

If it is the former, and you have your classification data loaded in a similar way to the following

txt_proc = [TokenizeProcessor(tokenizer=Tokenizer(lang='en')), NumericalizeProcessor()]
data_class_lm = (TextList.from_df(df, path_class, cols=1, processor=txt_proc)
     .split_by_idxs(trn_idx,val_idx)
     .label_for_lm()
     .databunch(bs=bs,num_workers=0))

you should be able to load the default model, map it to your vocabulary inside data_class_lm and save the mapped encoder as below.

learn = language_model_learner(data_class_lm,AWD_LSTM, drop_mult=0.3)
learn.save_encoder('wk103_en_enc')

Then you should be able to create a text_classifier_learner for the same data and load the encoder.

# Reuse the language model vocab and the same train/validation split as above
data_class = (TextList.from_df(df, path_class, cols=1, vocab=data_class_lm.train_ds.vocab, processor=txt_proc)
    .split_by_idxs(trn_idx, val_idx)
    .label_from_df(cols=2)
    .databunch(bs=bs, num_workers=0))
learn = text_classifier_learner(data_class,AWD_LSTM,drop_mult=0.5)
learn.load_encoder('wk103_en_enc')

Depending on your version of fastai, this may not work exactly as written. It works for me on version 1.0.46.
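
For intuition on the vocabulary mapping step, here is a minimal pure-Python sketch of remapping a pretrained embedding table to a new, smaller vocab. As far as I understand, fastai does something along these lines when it loads the pretrained weights, but this is a simplified illustration and not the library's code. The function `remap_embeddings` and the toy tokens and vectors are hypothetical. Tokens found in the old vocab keep their pretrained row; unseen tokens fall back to the mean of all pretrained rows.

```python
def remap_embeddings(old_itos, old_vectors, new_itos):
    """Build an embedding table ordered by new_itos from rows indexed by old_itos."""
    old_stoi = {tok: i for i, tok in enumerate(old_itos)}
    dim = len(old_vectors[0])
    # Fallback row for tokens that never appeared in the pretrained vocab
    mean_vec = [sum(v[d] for v in old_vectors) / len(old_vectors) for d in range(dim)]
    return [
        list(old_vectors[old_stoi[tok]]) if tok in old_stoi else list(mean_vec)
        for tok in new_itos
    ]

# Toy data: a 3-token "pretrained" vocab with 2-dimensional embeddings
old_itos = ["the", "movie", "was"]
old_vectors = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
new_itos = ["movie", "xxnew", "the"]  # "xxnew" is not in the old vocab

new_vectors = remap_embeddings(old_itos, old_vectors, new_itos)
print(new_vectors)
```

After remapping, the table has one row per token in the new vocab, so the saved encoder matches the classifier's embedding size.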

Serbulent · March 1, 2019, 2:22pm

Thanks for your reply. I am just trying to figure out how to use NLP transfer learning with the imdb example. I think I have also found another solution, shown below:

preTrainedWt103Path = DATA_PATH/'models/wt103'
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5, pretrained=False)
learn.load_pretrained(wgts_fname=preTrainedWt103Path/'fwd_wt103.h5', itos_fname=preTrainedWt103Path/'itos_wt103.pkl', strict=False)