Loading pretrained models in fastai v1 (ULMFiT)

Hi, I would like to load my previously pretrained model in fastai v1. However, fastai wants to load a single .pth file, while I have two files (lm.h5 and lm_enc.h5). What should I do to be able to load my model in fastai v1?

I think you’re going to have to retrain your LM on the new v1 stack. Not sure, but I think the model has been updated.

Did you check this doc? fastai v1 provides a model download function.
http://docs.fast.ai/text.html#Fine-tuning-a-language-model

I did check this documentation. My issue is not downloading a model. My dataset is in French, so I pretrained an LM on French Wikipedia using the old imdb scripts. I would like to reuse this model rather than train it again (since it takes a long time to train, and as such costs money). The save of my old model consists of three files: lm_enc.h5, lm.h5, and itos.pkl. The model-importing function in fastai takes two arguments: one .pth file with the model, and the itos.pkl. I would like to convert my two .h5 files into the right format so I can use my model, and I can't find any info in the docs on how to do that.

If you look at the wt103 model used in the ULMFiT article, when you download it from the fastai courses you also get two .h5 files, but if you download it from the link in the new documentation you get one .pth file. So I assume it's possible to get from one format to the other, and I would just like help on how to proceed.


Sorry for my misunderstanding.

Roughly speaking, lstm_wt103.pth is the counterpart of lm.h5, so the first thing to do is change the extension from .h5 to .pth.

However, the fastai v1 ULMFiT model expects the loaded weight dictionary to have a 1.decoder.bias key, which the previously provided weight dictionary does not have. So you would need to add this key to your weights somehow, but at this moment I have no idea how to do that.
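One possibility (an untested assumption on my part, not anything the library documents): since only the bias is missing, you might initialize it to zeros, with one entry per vocabulary item, so that the state dict at least loads and the bias gets learned during fine-tuning. A sketch, using hypothetical old_wgts/new_wgts dicts like the ones in the conversion function later in this thread:

import torch

# hypothetical: the encoder embedding has one row per vocab token,
# so a zero bias of that length matches the decoder output size
vocab_size = old_wgts['0.encoder.weight'].shape[0]
new_wgts['1.decoder.bias'] = torch.zeros(vocab_size)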

The keys of lm.h5 and lstm_wt103.pth are below:

# lm.h5
 '0.encoder.weight',
 '0.encoder_with_dropout.embed.weight',
 '0.rnns.0.module.weight_ih_l0',
 '0.rnns.0.module.bias_ih_l0',
 '0.rnns.0.module.bias_hh_l0',
 '0.rnns.0.module.weight_hh_l0_raw',
 '0.rnns.1.module.weight_ih_l0',
 '0.rnns.1.module.bias_ih_l0',
 '0.rnns.1.module.bias_hh_l0',
 '0.rnns.1.module.weight_hh_l0_raw',
 '0.rnns.2.module.weight_ih_l0',
 '0.rnns.2.module.bias_ih_l0',
 '0.rnns.2.module.bias_hh_l0',
 '0.rnns.2.module.weight_hh_l0_raw',
 '1.decoder.weight'

# lstm_wt103.pth
 '0.encoder.weight',
 '0.encoder_dp.emb.weight',
 '0.rnns.0.weight_hh_l0_raw',
 '0.rnns.0.module.weight_ih_l0',
 '0.rnns.0.module.weight_hh_l0',
 '0.rnns.0.module.bias_ih_l0',
 '0.rnns.0.module.bias_hh_l0',
 '0.rnns.1.weight_hh_l0_raw',
 '0.rnns.1.module.weight_ih_l0',
 '0.rnns.1.module.weight_hh_l0',
 '0.rnns.1.module.bias_ih_l0',
 '0.rnns.1.module.bias_hh_l0',
 '0.rnns.2.weight_hh_l0_raw',
 '0.rnns.2.module.weight_ih_l0',
 '0.rnns.2.module.weight_hh_l0',
 '0.rnns.2.module.bias_ih_l0',
 '0.rnns.2.module.bias_hh_l0',
 '1.decoder.weight',
 '1.decoder.bias'
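(For reference, both the old .h5 files and the new .pth files are plain torch-serialized state dicts, so you can list the keys yourself. A minimal sketch, with hypothetical paths:)

import torch

# print every key in each saved state dict
for path in ['lm.h5', 'lstm_wt103.pth']:
    wgts = torch.load(path, map_location='cpu')
    print(path, list(wgts.keys()))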

Here is a function that can be used to convert a .h5 model to a .pth model. It might help someone here.

import torch
from collections import OrderedDict

def convert(path_to_old_model, path_to_save_converted_model):
    """
    Convert an old fastai .h5 language model to the fastai v1 .pth format.
    path_to_old_model is the path to the old model file;
    path_to_save_converted_model is the directory where the converted model is stored.
    """
    # despite the .h5 extension, the old file is a torch-serialized state dict
    old_wgts = torch.load(path_to_old_model, map_location=lambda storage, loc: storage)
    new_wgts = OrderedDict()
    new_wgts['encoder.weight'] = old_wgts['0.encoder.weight']
    new_wgts['encoder_dp.emb.weight'] = old_wgts['0.encoder_with_dropout.embed.weight']
    for i in range(3):
        # the v1 WeightDropout wrapper keeps both a raw and an effective copy of the
        # hidden-to-hidden weights, so the old *_raw tensor fills both keys
        new_wgts[f'rnns.{i}.weight_hh_l0_raw'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.weight_ih_l0'] = old_wgts[f'0.rnns.{i}.module.weight_ih_l0']
        new_wgts[f'rnns.{i}.module.weight_hh_l0'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.bias_ih_l0'] = old_wgts[f'0.rnns.{i}.module.bias_ih_l0']
        new_wgts[f'rnns.{i}.module.bias_hh_l0'] = old_wgts[f'0.rnns.{i}.module.bias_hh_l0']
    # note: '1.decoder.weight' and '1.decoder.bias' are not carried over here
    torch.save(new_wgts, path_to_save_converted_model + 'converted_model.pth')
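For example (hypothetical paths; note the destination is used as a string prefix, so end it with a slash):

convert('lm.h5', './')  # writes ./converted_model.pth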

Thanks for that!

Before seeing @zubair1.shah's answer, I decided to retrain my network from scratch in v1. When it finished, I saved the model, creating a .pth file.

But to fine-tune this model on another corpus I need not only the model but also the itos. Where is the itos saved?


Here:

http://files.fast.ai/models/wt103/
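If you trained your own LM in v1, you can also dump the vocab yourself. A minimal sketch, assuming your language-model data bunch is named data_lm:

import pickle

# the v1 vocab keeps its index-to-token mapping in .itos
with open('itos.pkl', 'wb') as f:
    pickle.dump(data_lm.vocab.itos, f)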

@fredguth,

have you tried using fastai v1 to create a language model that predicts the next word? (or the next n words)

Thanks in advance

Yes, exactly that.

I am struggling with that. Can you share your notebook?

Specifically, the line where the model is defined (maybe you used something like learn = RNNLearner.language_model(data, pretrained_model=URLs.WT103, drop_mult=0.5)).

Secondly, how do you get a prediction of the next word given the input 'Hello, I am Fred'?

Thank you in anticipation
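(For what it's worth: in fastai v1, a language-model learner can continue a prompt directly. A minimal sketch, assuming a trained LanguageLearner named learn:)

# predict appends n_words sampled tokens to the prompt and returns the text
print(learn.predict('Hello, I am Fred', n_words=1))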

In your second paragraph, you say "if you download it from the link in the new documentation you get one .pth file". Can you please share the link here? I'm looking for the .pth file that has the pretrained weights for wikitext-103.
Thank you.

Hi Zubair,

First up, thank you for your function; it worked for me. But I have a couple of questions that I would like more clarity on.

May I ask what the purpose is of mapping old_wgts['0.rnns.0.module.weight_hh_l0_raw'] to both new_wgts['rnns.0.weight_hh_l0_raw'] and new_wgts['rnns.0.module.weight_hh_l0']?

Why are we not mapping old_wgts['1.decoder.weight'] to new_wgts['1.decoder.weight']? And why have we completely ignored '1.decoder.bias'?

Thank you.

Hi @fredguth, when you say 'saved the model', how do you do this in v1? Is it the encoder layer that you save? Thanks in advance.

learn.save and learn.load
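For example (the names here are hypothetical):

learn.save('ft_lm')           # writes models/ft_lm.pth under the learner's data path
learn.load('ft_lm')           # load it back later
learn.save_encoder('ft_enc')  # save just the encoder, for reuse in a classifier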

I wrote this function based on the instructions provided by @sgugger. So, I think he might be the right person to answer your questions.

Hi, did you get the link? I'm also looking for the .pth file with the 1.decoder.bias from WT103.

Thank you!

Hi Sam,

did you get any response? Can I ask how it was solved, please? I have the same question.

What can replace the URL in pretrained_model=URLs.WT103?
Thank you!

@zubair1.shah, I see. Thank you. I will check with the respective user.