Loading pretrained models in fastai v1 (ULMFiT)

Hi, I would like to load my previously pretrained model in fastai v1. However, fastai wants to load a single .pth file, while I have two files (lm.h5 and lm_enc.h5). What should I do to be able to load my model in fastai v1?

I think you’re going to have to retrain your LM on the new v1 stack. Not sure, but I think the model has been updated.

Did you check this doc? fastai v1 provides a model download function.
http://docs.fast.ai/text.html#Fine-tuning-a-language-model

I did check this documentation. My issue is not downloading a model. My dataset is in French, so I pretrained an LM on French Wikipedia using the old imdb scripts. I would like to reuse this model rather than train it again (since it takes a long time to train, and as such costs money). The save of my old model consists of three files: lm_enc.h5, lm.h5, and itos.pkl. The model-importing function in fastai takes two arguments: one .pth file with the model, and the itos.pkl. I would like to convert my two .h5 files into the right format so I can use my model, and I can't find any info in the docs on how to do that.

If you look at the wt103 model used in the ULMFiT article, when you download it from the fastai courses you also get two .h5 files, but if you download it from the link in the new documentation you get one .pth file. So I assume it's possible to get from one format to the other, and I would just like help on how to proceed.


Sorry for my misunderstanding.

Roughly speaking, lstm_wt103.pth is the counterpart of lm.h5, so the first thing to do is change the extension from .h5 to .pth.

However, the fastai v1 ULMFiT model expects the loaded weight dictionary to have a 1.decoder.bias key, which the previously provided weight dictionary does not have. So you would need to add this key to your weights somehow, but at this moment I have no idea how to do that.
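One possibility (an untested assumption on my part, not anything the library documents): since only the bias is missing, you might initialize it to zeros, with one entry per vocabulary item, so that the state dict at least loads and the bias gets learned during fine-tuning. A sketch, using hypothetical old_wgts/new_wgts dicts like the ones in the conversion function later in this thread:

import torch

# hypothetical: the encoder embedding has one row per vocab token,
# so a zero bias of that length matches the decoder output size
vocab_size = old_wgts['0.encoder.weight'].shape[0]
new_wgts['1.decoder.bias'] = torch.zeros(vocab_size)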

The keys of lm.h5 and lstm_wt103.pth are below:

# lm.h5
 '0.encoder.weight',
 '0.encoder_with_dropout.embed.weight',
 '0.rnns.0.module.weight_ih_l0',
 '0.rnns.0.module.bias_ih_l0',
 '0.rnns.0.module.bias_hh_l0',
 '0.rnns.0.module.weight_hh_l0_raw',
 '0.rnns.1.module.weight_ih_l0',
 '0.rnns.1.module.bias_ih_l0',
 '0.rnns.1.module.bias_hh_l0',
 '0.rnns.1.module.weight_hh_l0_raw',
 '0.rnns.2.module.weight_ih_l0',
 '0.rnns.2.module.bias_ih_l0',
 '0.rnns.2.module.bias_hh_l0',
 '0.rnns.2.module.weight_hh_l0_raw',
 '1.decoder.weight'

# lstm_wt103.pth
 '0.encoder.weight',
 '0.encoder_dp.emb.weight',
 '0.rnns.0.weight_hh_l0_raw',
 '0.rnns.0.module.weight_ih_l0',
 '0.rnns.0.module.weight_hh_l0',
 '0.rnns.0.module.bias_ih_l0',
 '0.rnns.0.module.bias_hh_l0',
 '0.rnns.1.weight_hh_l0_raw',
 '0.rnns.1.module.weight_ih_l0',
 '0.rnns.1.module.weight_hh_l0',
 '0.rnns.1.module.bias_ih_l0',
 '0.rnns.1.module.bias_hh_l0',
 '0.rnns.2.weight_hh_l0_raw',
 '0.rnns.2.module.weight_ih_l0',
 '0.rnns.2.module.weight_hh_l0',
 '0.rnns.2.module.bias_ih_l0',
 '0.rnns.2.module.bias_hh_l0',
 '1.decoder.weight',
 '1.decoder.bias'
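(For reference, both the old .h5 files and the new .pth files are plain torch-serialized state dicts, so you can list the keys yourself. A minimal sketch, with hypothetical paths:)

import torch

# print every key in each saved state dict
for path in ['lm.h5', 'lstm_wt103.pth']:
    wgts = torch.load(path, map_location='cpu')
    print(path, list(wgts.keys()))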

Here is a function that can be used to convert a .h5 model to a .pth model. It might help someone here.

import torch
from collections import OrderedDict

def convert(path_to_old_model, path_to_save_converted_model):
    """
    Convert an old fastai .h5 language model to the fastai v1 .pth format.
    path_to_old_model is the path to the old model file;
    path_to_save_converted_model is the directory where the converted model is stored.
    """
    # despite the .h5 extension, the old file is a torch-serialized state dict
    old_wgts = torch.load(path_to_old_model, map_location=lambda storage, loc: storage)
    new_wgts = OrderedDict()
    new_wgts['encoder.weight'] = old_wgts['0.encoder.weight']
    new_wgts['encoder_dp.emb.weight'] = old_wgts['0.encoder_with_dropout.embed.weight']
    for i in range(3):
        # the v1 WeightDropout wrapper keeps both a raw and an effective copy of the
        # hidden-to-hidden weights, so the old *_raw tensor fills both keys
        new_wgts[f'rnns.{i}.weight_hh_l0_raw'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.weight_ih_l0'] = old_wgts[f'0.rnns.{i}.module.weight_ih_l0']
        new_wgts[f'rnns.{i}.module.weight_hh_l0'] = old_wgts[f'0.rnns.{i}.module.weight_hh_l0_raw']
        new_wgts[f'rnns.{i}.module.bias_ih_l0'] = old_wgts[f'0.rnns.{i}.module.bias_ih_l0']
        new_wgts[f'rnns.{i}.module.bias_hh_l0'] = old_wgts[f'0.rnns.{i}.module.bias_hh_l0']
    # note: '1.decoder.weight' and '1.decoder.bias' are not carried over here
    torch.save(new_wgts, path_to_save_converted_model + 'converted_model.pth')
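For example (hypothetical paths; note the destination is used as a string prefix, so end it with a slash):

convert('lm.h5', './')  # writes ./converted_model.pth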

Thanks for that!

Before seeing @zubair1.shah's answer, I decided to retrain my network from scratch in v1. When it finished, I saved the model, creating a .pth file.

But to fine-tune this model on another corpus I need not only the model but also the itos. Where is the itos saved?


Here:

http://files.fast.ai/models/wt103/
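If you trained your own LM in v1, you can also dump the vocab yourself. A minimal sketch, assuming your language-model data bunch is named data_lm:

import pickle

# the v1 vocab keeps its index-to-token mapping in .itos
with open('itos.pkl', 'wb') as f:
    pickle.dump(data_lm.vocab.itos, f)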

@fredguth,

have you tried using fastai v1 to create a language model that predicts the next word? (or the next n words)

Thanks in advance

Yes, exactly that.

I am struggling with that. Can you share your notebook?

Specifically, the line where the model is defined (maybe you used something like learn = RNNLearner.language_model(data, pretrained_model=URLs.WT103, drop_mult=0.5)).

Secondly, how do you get a prediction of the next word given the input 'Hello, I am Fred'?

Thank you in anticipation
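(For what it's worth: in fastai v1, a language-model learner can continue a prompt directly. A minimal sketch, assuming a trained LanguageLearner named learn:)

# predict appends n_words sampled tokens to the prompt and returns the text
print(learn.predict('Hello, I am Fred', n_words=1))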

In your second paragraph, you say "if you download it from the link in the new documentation you get one .pth file". Can you please share the link here? I'm looking for the .pth file that has the pretrained weights for wikitext-103.
Thank you.

Hi Zubair,

First up, thank you for your function; it worked for me. But I have a couple of questions that I would like more clarity on.

May I ask what the purpose is of mapping old_wgts['0.rnns.0.module.weight_hh_l0_raw'] to both new_wgts['rnns.0.weight_hh_l0_raw'] and new_wgts['rnns.0.module.weight_hh_l0']?

Why are we not mapping old_wgts['1.decoder.weight'] to new_wgts['1.decoder.weight']? And why have we completely ignored '1.decoder.bias'?

Thank you.

Hi @fredguth, when you say 'saved the model', how do you do this in v1? Is it the encoder layer that you save? Thanks in advance.

learn.save and learn.load
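For example (the names here are hypothetical):

learn.save('ft_lm')           # writes models/ft_lm.pth under the learner's data path
learn.load('ft_lm')           # load it back later
learn.save_encoder('ft_enc')  # save just the encoder, for reuse in a classifier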

I wrote this function based on the instructions provided by @sgugger. So, I think he might be the right person to answer your questions.

Hi, did you get the link? I'm also looking for the .pth file with the 1.decoder.bias from WT103.

Thank you!

Hi Sam,

did you get any response? Can I ask how it was solved, please? I have the same question.

What can replace the URL in pretrained_model=URLs.WT103?
Thank you!

@zubair1.shah, I see. Thank you. I will check with the respective user.