I’m following the quick overview at docs.fast.ai/text.html and got an SSLError of ‘certificate verify failed’ when running:
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.5)
probably because of a bad interaction between my company firewall and the amazonaws SSL certificate management.
My question is:
Is there a way I can load the pre-trained weights from a local file rather than going through the amazonaws URL?
I’ve tried specifying the pretrained_fnames parameter as in this old topic, but then got a “missing 1 required positional argument: ‘arch’” error, so it seems I need to specify the local file path directly in the value of the arch parameter somehow; alas, I couldn’t figure out how.
You still need to pass an architecture, which is a callable returning your model; then you can pass pretrained_fnames as a named argument. The full signature is in the docs.
Thanks for your reply @sgugger. Adding AWD_LSTM triggers the download from amazonaws, throwing an exception in my case because of the firewall issues. So it seems language_model_learner never gets to run the if pretrained_fnames is not None statement.
That’s because you have to pass pretrained=False if you don’t want to trigger the download.
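To see why pretrained_fnames alone wasn’t enough, here is a toy stdlib-only simulation of the control flow the replies above imply (resolve_weight_source is a made-up name, not fastai API; the real learner factory differs in detail): the download branch runs whenever pretrained is True, before pretrained_fnames is ever consulted.

```python
def resolve_weight_source(pretrained=True, pretrained_fnames=None):
    """Simulate which branches the learner factory takes (assumption from the thread)."""
    actions = []
    if pretrained:
        # Hits the AWS URL -- this is what fails behind the firewall.
        actions.append("download")
    if pretrained_fnames is not None:
        # Loads weights + itos from the local model directory.
        actions.append("load_local")
    return actions
```

So passing pretrained_fnames with the default pretrained=True still attempts the download first, while pretrained=False plus pretrained_fnames goes straight to the local files.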
That did the job, thank you @sgugger.
I’m running into exactly the same issue. Just to clarify, if I have downloaded the pretrained weights and itos_wt103.pkl files to .fastai/models, is the following line of code correct, or do I need to point to the location of these pretrained models using pretrained_fnames?
learn = language_model_learner(data_lm, AWD_LSTM, pretrained=False, drop_mult=0.3)
You’ll have to also add the location using pretrained_fnames.
Thanks, I’ve grabbed the pretrained model and dictionary from here: http://files.fast.ai/models/wt103_v1/ (no problems with my company firewall), but I still haven’t had any joy when using pretrained_fnames= and pointing to the appropriate files (a size mismatch between model and dictionary, apparently).
What did you find worked for you in the end?
Hi, I don’t have access to my files at the moment. Tomorrow morning I’ll have a look and let you know. I remember finding two versions of those and only one worked.
Thanks, much appreciated!
Ok so, the pretrained weights and itos_wt103.pkl files I had been using have sizes 177,091,123 bytes and 1,027,823 bytes respectively. If I’m not mistaken I got them from https://www.kaggle.com/mnpinto/fastai-wt103-1 (at least the sizes of the files at that link coincide with what I have on my disk). I hope this helps.
Thanks! Those are the same as I had, and it turns out the issue was with the breaking changes mentioned here: Major new changes and features. The fix was:
config = awd_lstm_lm_config.copy()
config['n_hid'] = 1150
and then passing config=config into the language_model_learner call.
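The reason the .copy() matters can be shown with a plain dict (a stand-in below, since awd_lstm_lm_config in fastai v1 is essentially a dict of hyperparameters; the real one has more keys, and 1152 is my understanding of the newer default that broke the old wt103 weights):

```python
# Stand-in for fastai's awd_lstm_lm_config (the real dict has more keys).
awd_lstm_lm_config = dict(emb_sz=400, n_hid=1152, n_layers=3)

config = awd_lstm_lm_config.copy()   # copy first so the shared default is not mutated
config['n_hid'] = 1150               # older wt103 weights were trained with n_hid=1150
```

Mutating the copy keeps the library-wide default intact for any other learner you build later in the same session.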
Ok cool, thanks to you then - I didn’t know about it, I’ll have to fix my own code when I resume working on that.
No problem. Don’t suppose you remember how you got around the firewall issue with text_classifier_learner (rather than language_model_learner)? It has a pretrained= parameter but no pretrained_fnames=. If you leave it with the defaults it tries to download the model from aws.
I’ve looked into my code, and back then it ran smoothly without specifying any pretrained_fnames parameter. If I’m not mistaken, at that stage you don’t need the files any more, because you should be taking advantage of the model you’ve just fine-tuned (starting from the pretrained one that was obtained via the files).
My code looks like this:
learn = text_classifier_learner(data_clas, AWD_LSTM, pretrained=False, drop_mult=0.5)
OK, cool. I was just a bit concerned because I had to specify pretrained=False to get around the AWS download, but then I wasn’t pointing to anything else.
This is what I’ve got (which does run):
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5, pretrained=False, config=config_clas)
Yeah I believe data_clas contains the current fine-tuned model.
Forget about it, obviously the fine-tuned model gets loaded into learn by the load_encoder call.
Anyway, yours is a legitimate doubt since it’s not clear what needs to be retrieved via AWS at this stage. It’s been a while since I played with this and honestly don’t know. Maybe other more experienced users will answer the question…
Thanks, good to know I’m not crazy for questioning this.
Hello, is there a solution to this problem yet?
In fastai v2, I am trying to use text_classifier_learner with the AWD_LSTM architecture and pretrained weights. However, due to a company firewall issue, fastai is not able to download the weights from the S3 bucket.
Is there a way I can download the pretrained weight files manually and point text_classifier_learner at them?
I appreciate your answers. I also found a solution to my problem, thanks!