I would like to work with the pretrained AWD-LSTM to analyze Wikipedia text extracts.
So the weights fitted to WikiText-103 are just what I need.
Reading the docs, it looks like you are supposed to load these models with language_model_learner. This is the only function that takes a URL for a pretrained model. But it also expects a DataBunch and will change the model's embedding to suit that vocabulary.
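To make clear what I mean by "changing the embedding": my understanding from reading the source is that the pretrained embedding rows get reordered to follow the new vocabulary, and tokens the pretrained vocabulary does not contain get the mean of the pretrained rows. Here is a toy sketch of that idea. Plain lists stand in for tensors, and `remap_embedding` and the token names are made up for illustration; this is not the fastai code.

```python
def remap_embedding(old_rows, old_itos, new_itos):
    """Reorder pretrained embedding rows to follow a new vocabulary.

    Hypothetical helper for illustration only: tokens missing from the
    pretrained vocabulary receive the mean of all pretrained rows.
    """
    old_stoi = {tok: i for i, tok in enumerate(old_itos)}
    dim = len(old_rows[0])
    # Column-wise mean over all pretrained rows.
    mean_row = [sum(row[d] for row in old_rows) / len(old_rows) for d in range(dim)]
    new_rows = []
    for tok in new_itos:
        idx = old_stoi.get(tok)
        new_rows.append(list(old_rows[idx]) if idx is not None else list(mean_row))
    return new_rows


# Toy data: a 4-token pretrained vocabulary with 2-dimensional embeddings.
old_itos = ["xxunk", "xxpad", "the", "wiki"]
old_rows = [[0.0, 0.0], [1.0, 1.0], [2.0, 4.0], [5.0, 3.0]]
new_itos = ["xxunk", "xxpad", "wiki", "fastai"]

new_rows = remap_embedding(old_rows, old_itos, new_itos)
```

In this toy case "wiki" keeps its pretrained row even though its index changes, while "fastai" gets the averaged row. That remapping is exactly what I want to avoid.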
I don’t want to change the embedding; I just want the pretrained WT103 model with its WikiText vocabulary. So I read through the model-loading code and came up with this solution, which loads a WT103 model without changing anything in it:
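import pickle
import numpy as np
import torch
from fastai.text import *  # assuming fastai v1: untar_data, URLs, get_language_model
# get weights and itos
model_path = untar_data(URLs.WT103, data=False)
fnames = [list(model_path.glob(f'*.{ext}'))[0] for ext in ['pth', 'pkl']]
wgts_fname, itos_fname = fnames
with open(itos_fname, 'rb') as f:
    itos = pickle.load(f)
# map_location keeps the loaded tensors on the CPU
wgts = torch.load(wgts_fname, map_location=lambda storage, loc: storage)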
# get parameters for language model
default_dropout = {'language': np.array([0.25, 0.1, 0.2, 0.02, 0.15]),
                   'classifier': np.array([0.4, 0.5, 0.05, 0.3, 0.4])}
drop_mult = 1.
tie_weights = True
bias = True
qrnn = False
dps = default_dropout['language'] * drop_mult
bptt = 70
vocab_size = len(itos)
emb_sz = 400
nh = 1150
nl = 3
pad_token = 1
model = get_language_model(vocab_size, emb_sz, nh, nl, pad_token, input_p=dps[0], output_p=dps[1],
                           weight_p=dps[2], embed_p=dps[3], hidden_p=dps[4], tie_weights=tie_weights, bias=bias,
                           qrnn=qrnn)
# load weights into model
model.load_state_dict(wgts)
Is this the right way to do it?
It seems strange that loading a model together with a dataset is a one-liner, but if you want just the model you have to dig into the code.
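One consequence of skipping the DataBunch is that text has to be turned into ids with the pretrained itos, otherwise the ids will not line up with the embedding rows. Below is a minimal sketch of that lookup. It assumes the unknown token sits at index 0 of itos (I have not verified this for the WT103 file), and `numericalize` and the toy vocabulary are names made up for the example. The tokens would also need to come from the same tokenization the pretrained model was trained with.

```python
def numericalize(tokens, itos, unk_idx=0):
    """Map tokens to ids using a pretrained itos list.

    Assumption: unk_idx is the position of the unknown token in itos.
    """
    stoi = {tok: i for i, tok in enumerate(itos)}
    return [stoi.get(tok, unk_idx) for tok in tokens]


# Toy vocabulary standing in for the pickled WT103 itos.
toy_itos = ["xxunk", "xxpad", "the", "cat", "sat"]
ids = numericalize(["the", "dog", "sat"], toy_itos)
```

Here "dog" is not in the toy vocabulary, so it falls back to the assumed unknown index.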