Clarification on lesson 4 wikitext pretrained model

I found lesson 4 incredibly interesting, but there’s something I’m not really sure about when it comes to using the WikiText-103 pretrained model.

In the notes we can see the learner for the language model being initialized this way:

learn = language_model_learner(data_lm, pretrained_model=URLs.WT103, drop_mult=0.3)

while in the notebook we find the same line in this form:

learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)

What exactly is going on?
Are we using WT103 or not?

I can’t see any file being downloaded, and when I try to substitute AWD_LSTM with pretrained_model=URLs.WT103 I get a KeyError.

Thanks for stopping by; hopefully somebody can clarify this for me.


So, if we follow the source code, we can see that after language_model_learner grabs the model, it looks up _model_meta for the particular arch. In our case that’s AWD_LSTM. If we follow that to the meta, AWD_LSTM has two URLs, a fwd and a bwd. Then if we follow the learner further (lines 208-216), it grabs either the fwd or bwd pretrained weights. So passing the arch class is enough; the WT103 weights are still downloaded behind the scenes. Does this help?
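To make the mechanism concrete, here’s a minimal self-contained sketch of that lookup (not the actual fastai source; the URLs and the pretrained_url helper are illustrative stand-ins). It also shows why passing something other than a registered arch class produces a KeyError, like the one you hit:

```python
# Illustrative sketch, NOT the real fastai code: the architecture class
# itself is the key into a metadata table holding the pretrained URLs.

class AWD_LSTM:
    """Stand-in for fastai's AWD_LSTM architecture class."""
    pass

# Hypothetical table mirroring fastai's _model_meta: each registered
# arch maps to its forward (fwd) and backward (bwd) weight URLs.
_model_meta = {
    AWD_LSTM: {
        "url": "wt103-fwd-weights-url",      # placeholder fwd URL
        "url_bwd": "wt103-bwd-weights-url",  # placeholder bwd URL
    }
}

def pretrained_url(arch, backwards=False):
    """Look up the pretrained-weight URL for an architecture class."""
    # An unregistered key (e.g. a URL string instead of an arch class)
    # raises KeyError here -- which matches the error in the question.
    meta = _model_meta[arch]
    return meta["url_bwd"] if backwards else meta["url"]
```

For example, `pretrained_url(AWD_LSTM)` returns the fwd URL, while `pretrained_url("URLs.WT103")` raises a KeyError because a plain string was never registered as an arch.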


That helps greatly, thank you very much!