I am following Lesson 4 on language models. I am trying to load a pretrained model for another language from the Model Zoo, in particular the Spanish one, which links to these Google files.
I load my data as usual:
from fastai.text import *  # fastai v1 text API (TextList, language_model_learner, AWD_LSTM)

data_lm = (TextList.from_csv(path, 'tweets.csv', cols='tweet')
           # Inputs: the 'tweet' column of tweets.csv, located in path
           .split_by_rand_pct(0.1)
           # Randomly split and keep 10% of the tweets for validation
           .label_for_lm()
           # We want to do a language model so we label accordingly
           .databunch(bs=bs))
data_lm.show_batch()
and I get this, which seems good:
|idx|text|
|---|---|
|0 |sobre lo que el xxup xxunk sabía de los terroristas de xxmaj las xxmaj xxunk revela cómo funciona el poder en xxmaj españa . y sirve para entender ciertos vetos para que nada cambie . ¿ xxmaj por qué xxup pp , xxup psoe y xxmaj cs xxunk que el xxmaj congreso xxunk ? ¿ xxmaj por qué xxunk hoy ? pic.twitter.com / xxunk xxbos xxmaj si estás inscrito en|
|1 |la mesa nos acompañaron @g_pisarello , @guillemmartnez , xxunk , @jdsato , @m_corrales _ y @tableroglobal . https : / / www.youtube.com / watch?v = xxunk … xxbos xxmaj disfruta de tu xxunk @pnique que te va a xxunk poco pic.twitter.com / xxunk xxbos xxmaj las cifras del paro van xxunk , pero la precariedad laboral continúa siendo muy preocupante en xxmaj españa . xxmaj seguiremos trabajando para|
|2 |se puede . xxmaj así lo he dicho en xxmaj valladolid pic.twitter.com / xxunk xxbos xxmaj hay tres posibilidades . xxmaj un acuerdo entre las tres derechas . xxmaj un acuerdo entre xxmaj cs y xxup psoe , que xxunk un xxmaj gobierno de derechas . y un xxmaj gobierno progresista al servicio de la gente , que defienda y xxunk los derechos sociales de todos y todas|
Now I create the learner, and I want to use the pretrained Spanish model, but I don't know how.
If I use:
learn = language_model_learner(data_lm,AWD_LSTM)
It works, but I'm pretty sure it's loading the default English model pretrained on WikiText-103.
If I use this:
learn = language_model_learner(data_lm,AWD_LSTM,pretrained_fnames='models/')
it fails:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-83-88303e2cf6e2> in <module>
----> 1 learn = language_model_learner(data_lm,AWD_LSTM,pretrained_fnames='models/')
/opt/anaconda3/lib/python3.7/site-packages/fastai/text/learner.py in language_model_learner(data, arch, config, drop_mult, pretrained, pretrained_fnames, **learn_kwargs)
215 model_path = untar_data(meta[url] , data=False)
216 fnames = [list(model_path.glob(f'*.{ext}'))[0] for ext in ['pth', 'pkl']]
--> 217 learn.load_pretrained(*fnames)
218 learn.freeze()
219 return learn
/opt/anaconda3/lib/python3.7/site-packages/fastai/text/learner.py in load_pretrained(self, wgts_fname, itos_fname, strict)
72 def load_pretrained(self, wgts_fname:str, itos_fname:str, strict:bool=True):
73 "Load a pretrained model and adapts it to the data vocabulary."
---> 74 old_itos = pickle.load(open(itos_fname, 'rb'))
75 old_stoi = {v:k for k,v in enumerate(old_itos)}
76 wgts = torch.load(wgts_fname, map_location=lambda storage, loc: storage)
FileNotFoundError: [Errno 2] No such file or directory: '~/tweets/models/o.pkl'
but the files are there:
jupyter@my-fastai-instance:~/tweets/models$ ls
itos_pretrained.pkl model-30k-vocab-noqrnn.pth model-eswiki-30k-vocab.pth
(I've also tried renaming the vocabulary file to o.pkl, since that is the name the error asks for.)
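A possible explanation for the odd `o.pkl` name, as a minimal sketch rather than the actual fastai source: if the loader pairs each entry of `pretrained_fnames` with the extensions `pth` and `pkl`, then passing the string `'models/'` makes it iterate over the characters `m` and `o`. The helper `build_pretrained_paths` below is hypothetical and only mimics that assumed behaviour in plain Python; the choice of `model-eswiki-30k-vocab` as the weights file is also an assumption, since the folder contains two `.pth` files.

```python
from pathlib import Path

def build_pretrained_paths(model_dir, pretrained_fnames):
    """Hypothetical helper: pair each given name with 'pth' and 'pkl' (assumed loader behaviour)."""
    return [Path(model_dir) / f"{fn}.{ext}"
            for fn, ext in zip(pretrained_fnames, ["pth", "pkl"])]

# A plain string is iterated character by character, so only 'm' and 'o' are used.
wrong = build_pretrained_paths("models", "models/")
print([p.name for p in wrong])   # ['m.pth', 'o.pkl']

# A list of two names without extensions: weights file first, vocabulary (itos) file second.
right = build_pretrained_paths("models", ["model-eswiki-30k-vocab", "itos_pretrained"])
print([p.name for p in right])   # ['model-eswiki-30k-vocab.pth', 'itos_pretrained.pkl']
```

If that assumption holds, the call would presumably need a list of two file names without extensions, for example `pretrained_fnames=['model-eswiki-30k-vocab', 'itos_pretrained']`, with both files in the learner's `models` folder.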
How do you load a pretrained model, and how can you check that it was actually loaded?
Thanks!