What is itos in Language Models

Noob question here. What is itos in fastai language models, and how can I generate it from a language model that I trained?

Check this:



Thank you

Hi, I didn't get the answer. Can you share the exact code to save the itos of a language model?

I ran the following code:
pickle.dump(data.vocab.itos, open(PATH, 'wb'))

Then I loaded it this way:
learn = language_model_learner(data_lm, AWD_LSTM, pretrained_fnames=('model_path.pth', 'itos_path'), pretrained=False, drop_mult=0.3)

I got the following error:

TypeError Traceback (most recent call last)
1 # Define language model learner
----> 2 learn = language_model_learner(data_lm, AWD_LSTM, pretrained_fnames=('fine_tuned_LM_Cap_EA_AD_OI.pth', 'itos'), pretrained=False, drop_mult=0.3)

/home/AIX_Common/files/opt/anaconda2/envs/ai-gpu/lib/python3.6/site-packages/fastai/text/learner.py in language_model_learner(data, arch, config, drop_mult, pretrained, pretrained_fnames, **learn_kwargs)
215 if pretrained_fnames is not None:
216 fnames = [learn.path/learn.model_dir/f'{fn}.{ext}' for fn,ext in zip(pretrained_fnames, ['pth', 'pkl'])]
--> 217 learn.load_pretrained(*fnames)
218 learn.freeze()
219 return learn

/home/AIX_Common/files/opt/anaconda2/envs/ai-gpu/lib/python3.6/site-packages/fastai/text/learner.py in load_pretrained(self, wgts_fname, itos_fname, strict)
73 "Load a pretrained model and adapts it to the data vocabulary."
74 old_itos = pickle.load(open(itos_fname, 'rb'))
---> 75 old_stoi = {v:k for k,v in enumerate(old_itos)}
76 wgts = torch.load(wgts_fname, map_location=lambda storage, loc: storage)
77 if 'model' in wgts: wgts = wgts['model']

TypeError: 'int' object is not iterable

Please let me know where I went wrong…

itos (short for "int to string") is just the list of all tokens in the vocabulary of a TextDataBunch, ordered so that itos[i] is the token with index i. If you want to load a pretrained model, here is an example:

learn_lm = language_model_learner(data_lm, AWD_LSTM, pretrained=False)
learn_lm.load_pretrained(wgts_fname=pretrained_lm_file, itos_fname=pickled_itos_file_name)

But if you want to create a classifier based on the language model, go with:

data_class = TextClasDataBunch.from_df(..., vocab=data_lm.vocab)
class_learner = text_classifier_learner(data_class, AWD_LSTM)
class_learner.load_encoder('path_to_lm_encoder') # you can save encoder with 'learn_lm.save_encoder'
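To make the itos idea concrete, here is a minimal sketch in plain Python (no fastai needed). The toy vocabulary and the helper names save_itos and load_itos are hypothetical, made up for illustration only. It shows the list, the reverse stoi mapping built the same way as on line 75 of the traceback above, and a pickle round trip with a type check. That traceback suggests pickle.load returned an int rather than a list, so a check like this may help catch a wrong file early.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical toy vocabulary; with fastai v1 the real list would be data.vocab.itos.
itos = ["xxunk", "xxpad", "the", "cat", "sat"]

# Reverse mapping (token -> index), built like old_stoi in the traceback.
stoi = {tok: idx for idx, tok in enumerate(itos)}


def save_itos(tokens, path):
    """Pickle the vocabulary as a plain list (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(list(tokens), f)


def load_itos(path):
    """Unpickle the vocabulary and fail early if it is not a sequence of tokens."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    if not isinstance(obj, (list, tuple)):
        raise TypeError(f"expected a list of tokens, got {type(obj).__name__}")
    return list(obj)


# Round trip through a temporary file so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    itos_path = Path(tmp) / "itos.pkl"
    save_itos(itos, itos_path)
    restored = load_itos(itos_path)

# Encode tokens to ids with stoi, then decode back with the restored itos.
encoded = [stoi[tok] for tok in ["the", "cat", "sat"]]
decoded = [restored[i] for i in encoded]
```

Also note, going by line 216 of the traceback, that language_model_learner builds the file names itself as learn.path/learn.model_dir/f'{fn}.{ext}' with the extensions 'pth' and 'pkl'. So the names in pretrained_fnames should probably be given without an extension, and both files should sit in the learner's model directory.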

Let me know if it helped.


Thanks! Got it…

When I try to use this with a pretrained model, I get this error:
'SequentialRNN' object has no attribute 'get'