Hello everyone,
I’ve been experimenting with fastai for text applications and want to try training a language model from scratch on my own tokenized text. I have an LMDataLoader that I instantiated from my list of numericalized texts, but when I try to create a Learner from it I get the following error:
learn = language_model_learner(dl, AWD_LSTM)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/fastai/text/learner.py", line 194, in language_model_learner
    vocab = _get_text_vocab(dls)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/fastai/text/learner.py", line 186, in _get_text_vocab
    vocab = dls.vocab
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/fastcore/basics.py", line 378, in __getattr__
    if attr is not None: return getattr(attr,k)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/fastcore/basics.py", line 378, in __getattr__
    if attr is not None: return getattr(attr,k)
AttributeError: 'list' object has no attribute 'vocab'
Here is the code that produces it:
from fastai.text.all import *

# "words" is a list of numericalized texts, e.g.
# [[287, 85, 66, 1, 287, 36...], [ 72, 287, 152, 46, 6...],...]
bs, sl = 4, 50
ints = L(*words).map(tensor)  # one tensor per text
dl = LMDataLoader(ints, bs=bs, seq_len=sl, shuffle=True)
learn = language_model_learner(dl, AWD_LSTM)  # error occurs on creation
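From the two repeated `__getattr__` frames in the traceback, my guess is that the `vocab` lookup is being forwarded from the loader to its data and then down to the plain Python list underneath, which has no such attribute. Below is a minimal pure-Python sketch of that kind of delegation. It is not fastai or fastcore code: `Delegating` and `FakeLoader` are hypothetical names I made up for illustration, and the forwarding logic is only an assumption modelled on the line shown in the traceback.

```python
class Delegating:
    """Hypothetical stand-in: forwards unknown attributes to a wrapped object."""
    _default = "items"  # name of the attribute that holds the wrapped object

    def __init__(self, items):
        self.items = items

    def __getattr__(self, k):
        # Python calls this only after normal attribute lookup has failed.
        if k.startswith("_") or k == self._default:
            raise AttributeError(k)
        attr = getattr(self, self._default, None)
        if attr is not None:
            return getattr(attr, k)  # same shape as the line in the traceback
        raise AttributeError(k)


class FakeLoader(Delegating):
    """Hypothetical loader that forwards unknown attributes to its dataset."""
    _default = "dataset"

    def __init__(self, dataset):
        self.dataset = dataset


# A wrapped list of numericalized texts, with no vocabulary attached anywhere.
texts = Delegating([[287, 85, 66], [72, 287, 152]])
loader = FakeLoader(texts)

try:
    loader.vocab  # loader -> texts -> plain list
except AttributeError as e:
    message = str(e)

print(message)
```

If that reading is right, the message names `list` because the lookup only fails at the innermost object, which would mean nothing in my pipeline carries a `vocab` at all.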
Do I need to manually pass a vocabulary to a Learner or DataLoader? Apologies if I’ve missed something obvious; I’m still grokking much of this library as I go along with the course.