TypeError: language_model_learner() missing 1 required positional argument: 'arch'

Hi,

I am new to this forum and to DL. I am trying to run a classification problem using the code from Lesson 4, and I get the error below. I know it has something to do with the data in the files, but I do not know how to handle it up front:

UnicodeDecodeError Traceback (most recent call last)
in ()
4 .filter_by_folder(include=['train', 'test'])
5 #We may have other temp folders that contain text files so we only keep what's in train and test
----> 6 .random_split_by_pct(0.1)
7 #We randomly split and keep 10% (10,000 reviews) for validation
8 .label_for_lm()

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in _inner(*args, **kwargs)
423 self.valid = fv(*args, **kwargs)
424 self.__class__ = LabelLists
--> 425 self.process()
426 return self
427 return _inner

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self)
470 "Process the inner datasets."
471 xp,yp = self.get_processors()
--> 472 for ds,n in zip(self.lists, ['train','valid','test']): ds.process(xp, yp, name=n)
473 #progress_bar clear the outputs so in some case warnings issued during processing disappear.
474 for ds in self.lists:

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, xp, yp, name)
625 p.warns = []
626 self.x,self.y = self.x[~filt],self.y[~filt]
--> 627 self.x.process(xp)
628 return self
629

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, processor)
66 if processor is not None: self.processor = processor
67 self.processor = listify(self.processor)
---> 68 for p in self.processor: p.process(self)
69 return self
70

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in process(self, ds)
36 def __init__(self, ds:Collection=None): self.ref_ds = ds
37 def process_one(self, item:Any): return item
---> 38 def process(self, ds:Collection): ds.items = array([self.process_one(item) for item in ds.items])
39
40 class ItemList():

/usr/local/lib/python3.6/dist-packages/fastai/data_block.py in <listcomp>(.0)
36 def __init__(self, ds:Collection=None): self.ref_ds = ds
37 def process_one(self, item:Any): return item
---> 38 def process(self, ds:Collection): ds.items = array([self.process_one(item) for item in ds.items])
39
40 class ItemList():

/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in process_one(self, item)
301 "PreProcessor that opens the filenames and read the texts."
302 def process_one(self,item):
--> 303 return open_text(item) if isinstance(item, Path) else item
304
305 class TextList(ItemList):

/usr/local/lib/python3.6/dist-packages/fastai/text/data.py in open_text(fn, enc)
266 def open_text(fn:PathOrStr, enc='utf-8'):
267 "Read the text in fn."
--> 268 with open(fn,'r', encoding = enc) as f: return ''.join(f.readlines())
269
270 class Text(ItemBase):

/usr/lib/python3.6/codecs.py in decode(self, input, final)
319 # decode input (taking the buffer into account)
320 data = self.buffer + input
--> 321 (result, consumed) = self._buffer_decode(data, self.errors, final)
322 # keep undecoded input until the next call
323 self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xae in position 185: invalid start byte

Hi jainayush007,
I guess the file isn't encoded as UTF-8, which is the encoding the traceback shows being used to open it. Jeremy tackles this problem in lesson 5.
You can specify the encoding like this:

movies = pd.read_csv(path/'u.item', delimiter='|', encoding='latin-1', header=None,
                     names=[item, 'title', 'date', 'N', 'url', *[f'g{i}' for i in range(19)]])

I hope I could help. This is also my first post :wink:
Best,

C0LIN
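When the texts are read straight from folders rather than from a CSV, it can help to find out first which files are not valid UTF-8. The following is a minimal sketch using only the standard library; the helper name `find_non_utf8_files` and the `*.txt` pattern are assumptions for illustration and are not part of fastai.

```python
from pathlib import Path

def find_non_utf8_files(root, pattern="*.txt"):
    """Return the files under `root` (searched recursively) that fail to decode as UTF-8."""
    bad = []
    for fn in sorted(Path(root).rglob(pattern)):
        try:
            fn.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(fn)
    return bad

# Byte 0xae from the traceback is not a valid UTF-8 start byte,
# but in latin-1 it decodes to the registered-trademark sign.
registered_sign = b"\xae".decode("latin-1")
```

Calling `find_non_utf8_files(path)` on the data folder would list the files worth inspecting before building the databunch.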

Hi Colin,

Thanks for the response. I didn't intend to read from a CSV. The given code picks up the text files directly from the folders and puts them into the databunch. Is there a way to include encoding='latin-1' in the TextList syntax itself?

Regards.

And another error:

learn = language_model_learner(data_lm, pretrained_model=URLs.WT103, drop_mult=0.3)

TypeError Traceback (most recent call last)
<ipython-input-26-053ea4e9872b> in <module>()
----> 1 learn = language_model_learner(data_lm, pretrained_model=URLs.WT103, drop_mult=0.3)

TypeError: language_model_learner() missing 1 required positional argument: 'arch'

Any suggestions please?

@jeremy - Can you please help?

encoding='latin-1' should work with from_folder as well.
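Whether from_folder accepts an encoding argument may depend on the fastai version installed. One workaround that avoids the question is to convert the offending files to UTF-8 once, before building the databunch. The sketch below uses only the standard library; the helper name `reencode_to_utf8`, the `*.txt` pattern, and the guess that the files are latin-1 are all assumptions.

```python
from pathlib import Path

def reencode_to_utf8(root, source_enc="latin-1", pattern="*.txt"):
    """Rewrite files under `root` that are not valid UTF-8, assuming they are in `source_enc`."""
    converted = []
    for fn in sorted(Path(root).rglob(pattern)):
        raw = fn.read_bytes()
        try:
            raw.decode("utf-8")  # already valid UTF-8: leave the file untouched
        except UnicodeDecodeError:
            # Decode with the assumed source encoding and write back as UTF-8.
            fn.write_bytes(raw.decode(source_enc).encode("utf-8"))
            converted.append(fn)
    return converted
```

This rewrites files in place, so it is safer to run on a copy of the data. Decoding as latin-1 never raises an error, so a wrong guess about the source encoding would silently produce wrong characters instead of failing.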

Hi, I am facing the same error:
language_model_learner() missing 1 required positional argument: 'arch'
I was able to run the code before with no errors, but when I ran the same code this morning I got the above error. Please help.

I assume you have updated fastai since you last ran it successfully.

The API for language_model_learner has changed. Check the docs but you will need something like:

learn = language_model_learner(data_lm, arch=AWD_LSTM, drop_mult=0.3)

Also, if you take a look at the source you will see that the AWD_LSTM model loads pre-trained weights from URLs.WT103_1 as default.
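To illustrate why the old call fails, here is a toy stand-in with a simplified signature in which `arch` is a required second parameter. It is not fastai's code, and the string arguments are placeholders; it only reproduces the Python behaviour behind the error message.

```python
def language_model_learner(data, arch, drop_mult=1.0, **kwargs):
    """Toy stand-in (not fastai's implementation): `arch` is required, extra keywords are accepted."""
    return {"data": data, "arch": arch, "drop_mult": drop_mult, **kwargs}

try:
    # Old-style call: `arch` is missing, so Python raises before the function body runs.
    language_model_learner("data_lm", pretrained_model="WT103", drop_mult=0.3)
except TypeError as err:
    message = str(err)
    print(message)

# New-style call: `arch` is supplied, so the call succeeds.
learner = language_model_learner("data_lm", arch="AWD_LSTM", drop_mult=0.3)
```

The `pretrained_model` keyword is swallowed by `**kwargs` in this sketch, which is why the only complaint is about the missing `arch`.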


Thanks a lot. This works.

Thank you @jbuzza!

@jbuzza, it looks like arch is now required for text_classifier_learner as well. Would you suggest using AWD_LSTM there too? I'll add that I'm seeing higher accuracy on the language_model_learner but lower accuracy (about 3% lower) on the text_classifier_learner (where arch = AWD_LSTM) on the dataset I've used in the past.

Yes, something like this is required:

learn = text_classifier_learner(data, AWD_LSTM, drop_mult=0.5)

I don't know why you are seeing different accuracy, but I suggest you make sure you have the very latest fastai release, v1.0.45, as there were some issues when this change was first introduced.


‘1.0.45’ is the version I’m running.