Part 1 NLP

Hello All,

I’m trying to build an NLP model in Google Colab using fastai, but I’m getting an error while building a text classifier.

Below is the error description:
TypeError Traceback (most recent call last)
in ()
1
----> 2 learn_c = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5, metrics=[accuracy,f1]).to_fp16()
3 learn_c.load_encoder(f'{lang}fine_tuned_enc')
4 learn_c.freeze()

9 frames
/usr/lib/python3.6/pathlib.py in _parse_args(cls, args)
638 parts += a._parts
639 else:
--> 640 a = os.fspath(a)
641 if isinstance(a, str):
642 # Force-cast str subclasses to str (issue #21127)

TypeError: expected str, bytes or os.PathLike object, not NoneType

learn_c = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5, metrics=[accuracy,f1]).to_fp16()
learn_c.load_encoder(f'{lang}fine_tuned_enc')
learn_c.freeze()

It would be a great help if anyone could help me solve this error.

Regards,
Rahul Ramchandra Uppari.

It’s most likely an issue with your data. Check what data_clas contains. Is it a DataBunch (which it should be), or something else? Is it empty (None)?
If there is an issue, check your data path and whether you’ve imported the data properly.
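
One quick way to run that check is sketched below. The helper `describe` is a hypothetical name, not part of fastai; it only reports whether an object is None, its type, and its length if it defines one. A plain list stands in for data_clas so the snippet runs on its own.

```python
def describe(obj):
    """Return a short summary of obj: None-ness, type name, and length if defined."""
    if obj is None:
        return "object is None"
    summary = f"type={type(obj).__name__}"
    # Many containers define __len__; skip the length when they do not.
    if hasattr(obj, "__len__"):
        summary += f", len={len(obj)}"
    return summary

# In the notebook you would call describe(data_clas); a list stands in here.
print(describe(None))
print(describe([1, 2, 3]))
```

If the summary shows None or an unexpected type, the problem is in how the data was loaded rather than in the learner.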

Many thanks Palaash for your help,

I just added pretrained=False and it worked, but I will still look into my data_clas and try to get it working with pretrained=True.

Once again, thank you for your help.

Oh, is that so? The pretrained value has nothing to do with the data.
Data and architecture are two completely different things, so in that case the issue may be somewhere else. Can you please show us the complete error (all 9 frames, that is)?

Hello Palaash,

Below are the error details:

TypeError Traceback (most recent call last)
in ()
1
----> 2 learn_c = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5,metrics=[accuracy,f1]).to_fp32()
3 learn_c.load_encoder(f'{lang}fine_tuned_enc')
4 learn_c.freeze()

9 frames
/usr/local/lib/python3.6/dist-packages/fastai/text/learner.py in text_classifier_learner(data, arch, bptt, max_len, config, pretrained, drop_mult, lin_ftrs, ps, **learn_kwargs)
299 warn("There are no pretrained weights for that architecture yet!")
300 return learn
--> 301 model_path = untar_data(meta['url'], data=False)
302 fnames = [list(model_path.glob(f'*.{ext}'))[0] for ext in ['pth', 'pkl']]
303 learn = learn.load_pretrained(*fnames, strict=False)

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in untar_data(url, fname, dest, data, force_download, verbose)
224 def untar_data(url:str, fname:PathOrStr=None, dest:PathOrStr=None, data=True, force_download=False, verbose=False) -> Path:
225 "Download `url` to `fname` if `dest` doesn't exist, and un-tgz to folder `dest`."
--> 226 dest = url2path(url, data) if dest is None else Path(dest)/url2name(url)
227 fname = Path(ifnone(fname, _url2tgz(url, data)))
228 if force_download or (fname.exists() and url in _checks and _check_file(fname) != _checks[url]):

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in url2path(url, data, ext)
189 "Change `url` to a path."
190 name = url2name(url)
--> 191 return datapath4file(name, ext=ext, archive=False) if data else modelpath4file(name, ext=ext)
192
193 def _url2tgz(url, data=True, ext:str='.tgz'):

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in modelpath4file(filename, ext)
198 local_path = URLs.LOCAL_PATH/'models'/filename
199 if local_path.exists() or local_path.with_suffix(ext).exists(): return local_path
--> 200 else: return Config.model_path()/filename
201
202 def datapath4file(filename:str, ext:str='.tgz', archive=True):

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in model_path(cls)
162 def model_path(cls):
163 "Get the path to fastai pretrained models in the config file."
--> 164 return cls.get_path('model_path')
165
166 @classmethod

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in get_path(cls, path)
147 def get_path(cls, path):
148 "Get the path in the config file."
--> 149 return _expand_path(cls.get_key(path))
150
151 @classmethod

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in _expand_path(fpath)
182 yaml.dump(cls.DEFAULT_CONFIG, yaml_file, default_flow_style=False)
183
--> 184 def _expand_path(fpath): return Path(fpath).expanduser()
185 def url2name(url): return url.split('/')[-1]
186

/usr/lib/python3.6/pathlib.py in __new__(cls, *args, **kwargs)
999 if cls is Path:
1000 cls = WindowsPath if os.name == 'nt' else PosixPath
-> 1001 self = cls._from_parts(args, init=False)
1002 if not self._flavour.is_supported:
1003 raise NotImplementedError("cannot instantiate %r on your system"

/usr/lib/python3.6/pathlib.py in _from_parts(cls, args, init)
654 # right flavour.
655 self = object.__new__(cls)
--> 656 drv, root, parts = self._parse_args(args)
657 self._drv = drv
658 self._root = root

/usr/lib/python3.6/pathlib.py in _parse_args(cls, args)
638 parts += a._parts
639 else:
--> 640 a = os.fspath(a)
641 if isinstance(a, str):
642 # Force-cast str subclasses to str (issue #21127)

TypeError: expected str, bytes or os.PathLike object, not NoneType
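
Reading the traceback from the bottom up: `_expand_path` received `None`, so `cls.get_key('model_path')` must have returned `None`, and `Path(None)` raises the `TypeError`. The sketch below reproduces that chain with stand-in names. `get_key`, `expand_path`, and the `config` dict are hypothetical and are not fastai's actual code; the sketch assumes the config lookup behaves like a dictionary lookup that yields `None` for a missing key.

```python
from pathlib import Path

def get_key(config, key):
    # Stand-in for a config lookup: a missing key yields None.
    return config.get(key)

def expand_path(fpath):
    # Same shape as the failing line: Path(None) raises TypeError.
    return Path(fpath).expanduser()

# Hypothetical config that has a data path but no 'model_path' entry.
config = {"data_path": "~/.fastai/data"}

try:
    expand_path(get_key(config, "model_path"))
    raised = False
except TypeError:
    raised = True

print("TypeError raised:", raised)
```

If this is what is happening, the config file that fastai reads on Colab is worth inspecting for a missing model_path entry. It would also be consistent with pretrained=False working: the text_classifier_learner frame shows untar_data being called to fetch pretrained weights, and that step is presumably skipped when no pretrained weights are requested.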

I’m actually confused by the definition of untar_data myself.

def untar_data(url:str, fname:PathOrStr=None, dest:PathOrStr=None, data=True, force_download=False, verbose=False) -> Path:
    "Download `url` to `fname` if `dest` doesn't exist, and un-tgz to folder `dest`."

    dest = url2path(url, data) if dest is None else Path(dest)/url2name(url)

    fname = Path(ifnone(fname, _url2tgz(url, data)))
    if force_download or (fname.exists() and url in _checks and _check_file(fname) != _checks[url]):
        print(f"A new version of the {'dataset' if data else 'model'} is available.")
        if fname.exists(): os.remove(fname)
        if dest.exists(): shutil.rmtree(dest)
    if not dest.exists():
        fname = download_data(url, fname=fname, data=data)
        if url in _checks:
            assert _check_file(fname) == _checks[url], f"Downloaded file {fname} does not match checksum expected! Remove that file from {Config().data_archive_path()} and try your code again."
        if verbose: print('.tgz file downloaded. Extracting the contents...')
        tarfile.open(fname, 'r:gz').extractall(dest.parent)
        if verbose: print('File extracted successfully.')
    return dest

The line:
dest = url2path(url, data) if dest is None else Path(dest)/url2name(url)
should give an error when dest is not passed as an argument (so dest=None, and Path(None) raises an error). But untar_data usually works without dest being specified. I don’t know how that happens. I’ll probably start a new topic for this issue!

This line actually takes care of the case when dest is not specified. The else branch Path(dest)/url2name(url) is executed only if you provide dest (i.e. dest is not None). Otherwise (i.e. dest is None), url2path is called, which derives dest from the URL. It calls url2name and then datapath4file or modelpath4file. It’s quite confusing, as the #TODO comment suggests :slight_smile:

#TODO: simplify this mess
def url2path(url, data=True, ext:str='.tgz'):
    "Change `url` to a path."
    name = url2name(url)
    return datapath4file(name, ext=ext, archive=False) if data else modelpath4file(name, ext=ext)
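
The behaviour described above can be checked in isolation. In a conditional expression `a if cond else b`, only one branch is evaluated, so `Path(dest)` is never reached when `dest` is `None`. In this minimal sketch, `resolve_dest` and `default_dest` are hypothetical stand-ins, not fastai functions, and the URL is a placeholder.

```python
from pathlib import Path

def default_dest(url):
    # Stand-in for url2path: derive a folder from the last URL segment.
    return Path("models") / url.split("/")[-1]

def resolve_dest(url, dest=None):
    # Only the selected branch runs, so Path(dest) is skipped when dest is None.
    return default_dest(url) if dest is None else Path(dest) / url.split("/")[-1]

url = "https://example.com/files/wt103"  # placeholder URL
print(resolve_dest(url))
print(resolve_dest(url, dest="/tmp/out"))
```

So the conditional in untar_data is not where `None` reaches `Path`; the failing `Path(fpath)` call sits further down, in `_expand_path`.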

@stefan-ai
I just missed the “is None”. What a silly mistake! :stuck_out_tongue:
Thanks a lot!
Yes, it’s very confusing. I can’t seem to figure out what’s causing the error for @rahuluppari.
Can you see something that I can’t?

5 frames
/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in model_path(cls)
162 def model_path(cls):
163 "Get the path to fastai pretrained models in the config file."
--> 164 return cls.get_path('model_path')
165
166 @classmethod

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in get_path(cls, path)
147 def get_path(cls, path):
148 "Get the path in the config file."
--> 149 return _expand_path(cls.get_key(path))
150
151 @classmethod

/usr/local/lib/python3.6/dist-packages/fastai/datasets.py in _expand_path(fpath)
182 yaml.dump(cls.DEFAULT_CONFIG, yaml_file, default_flow_style=False)
183
--> 184 def _expand_path(fpath): return Path(fpath).expanduser()
185 def url2name(url): return url.split('/')[-1]
186

/usr/lib/python3.6/pathlib.py in __new__(cls, *args, **kwargs)
999 if cls is Path:
1000 cls = WindowsPath if os.name == 'nt' else PosixPath
-> 1001 self = cls._from_parts(args, init=False)
1002 if not self._flavour.is_supported:
1003 raise NotImplementedError("cannot instantiate %r on your system"

/usr/lib/python3.6/pathlib.py in _from_parts(cls, args, init)
654 # right flavour.
655 self = object.__new__(cls)
--> 656 drv, root, parts = self._parse_args(args)
657 self._drv = drv
658 self._root = root

/usr/lib/python3.6/pathlib.py in _parse_args(cls, args)
638 parts += a._parts
639 else:
--> 640 a = os.fspath(a)
641 if isinstance(a, str):
642 # Force-cast str subclasses to str (issue #21127)

TypeError: expected str, bytes or os.PathLike object, not NoneType

I also forgot to mention that I’m running this code on Google Colab; the same code works fine when I run it in a Jupyter notebook. I started working on Colab because the CUDA memory available to my Jupyter notebook is not sufficient to run the code.

I also don’t see what causes the error.

@rahuluppari: Could you share the code where you call untar_data and create data_clas?

That’s strange. When you say Jupyter, do you mean your local installation or another cloud platform? Have you tried Paperspace?