I finally found the solution. It turns out that the ‘path’ keyword argument of the load_data() function cannot be changed, so I need to run it as shown below. I am also including the context of all my scripts in this section in case other people need it.
For anyone finding this thread down the road, “fname=” didn’t seem to work for me anymore, but just passing the filename as the second argument (after path) should still work.
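To make the positional call concrete, here is a minimal sketch, assuming fastai v1 and a DataBunch that was saved earlier with data.save() under the default name data_save.pkl. The wrapper name load_saved_databunch is hypothetical; load_data in fastai/basic_data.py is the function that appears in the tracebacks in this thread.

```python
def load_saved_databunch(path, filename="data_save.pkl"):
    """Load a DataBunch that was previously written with data.save() (fastai v1)."""
    # Imported lazily so this sketch can be defined even where fastai is not installed.
    from fastai.basic_data import load_data

    # The file name is passed positionally (second argument, after path), so the
    # call does not depend on what the installed release names that parameter.
    return load_data(path, filename)
```

Usage would look like data = load_saved_databunch(path), where path is the same folder that data.save() wrote to.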
Tip for others who may be running into this. You do need to check that your fastai version is up to date, e.g. use pip list | grep fastai to list the version you have installed. In my case it was outdated, and I had to run pip install --upgrade fastai to get the latest version.
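If you would rather check the version from inside a script than from the shell, something like the following should work. It is a small sketch using only the standard library (importlib.metadata, Python 3.8+); the helper name installed_version is made up for illustration.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed fastai version, or None when fastai is missing.
print("fastai version:", installed_version("fastai"))
```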
If the load() method of TextDataBunch is deprecated, and TextClasDataBunch is a subclass of TextDataBunch, then TextClasDataBunch.load(path) should also be deprecated, and that turns out to be the case.
If I run TextClasDataBunch.load(path), it also raises the error “FileNotFoundError: [Errno 2] No such file or directory: ‘/root/.fastai/data/imdb_sample/tmp/itos.pkl’”.
Is there a substitute for “TextClasDataBunch.load(path)”?
I’m using fastai version 1.0.60 and I’m still getting the same itos.pkl FileNotFoundError. Here’s the complete error message:
$ python train.py
This is a extremely well-made film. The acting, script and camera-work are all first-rate....
Traceback (most recent call last):
  File "train.py", line 17, in <module>
    data = TextClasDataBunch.load(path)
  File "/home/cosimo/src/fastai/fastai/.venv/lib/python3.7/site-packages/fastai/text/data.py", line 170, in load
    vocab = Vocab(pickle.load(open(cache_path/'itos.pkl','rb')))
FileNotFoundError: [Errno 2] No such file or directory: '/home/cosimo/.fastai/data/imdb_sample/tmp/itos.pkl'
and here’s the code I’m running:
$ cat train.py
#!/usr/bin/env python
from fastai import *
from fastai.text import *
path = untar_data(URLs.IMDB_SAMPLE)
path.ls()
df = pd.read_csv(path/'texts.csv')
df.head()
print(df['text'][1])
# I already saved the model, so no need to do that again
#data_lm = TextClasDataBunch.from_csv(path, 'texts.csv')
#data_lm.save()
# This line fails because the int-to-string vocabulary (itos.pkl) does not exist on disk
data = TextClasDataBunch.load(path)
data.show_batch()
I tried to track down this error, and I believe it is caused by a bunch of files missing from the imdb-sample dataset (?) or I need to download another dataset which is the actual pre-trained language model (?).
I see files in courses/dl2/imdb_scripts/* that look like scripts used to generate the pre-trained model. I tried to use them, but they in turn require other files that I haven’t found how to generate.
Will try to figure this out further, but it might be easier to just grab the “pre-trained model” files from some notebook directory, if I understood correctly.
Meanwhile, if anybody has ideas here, by all means let me know. Thanks!
I have created a custom DataBunch, which I am trying to load using load_data, but I am getting an AttributeError:
  File "/home/views.py", line 641, in get
    path, r"/home/data_save.pkl")
  File "/usr/local/lib/python3.7/site-packages/fastai/basic_data.py", line 281, in load_data
    ll = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 702, in _legacy_load
    result = unpickler.load()
AttributeError: Can't get attribute 'RobertaTextList' on <module '__main__' from 'manage.py'>
RobertaTextList has been defined in the program, but I am still getting the error.
Maybe I have to define this class, or import it, in the context where I’m loading the DataBunch, but I don’t know how.
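A note on the mechanism, in case it helps: pickle stores a class by its module and name, not its code. The message says it looked for RobertaTextList in __main__, which at load time is manage.py, so the class was most likely defined in the script that was run directly when the DataBunch was saved. The sketch below reproduces the same kind of AttributeError with the standard library only. The module name custom_lists and the placeholder class are hypothetical stand-ins, not the poster's real code.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module that defined the class at save time.
saving_module = types.ModuleType("custom_lists")
sys.modules["custom_lists"] = saving_module


class RobertaTextList(list):
    """Placeholder for the custom item list (the real one would be a fastai class)."""


# Make the class look as if it had been defined inside custom_lists.
RobertaTextList.__module__ = "custom_lists"
RobertaTextList.__qualname__ = "RobertaTextList"
saving_module.RobertaTextList = RobertaTextList

# The pickle records only "custom_lists.RobertaTextList" plus the instance data.
payload = pickle.dumps(RobertaTextList(["a", "b"]))

# Simulate a loading process in which the module exists but lacks the class.
del saving_module.RobertaTextList
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = str(exc)
print("without the class:", load_error)

# Once the class is reachable under the recorded module and name, loading works.
saving_module.RobertaTextList = RobertaTextList
restored = pickle.loads(payload)
print("with the class:", list(restored))
```

In practice this usually means moving the class into an importable module and importing it before calling load_data. A pickle that was already saved from a script will still point at __main__, so it may need to be saved again after the move. I have not verified this against the project above.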
Quick tip: I went to the terminal and checked the files and folders in the path ‘/Users/kopal/.fastai/data/imdb_sample/’, where I found ‘data_save.pkl’ and ‘texts.csv’. load_data worked with the .pkl file and I was good to go.
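The same check can be done from Python before deciding how to load. This is only a sketch: the helper name find_saved_databunch is made up, and the two file names are the ones that show up in this thread (data_save.pkl for load_data, tmp/itos.pkl for the older load() that raises the FileNotFoundError).

```python
from pathlib import Path


def find_saved_databunch(data_dir):
    """Report which saved-DataBunch file, if any, exists under data_dir."""
    data_dir = Path(data_dir)
    if (data_dir / "data_save.pkl").is_file():
        # The file that load_data reads, as found in the tip above.
        return "data_save.pkl"
    if (data_dir / "tmp" / "itos.pkl").is_file():
        # The vocabulary file the deprecated .load() tries to open.
        return "tmp/itos.pkl"
    return None


# Example: inspect the default IMDB sample folder (it may not exist on your machine).
print(find_saved_databunch(Path.home() / ".fastai" / "data" / "imdb_sample"))
```

If it returns None, nothing has been saved in that folder yet, so the DataBunch has to be created and saved first.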