I was running the lesson3 IMDB example and the line
data_lm = TextLMDataBunch.load(path, 'tmp_lm', bs=bs)
raised the following error:
NotADirectoryError: [Errno 20] Not a directory: '/home/junlin/.fastai/data/imdb/tmp_lm/itos.pkl'.
Minimal example that reproduces this:
from fastai.text import *
path = untar_data(URLs.IMDB)
data_lm = (TextList.from_folder(path)
           #Inputs: all the text files in path
           .filter_by_folder(include=['train', 'test', 'unsup'])
           #We may have other temp folders that contain text files so we only keep what's in train and test
           .random_split_by_pct(0.1)
           #We randomly split and keep 10% (10,000 reviews) for validation
           .label_for_lm()
           #We want to do a language model so we label accordingly
           .databunch(bs=48))
data_lm.save('tmp_lm')
data_lm = TextLMDataBunch.load(path, 'tmp_lm', bs=48)
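For anyone hitting the same message: the error type itself can be reproduced without fastai. This is only a minimal sketch, assuming (it is not confirmed in this thread) that the newer save writes a single file named tmp_lm while the old TextLMDataBunch.load still looks for tmp_lm/itos.pkl inside a folder of that name. The helper name and the temporary directory are made up for illustration, and on Windows the exception type may differ.

```python
import os
import tempfile

def open_inside_file():
    """Try to open 'tmp_lm/itos.pkl' when 'tmp_lm' is a plain file, not a folder."""
    with tempfile.TemporaryDirectory() as root:
        saved = os.path.join(root, "tmp_lm")
        # Stand-in for a save step that writes one file called 'tmp_lm'.
        with open(saved, "wb") as f:
            f.write(b"placeholder")
        try:
            # Stand-in for a loader that treats 'tmp_lm' as a directory.
            with open(os.path.join(saved, "itos.pkl"), "rb"):
                pass
        except OSError as err:
            # Report which OSError subclass the operating system raised.
            return type(err).__name__
    return None

print(open_inside_file())
```

On Linux and macOS this should report the same NotADirectoryError as in the post above, which is consistent with tmp_lm existing as a file rather than a folder.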
I ran into this, too. The solution is buried in the Traceback (see line 167 above). For some reason, the
DeprecationWarning isn’t being thrown at the top of the Traceback, where it should be (and would be more visible).
from fastai import *
data_lm = load_data(path, bs=bs)
Is load_data intended to be a standalone function rather than a method of DataBunch? It seems to work just fine as a standalone, but it might be clearer as a method of DataBunch. I’d be happy to submit an Issue and/or PR on GitHub, if needed…just thought I’d ask here first.
It worked! Thank you! BTW, I also found this example in the docs showing what you suggested.
The example you linked to suggests that load_data is intended to be a standalone function of the basic_data module rather than a DataBunch method.
Had to do
data_lm = load_data(path, fname='tmp_lm', bs=bs)
with the latest version
I also had to manually download the full IMDB dataset. The URL does not include the “.tgz” extension.
I wonder why the download did not work.
It is true that the URL is missing the file extension (.tgz), but that is not a bug (they probably add it at a later point). In fact, I have worked with someone else’s trained model and to pass it to Fastai I uploaded the files to Amazon S3 just like the IMDB file you mention, and in the URL I had to omit the .tgz extension so it would work!
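To illustrate the behaviour described here, below is a rough sketch of how a stored URL without an extension could be turned into the address that is actually requested. It is not fastai's actual code: archive_url and the example.com addresses are hypothetical, and the unconditional append is an assumption based on the experience reported above.

```python
def archive_url(base_url: str, ext: str = ".tgz") -> str:
    """Build the download address by appending the archive extension.

    Hypothetical helper: it assumes the library always appends `ext`,
    which would explain why the stored URL has to omit it.
    """
    return f"{base_url}{ext}"

# A stored URL without the extension resolves to a real archive name.
print(archive_url("https://example.com/data/imdb"))

# Including the extension yourself would end up doubling it.
print(archive_url("https://example.com/models/my_model.tgz"))
```

Under that assumption, a URL that already ends in .tgz would be requested as my_model.tgz.tgz, which matches the need to leave the extension out.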