Error Intro Chapter 1

when I run this code in https://n5kk1g6b.gradient.paperspace.com/notebooks/course-v4/nbs/01_intro.ipynb

from fastai.text.all import *

dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
learn.fine_tune(4, 1e-2)

I get

FileNotFoundError: [Errno 2] No such file or directory: '/storage/data/imdb_tok/counter.pkl'

Has anyone seen this problem? I checked the forum; there is a similar question, but I did not see a response. I am using the Gradient Free-P5000.

Complete error

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-3-5ab79cd5e866> in <module>
      1 from fastai.text.all import *
      2 
----> 3 dls = TextDataLoaders.from_folder(untar_data(URLs.IMDB), valid='test')
      4 learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
      5 learn.fine_tune(4, 1e-2)

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/text/data.py in from_folder(cls, path, train, valid, valid_pct, seed, vocab, text_vocab, is_lm, tok_tfm, seq_len, backwards, **kwargs)
    222         "Create from imagenet style dataset in `path` with `train` and `valid` subfolders (or provide `valid_pct`)"
    223         splitter = GrandparentSplitter(train_name=train, valid_name=valid) if valid_pct is None else RandomSplitter(valid_pct, seed=seed)
--> 224         blocks = [TextBlock.from_folder(path, text_vocab, is_lm, seq_len, backwards) if tok_tfm is None else TextBlock(tok_tfm, text_vocab, is_lm, seq_len, backwards)]
    225         if not is_lm: blocks.append(CategoryBlock(vocab=vocab))
    226         get_items = partial(get_text_files, folders=[train,valid]) if valid_pct is None else get_text_files

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/text/data.py in from_folder(cls, path, vocab, is_lm, seq_len, backwards, min_freq, max_vocab, **kwargs)
    210     def from_folder(cls, path, vocab=None, is_lm=False, seq_len=72, backwards=False, min_freq=3, max_vocab=60000, **kwargs):
    211         "Build a `TextBlock` from a `path`"
--> 212         return cls(Tokenizer.from_folder(path, **kwargs), vocab=vocab, is_lm=is_lm, seq_len=seq_len,
    213                    backwards=backwards, min_freq=min_freq, max_vocab=max_vocab)
    214 

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastai/text/core.py in from_folder(cls, path, tok, rules, **kwargs)
    276         if tok is None: tok = WordTokenizer()
    277         output_dir = tokenize_folder(path, tok=tok, rules=rules, **kwargs)
--> 278         res = cls(tok, counter=(output_dir/fn_counter_pkl).load(),
    279                   lengths=(output_dir/fn_lengths_pkl).load(), rules=rules, mode='folder')
    280         res.path,res.output_dir = path,output_dir

/opt/conda/envs/fastai/lib/python3.8/site-packages/fastcore/utils.py in load(fn)
    522 def load(fn:Path):
    523     "Load a pickle file from a file name or opened file"
--> 524     if not isinstance(fn, io.IOBase): fn = open(fn,'rb')
    525     try: return pickle.load(fn)
    526     finally: fn.close()

FileNotFoundError: [Errno 2] No such file or directory: '/storage/data/imdb_tok/counter.pkl'

Try assigning the path on a separate line and inspecting it:

path = untar_data(URLs.IMDB)

Then run commands like

path.ls()

to check where the data is stored.
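Building on that suggestion, here is a minimal sketch (plain pathlib, no fastai needed) of checking whether the tokenized output looks complete before training. The `<name>_tok` sibling-folder layout and `counter.pkl` marker file are assumptions taken from the traceback above:

```python
from pathlib import Path

def tok_output_ok(source: Path) -> bool:
    """Return True if the sibling `<name>_tok` folder looks complete.

    The tokenizer appears to write its output next to the source folder
    and to expect `counter.pkl` inside it; an interrupted run can leave
    the folder without that file (assumption based on the traceback).
    """
    tok_dir = source.parent / f"{source.name}_tok"
    return (tok_dir / "counter.pkl").exists()
```

For example, `tok_output_ok(untar_data(URLs.IMDB))` returning False after a previous run would suggest a stale `imdb_tok` folder.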


Did you figure out what caused the error? I am getting exactly the same one, also on different machines.

I tried what @SamJoel proposed: running path.ls() returns a list of 7 Path objects.

Yes, I did get it working by going into the terminal in Paperspace and deleting the folder imdb_tok completely.


Worked a treat. Thanks!!!

The issue was this: I interrupted the cell while it was running the first time, so the imdb_tok folder had already been created.

Hence, to run the cell again you have to delete the imdb_tok folder (or just rename it) to get things running.

You should also be able to do: path = untar_data(URLs.IMDB, force_download=True)

I split the cells: running
source = untar_data(URLs.IMDB)
then
print(source)
Output: /storage/data/imdb
Then
from fastai.text.all import *
dls = TextDataLoaders.from_folder(source, valid='test', bs=32)
fails with the same error as Rajeev's:
FileNotFoundError: [Errno 2] No such file or directory: '/storage/data/imdb_tok/counter.pkl'
However, running
source.ls()
gives
(#7) [Path('/storage/data/imdb/README'),Path('/storage/data/imdb/imdb.vocab'),Path('/storage/data/imdb/tmp_lm'),Path('/storage/data/imdb/tmp_clas'),Path('/storage/data/imdb/unsup'),Path('/storage/data/imdb/test'),Path('/storage/data/imdb/train')]
so there is no imdb_tok directory.
By the way, how can I get terminal in the paperspace?
Thanks in advance.

@yurirzhanov here is the screenshot

Thank you. But I work on Paperspace, and there is no such dropdown menu. In any case, I do not have the 'imdb_tok' directory that you deleted, only 'imdb'.

@yurirzhanov - Can you share the steps for how you open this particular notebook in Paperspace? I am using the same platform. If I know the steps you follow to access your notebook, then maybe I can help.

Here is the sequence of actions:

Log in.
Choose Gradient (not Core)
Choose Jupyter: Run a sample notebook
Choose Paperspace + Fast.ai
Choose fastbook
Open 01_intro
Run
!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()

Run
from fastbook import *

Run
#id first training
Runs fine

Run uploader/custom classification
Runs correctly

Run CAMVID_TINY
Runs fine

Run
print(URLs.IMDB)
Output
https://s3.amazonaws.com/fast-ai-nlp/imdb.tgz

Run
source = untar_data(URLs.IMDB)
print(source)
Output
/storage/data/imdb

Run
from fastai.text.all import *
dls = TextDataLoaders.from_folder(source, valid='test', bs=32)
Error
FileNotFoundError: [Errno 2] No such file or directory: '/storage/data/imdb_tok/counter.pkl'

Run
source.ls()
Output
(#7) [Path('/storage/data/imdb/README'),Path('/storage/data/imdb/imdb.vocab'),Path('/storage/data/imdb/tmp_lm'),Path('/storage/data/imdb/tmp_clas'),Path('/storage/data/imdb/unsup'),Path('/storage/data/imdb/test'),Path('/storage/data/imdb/train')]

Here are some screenshots.

  1. Login and click Gradient as you have done. That should bring you to the following screen. Click on the start button

  2. Once the running sign (in green) is on, click on Open V2 beta

  3. That should bring you to the following screen

Now you can follow the instructions I provided earlier to open the terminal by clicking on the ‘New’ dropdown on top right below the log out button

I hope this helps.

Thank you so much. It worked (sort of - I ran out of CUDA memory, but that's a different and understandable matter). But just imagine how frustrating it was for me to start repeating the exercises and get stuck in the intro for an unknown reason...

No problem at all. Glad I could help. I understand the frustration. Hopefully others can find and use this post to get unstuck.