Hi @dkobran and @tomg. I have a paid Paperspace subscription and I am getting an error when running the following block of code in Notebook 10.
This looks like a question for @sgugger. /storage/data is where the public datasets consumed by your fastai notebooks live. I’m not aware of any dataset called ‘imdb_tok’, just imdb. Sylvain, can you shed some light on this?
Hi, this doesn’t appear to be a Paperspace-specific issue. My fastai notebook generated the two pkl files without issue.
:/notebooks# ll storage/data/imdb_tok/
total 8040
drwxr-xr-x 7 root root 4096 Mar 18 05:34 ./
drwxr-xr-x 3 root root 4096 May 23 20:19 ../
-rw-r--r-- 1 root root 3649850 Mar 18 05:34 counter.pkl
-rw-r--r-- 1 root root 3070555 Mar 18 05:34 lengths.pkl
drwxr-xr-x 4 root root 4096 Mar 18 05:26 test/
drwxr-xr-x 2 root root 4096 Mar 18 05:26 tmp_clas/
drwxr-xr-x 2 root root 4096 Mar 18 05:26 tmp_lm/
drwxr-xr-x 4 root root 4096 Mar 18 05:26 train/
drwxr-xr-x 2 root root 1478656 Mar 18 05:34 unsup/
To clarify, /storage/data is fully writeable; only the specific prepopulated dataset directories are not (e.g. storage/data/imdb). It seems that whatever operation you are running is failing to complete properly. For help with that, I ask that you direct your question to the fastai course proctors.
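For anyone who wants to check this on their own machine, here is a minimal sketch that tries to create a temporary file in a directory and reports whether that worked. The helper name can_write is made up for illustration, and the two paths are just the ones mentioned above; they may not exist on every machine.

```python
import tempfile
from pathlib import Path


def can_write(directory):
    """Return True if a temporary file can be created inside `directory`."""
    d = Path(directory)
    if not d.is_dir():
        return False
    try:
        # The file is deleted automatically when the context manager exits.
        with tempfile.NamedTemporaryFile(dir=d):
            pass
    except OSError:
        # Covers permission errors and read-only filesystems.
        return False
    return True


# Paths taken from the post above; adjust for your own setup.
for path in ("/storage/data", "/storage/data/imdb"):
    status = "writable" if can_write(path) else "not writable (or missing)"
    print(path, status)
```

Actually creating a file, rather than only inspecting permission bits, also catches directories that are mounted read-only.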
I’m not talking about running pip install --upgrade fastai2 in a terminal in Paperspace.
Why? Because the pip version of fastai2 is not regularly updated (which is normal).
In an Ubuntu installation, it is easy: run git pull in the fastai2 folder.
In Paperspace?
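One possibility, not verified on Paperspace: if the notebook terminal behaves like an ordinary Linux shell, the same git pull approach could be scripted from a notebook cell. This is only a sketch under assumptions: it assumes a git checkout of fastai2 already exists at some path (the /notebooks/fastai2 path in the usage comment is hypothetical) and that an editable pip install is what you want. The helper names update_commands and update_editable_install are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def update_commands(repo_dir):
    """Build the commands that refresh an editable install from a git checkout."""
    repo = str(Path(repo_dir))
    return [
        # Pull the latest commits into the existing checkout.
        ["git", "-C", repo, "pull"],
        # (Re)install the checkout in editable mode with the current interpreter.
        [sys.executable, "-m", "pip", "install", "-e", repo],
    ]


def update_editable_install(repo_dir, run=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in update_commands(repo_dir):
        run(cmd, check=True)


# Usage (hypothetical path, assumed to be an existing fastai2 checkout):
# update_editable_install("/notebooks/fastai2")
```

Whether the checkout and the install survive a notebook restart depends on where they are stored, which this sketch does not address.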
Thanks for the tip!
I had tried it before posting on the forums, but it gave me the same nginx server error.
It’s weird that there seems to be no way to run TensorBoard on Paperspace Gradient notebooks.
Meanwhile I’m switching to wandb for my logging needs!
I have noticed this as well. It doesn’t seem to make much difference whether it’s a paid or a free runtime, either.
UPDATE: I got a message back from Paperspace support, saying:
"Thanks for contacting Paperspace. It looks like you are currently running the notebook ‘REFERENCENUMBER’. Please make sure to store all of your files within the /storage folder. This alone will improve your provisioning times and decrease any chance of errors."
I haven’t fully figured out how to do this yet, or whether it’s even advised given the course’s instructions, but in case this is useful for someone else, I’m posting the message here. I removed the reference number as it was only specific to me.
UPDATE2: what I did was to move the fastbook folder and the course-v4 folder into the storage folder. You can do this with the GUI. This seems to result in significantly faster load times when starting up the server. YMMV.
Further UPDATE: I returned to my Paperspace machine today, and all the folders with all my work were gone. There was no fastbook folder and no course-v4. I hadn’t done much, so it wasn’t too big a deal, but I think the issue here was moving things into the storage folder. Given this experience, I’m not sure I’d follow the advice from the support email above any more, at least until they explain a bit more about what happened.
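If someone still wants to try the /storage route, copying instead of moving at least leaves the originals where they were. Here is a minimal sketch of that idea, assuming the folders can be reached from a notebook cell. The helper name copy_into_storage and the paths in the usage comment are hypothetical, and I don't know whether this would have avoided what happened to me.

```python
import shutil
from pathlib import Path


def count_files(folder):
    """Count regular files under `folder`, recursively."""
    return sum(1 for p in Path(folder).rglob("*") if p.is_file())


def copy_into_storage(src, storage_root):
    """Copy `src` into `storage_root`, leaving the original untouched."""
    src = Path(src)
    dest = Path(storage_root) / src.name
    # copytree raises FileExistsError if dest already exists,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(src, dest)
    # Cheap sanity check: both trees should hold the same number of files.
    if count_files(src) != count_files(dest):
        raise RuntimeError(f"file count mismatch after copying {src} to {dest}")
    return dest


# Usage (hypothetical paths):
# copy_into_storage("/notebooks/course-v4", "/storage")
```

The file-count comparison is only a rough check; it does not compare file contents.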
I have a problem with Paperspace. Page 44 of the book has a simple IMDB example. When text_classifier_learner(…) is run, I always get an error: index out of range. The other small examples from chapter 1 of the book worked. I have not done any updates or special steps; I just created a new notebook (using Paperspace’s premade fastai v4 Gradient environment).
Are there any steps I should take after creating a new notebook to get text_classifier_learner to work with the IMDB dataset?