Platform: Colab ✅

Hi Vikrant,

Here's also the docs for untar_data:
https://docs.fast.ai/datasets.html#untar_data

accessible via doc(untar_data) in a traditional Jupyter install of fastai (not Colab)

Have you tried the rest of the code at the bottom of the Colab config page?
https://course.fast.ai/start_colab.html#step-4-saving-your-data-files

path = Path(base_dir + 'data/pets')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)

the Path call creates a pathlib Path object, which gives you access to path and folder methods
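For example, a minimal sketch of what those three lines do (using /tmp as a stand-in for the gdrive base_dir):

```python
from pathlib import Path

base_dir = '/tmp/fastai-demo/'         # stand-in for the mounted gdrive path
path = Path(base_dir + 'data/pets')    # build a Path object from a plain string
dest = path / 'images'                 # the / operator joins path segments
dest.mkdir(parents=True, exist_ok=True)  # create the folder and any missing parents
print(dest.exists())                   # prints True
```

exist_ok=True means rerunning the cell won't raise an error if the folder is already there.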

does this help?

Hi,
How do I add directory structures in Colab and upload the text files for the images, as suggested in lesson 2?

Hi, @kadlugan

I have a similar problem, I followed the tutorial:
https://course.fast.ai/start_colab.html#step-4-saving-your-data-files

I add this at the beginning:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3/'

and I imported:

from fastai.vision import *
from fastai.metrics import error_rate
bs = 64

then following the tutorial and your post I added:

path = untar_data(URLs.PETS); path
path = Path(base_dir + 'data/pets')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)

and received this error:
---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

<ipython-input-8-41891b907d8d> in <module>()
      1 path = Path(base_dir + 'data/pets')
----> 2 dest = path/folder
      3 dest.mkdir(parents=True, exist_ok=True)
      4 path = untar_data(URLs.PETS); path

NameError: name 'folder' is not defined

I made a folder in Drive named fastai-v3. I don't fully understand where untar_data is saving the files.
How do I make it save to Google Drive, or to a folder on my PC?

Thank you very much for your help

dest = path/folder should be dest = path/'folder' if folder is meant to be a literal folder name (a string); otherwise the variable folder needs to be defined first.
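A quick sketch of the difference (paths here are stand-ins, not from the course setup):

```python
from pathlib import Path

path = Path('/tmp/pets')

# Bare `folder` is a variable name; if it was never assigned, Python raises NameError
try:
    dest = path / folder
except NameError as e:
    print(e)          # prints: name 'folder' is not defined

# Quoted, it is a string literal and joins fine
dest = path / 'folder'
print(dest)           # /tmp/pets/folder
```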

@salvatore.r @vikbehal hope @gamo's fix worked for you. I found that the overloaded learn.save/load code that I posted a few posts above helped me split up my model saves, while retaining the free fastai Colab functionality.

If you are using standard datasets, the notebooks and saved weights are usually enough to learn and progress, and you can keep a gdrive copy.
Saved weights can be in the 250 MB range for image recognition.

If you are creating your own datasets, then having the whole folder (images, weights, notebooks) on your gdrive is a good option. Though I am not sure whether you need to move your image sets to the Colab instance for performance. Anyone care to comment?

As you get more sophisticated, you will see that practitioners begin using paid cloud services like AWS EC2/S3 to store models, or build their own DL computer and keep the datasets on it.

I am not sure of your tech experience (high | low), so I hope this advice hits the right level for you :slight_smile:

Thank you @kadlugan!

I did try, but the above will just create the directory structure. How do I tell fastai that the mentioned path is where data and models will be saved?

I followed this post to change the configuration. It works if I run the notebook as-is. As soon as I change the runtime to GPU, it goes back to the default fastai path.

Any guidance will be greatly appreciated.

Gabriel, thank you. I did that. Now how do I tell fastai to download data to that path? Also, what changes should I make so that model data is saved in Google Drive, i.e. at that path?

Did you rerun the notebook from start after you switched to GPU?

Yup! Did it work for you?

Having fastai work directly with gdrive is probably not a good idea. You have to treat data on gdrive as you would data on a NAS or other remote storage: if you try to run data directly off gdrive, it has to be moved over the network, and that is slow.

Keep all data and models local on Colab while you are working, then use !cp or a Python library to copy data and/or models to gdrive. It is best to wrap this in a function (def): that way you can include it in your learner and have it save the model to gdrive during learning, giving you running backups of your model when training has to run for a long time.
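Something along these lines, perhaps (a sketch, not tested against every fastai version; the function name and the Drive path are mine, and it assumes fastai v1's learn.save(name) writes the weights to <learn.path>/models/<name>.pth):

```python
import shutil
from pathlib import Path

def backup_model(learn, name, gdrive_dir):
    """Save the model locally on the Colab instance, then copy the
    .pth file to a Drive-mounted folder, e.g.
    /content/gdrive/My Drive/fastai-v3/models/ (hypothetical path)."""
    learn.save(name)                               # save locally first (fast)
    src = Path(learn.path) / 'models' / f'{name}.pth'
    dest = Path(gdrive_dir)
    dest.mkdir(parents=True, exist_ok=True)        # create the Drive folder if needed
    shutil.copy2(src, dest / src.name)             # copy the weights file over
    return dest / src.name
```

You could call this in place of learn.save at checkpoints during a long training run.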


@gamo @kadlugan

Thank you for your help; now I understand the general concept behind it.

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3/'

from fastai.vision import *
from fastai.metrics import error_rate
bs = 64

path = untar_data(URLs.PETS); path
path = Path(base_dir + 'data/pets')
dest = path/"folder"
dest.mkdir(parents=True, exist_ok=True)

However, what actually happens when I do this is that it creates a folder in my Google Drive, but for some reason I am missing, it does not download anything into it. So it creates an empty folder inside the fastai folder on my Drive.
I think what is happening is that it creates the new path without downloading the data into it.

It is not immediately obvious to me how the code you wrote is supposed to do that. untar_data can be given arguments for the paths to both the download and the untarred data. See https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L151

If you want to untar your data directly to gdrive you should do something like:

gd_dir = Path('/content/gdrive/My Drive/fastai-v3/pets/')
path = untar_data(URLs.PETS, dest=gd_dir)

As I said in a previous post, if you are just saving the data to gdrive then that is OK; but if you will then use the data on Colab, Colab will have to fetch that data back from gdrive over the network before it can be used, and that is slow.

Instead download all data and models locally to your colab instance and use them there, then when you want to save/backup your work copy the whole project folder to gd.
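A sketch of what "copy the whole project folder" could look like (the function name and folder layout are placeholders, not fastai API):

```python
import shutil
from pathlib import Path

def backup_project(project_dir, gdrive_dir):
    """Copy a whole local project folder (data, models, notebooks)
    to a Drive-mounted path, replacing any previous backup."""
    src = Path(project_dir)
    dest = Path(gdrive_dir) / src.name
    if dest.exists():
        shutil.rmtree(dest)          # remove the stale copy first
    shutil.copytree(src, dest)       # recursive copy of the full tree
    return dest
```

Run it once at the end of a session, or whenever you reach a state worth keeping.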


Hi all,

As of 30 minutes ago, I have not been able to use the fastai library. Despite following the setup guide, I get the error below when I run from fastai.vision import *:

VersionConflict: (fastprogress 0.1.18 (/usr/local/lib/python3.6/dist-packages), Requirement.parse('fastprogress>=0.1.19'))

Is anyone else experiencing this issue?

I'm experiencing the same problem. Did you find a solution to it?

I’ve been looking for some solution but have not found one yet :cry:

I have also tried manually uninstalling fastprogress 0.1.18 and installing 0.1.19, but that didn't work either.

Yes, I am getting the same error regarding the version of fastprogress.

Same thing here. I tried updating fastprogress to 0.1.19, but it didn't help.

fastprogress 0.1.19 is now included in the dist. Run the install script again and it should be OK; if not, restart your instance and then run it.

edit: not my doing, someone in dev fixed it


Hello. I am having an issue that could be related to Google Colaboratory. Any help would be appreciated. Thanks.
https://forums.fast.ai/t/fastai-imagedownloader-widget-class-chromedriver-issue-lesson-1-2-downloader

!cp train_v2.csv content/gdrive/My Drive/fastai-v3/data/planet

Whenever I try to copy a file to My Drive, the following error occurs:

cp: target ‘Drive/fastai-v3/data/planet’ is not a directory

How do I resolve this issue?
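The likely cause is that the space in "My Drive" splits the unquoted path into two arguments, so cp sees `Drive/fastai-v3/data/planet` as the target (and the path is also missing the leading /content/). Quoting the full path should fix it; on Colab that would look like `!cp train_v2.csv "/content/gdrive/My Drive/fastai-v3/data/planet/"`. A minimal demonstration of the quoting, using /tmp stand-in directories since the Colab mount isn't available here:

```shell
# Stand-in directories; on Colab the real path would start with /content/gdrive
mkdir -p "/tmp/gdrive/My Drive/fastai-v3/data/planet"
echo "image_name,tags" > /tmp/train_v2.csv

# Quoting keeps the space-containing path as a single argument to cp
cp /tmp/train_v2.csv "/tmp/gdrive/My Drive/fastai-v3/data/planet/"
ls "/tmp/gdrive/My Drive/fastai-v3/data/planet/"
```

Also make sure the target directory actually exists first (mkdir -p), since cp won't create it for you.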