01_intro - Google Colab: path where downloaded files are saved

Hi,

I am on the 1st Chapter of the book (Running Jupyter Notebook on Google Colab)

I cannot locate the folder where the downloaded image files are stored in Google Colab. The command executes successfully, but I want to know where Colab stores the downloaded files. I searched in the Files tab and in my Google Drive but couldn't find them.

Thanks a lot!

CODE:

#id first_training
#caption Results from the first training
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'
def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)

See the output of path. The actual location is pretty buried: untar_data puts the files in fastai's hidden directory on the system.
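If you're not sure how to dig into that, here's a minimal sketch (the folder names shown are just what fastai typically uses; print the path rather than assuming):

print(path)  # e.g. something like /root/.fastai/data/oxford-iiit-pet/images
# Hidden directories (names starting with a dot) don't show up in the Files sidebar,
# but a shell command run in a cell will list them:
!ls -a /root
!ls /root/.fastai/data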


Its output is:
Downloading: “https://download.pytorch.org/models/resnet34-333f7ec4.pth” to /root/.cache/torch/hub/checkpoints/resnet34-333f7ec4.pth

There is no .cache directory visible, though.

How long does Google store the images? If they are indeed stored in the cache, they should be deleted when I close my notebook.

If that is the case, how do I save the images dataset to my Google Drive?

Thanks


I mounted Google Drive by clicking on the Mount Drive button. It's one of the buttons near the top when you browse the files, and it gives you a code snippet that you can run in your notebook. When I untar a file, I pick a destination on my Google Drive. Here's an example:
from google.colab import drive
drive.mount('/content/drive')
path = untar_data(URLs.MNIST_SAMPLE, dest='drive/My Drive')
I felt it ran a lot slower going back and forth between Google Drive and the server for some operations.
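One caveat if you go this route: writes to the mounted Drive are synced in the background, so before ending a session it can be worth flushing them explicitly. A small sketch using the same google.colab drive helper:

from google.colab import drive
# Make sure any pending writes are actually synced to Google Drive, then unmount.
drive.flush_and_unmount()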


Where is the dir?

One trick that has helped me a lot when using Colab is to run this snippet in a cell before doing anything else:

!curl -s https://course19.fast.ai/setup/colab | bash

This creates data and models folders in your /content directory (your default current dir in Colab), which are linked to the fastai directories where untar_data and cnn_learner place your data and models after untarring/unzipping, so you can view the data and models directories in the folders sidebar on the left of your Colab page…

If you really want to know where these are located, they're in /root/.fastai/data and /root/.torch/models, along with /root/.fastai/archive, which contains the tarred/zipped files downloaded by the untar_data function…

As a side effect it also upgrades the fastai version to the latest…
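If you want to double-check what the script set up, a quick sketch that just lists the directories mentioned above:

# /content/data and /content/models should show up as links into the hidden /root dirs
!ls -la /content
!ls /root/.fastai/data
!ls /root/.fastai/archive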


Hi @sarah_h,

Using Google Drive to store your dataset is usually a lot slower than using the Colab VM's data storage.

I would suggest using your Google Drive to store your zipped dataset instead, moving the zipped file from your Drive to the /content/data directory, and unzipping it there (assuming of course that you don't want to download it again)…
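For example, something along these lines (a sketch; 'my_dataset.zip' and the Drive folder are placeholders for your own file and layout):

from google.colab import drive
drive.mount('/content/drive')

# Copy the archive from Drive to the VM's local disk, then unzip it there;
# local disk I/O is much faster than reading many small files over the Drive mount.
!mkdir -p /content/data
!cp '/content/drive/My Drive/my_dataset.zip' /content/data/
!unzip -q /content/data/my_dataset.zip -d /content/data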

OTOH, when working through the fastai course notebooks, it's usually faster (and less of a hassle) to just untar the dataset each time. :grinning:

I usually save the models to Google Drive at the end of the session though (especially the ones that take a long time to train, e.g. the NLP models) and continue my work by reloading them…
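In case it helps, here's roughly what that looks like (a sketch; the file names are just examples, and learn is the Learner trained earlier in the notebook):

# End of session: export the trained Learner and copy it to Drive
learn.export('/content/export.pkl')
!cp /content/export.pkl '/content/drive/My Drive/export.pkl'

# Next session: mount Drive again and reload it
from fastai.vision.all import load_learner
learn = load_learner('/content/drive/My Drive/export.pkl')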

Best regards,
Butch


Thank you, friend.

This was great! Very helpful. Thanks a lot :pray: