Platform: Colab ✅

Yes, you are right, thanks!

I retried again today and it worked for me without quotes.

There’s a problem in lesson 2 on Colab.

Google Drive changes extensions automatically, so instead of bears.txt you get bears.txt.gdoc, and open() returns a filesystem error when called on that file. I don’t know how to make it work directly, but I made a workaround by specifying the image list in a variable instead of reading it from a file, and using download_image from fastai.vision.data.
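A minimal sketch of that workaround, assuming fastai v1’s download_image helper; the URLs and destination folder below are placeholders:

from pathlib import Path
from fastai.vision.data import download_image

# Keep the URLs in a plain Python list instead of a text file on Drive,
# so Drive never gets the chance to rename anything to .gdoc.
urls = [
    'https://example.com/black_bear_1.jpg',  # placeholder URLs
    'https://example.com/black_bear_2.jpg',
]

dest = Path('data/bears/black')  # placeholder destination
dest.mkdir(parents=True, exist_ok=True)

# Download each image into dest under a numbered filename.
for i, url in enumerate(urls):
    download_image(url, dest/f'{i:03d}.jpg')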

Another cool thing for lesson 2 is a tip on how to run a server straight from Colab:

!pip install flask==0.12.2
!pip install flask_ngrok

and then:

from flask import Flask
from flask import request
from flask_ngrok import run_with_ngrok

app = Flask(__name__)
run_with_ngrok(app)  # Start ngrok when app is run

# for the / root, return Hello World
@app.route("/")
def root():
    url = request.method
    return f"Hello World! {url}"
  
app.run()
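When you run the cell, flask_ngrok prints a public *.ngrok.io URL in the output; opening that URL in a browser hits the Flask app running inside the Colab VM.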
2 Likes

Just in case others find this useful - I’m running vanilla Jupyter on Colab, and this fixes issues with their version of the UI (ImageDeleter, the doc function, etc.). Here is the Gist. Open it in Colab, run the cells, and there will be a URL pointing to the Jupyter instance running inside Colab.
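For a rough idea of the general pattern (this is a sketch, not the linked Gist; the port and the pyngrok tunnel are my assumptions):

!pip install -q jupyter pyngrok

import subprocess
from pyngrok import ngrok

# Start a classic Jupyter notebook server in the background inside the Colab VM.
subprocess.Popen(['jupyter', 'notebook', '--ip=0.0.0.0', '--port=8888',
                  '--no-browser', '--allow-root'])

# Tunnel the port out; the printed public URL points at the Jupyter instance.
print(ngrok.connect(8888))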

3 Likes

Hi,

I’m trying to use !cp to copy my saved learner over to gdrive. I have mounted the drive.

However the command

!cp content/data/oxford-iiit-pet/images/models/stage-1.pth /content/gdrive/My\ Drive/fastai-v3/data/image/

results in error:

cp: cannot stat ‘content/data/oxford-iiit-pet/images/models/stage-1.pth’: No such file or directory

I am copying the file path directly from the Colab file explorer.

Any help or advice would be greatly appreciated.

Looks like you forgot to reference the path from root in the source, i.e. you forgot the first /

!cp /content/data/oxford-iiit-pet/images/models/stage-1.pth /content/gdrive/My\ Drive/fastai-v3/data/image/

should work if the paths are correct

Thank you for your assistance, and apologies for the rookie mistake.

TIP: In lesson 3, to get kaggle.json in place you can do:

# instead of this: ! mv kaggle.json ~/.kaggle/
# echo the contents of kaggle.json ({"username":"myusername","key":"mykey"}) straight into place:
! mkdir -p ~/.kaggle/
! echo '{"username":"myusername","key":"mykey"}' > ~/.kaggle/kaggle.json
! chmod 600 ~/.kaggle/kaggle.json
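You can then sanity-check the credentials with any Kaggle CLI command (assuming kaggle is already installed via ! pip install kaggle), e.g.:

! kaggle competitions list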
3 Likes

Hi everyone,

I am following lesson 3, and I am trying to download the Planet Amazon dataset from Kaggle.

First, I wrote this at the beginning of my notebook:

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3/'

which ends up downloading all the files into my Google Drive.

I installed the Kaggle API

! pip install kaggle --upgrade

and then I downloaded kaggle.json and uploaded it to the folder fastai-v3 (base_dir = root_dir + 'fastai-v3/').

I ran this cell:
! mkdir -p ~/.kaggle/
! mv kaggle.json ~/.kaggle/

It gave me this error:

mv: cannot stat ‘kaggle.json’: No such file or directory

My two questions are:

  1. I do not want to download this dataset (34 GB) to my Google Drive; how can I set it to download locally on Colab, or on my computer?
  2. How do I actually make it download from Kaggle?

Thank you very much for your help

3 Likes

Hello 👋,

By following the steps below, you will download the dataset locally in Colab and then be able to interact with Kaggle’s API.

Step 1: To use the Kaggle API, you first need to create a Kaggle account and create a ‘New API Token’.

  • The ‘Create New API Token’ button will trigger the download of a file called ‘kaggle.json’, containing your credentials.
  • Store this json file in a folder called kaggle in your Google Drive so that Colab can find your credentials. The path looks like this:
    My Drive --> Colab Notebook --> kaggle --> kaggle.json

Step 2: To enable interactions between Colab and Kaggle’s API, copy-paste this into a cell:

from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from google.colab import auth
import io, os

# Authenticate this Colab session against your Google account.
auth.authenticate_user()

# Find the kaggle.json file stored in your Drive.
drive_service = build('drive', 'v3')
results = drive_service.files().list(
        q="name = 'kaggle.json'", fields="files(id)").execute()
kaggle_api_key = results.get('files', [])

filename = "/root/.kaggle/kaggle.json"
os.makedirs(os.path.dirname(filename), exist_ok=True)

# Download the credentials file to where the Kaggle API expects it.
request = drive_service.files().get_media(fileId=kaggle_api_key[0]['id'])
fh = io.FileIO(filename, 'wb')
downloader = MediaIoBaseDownload(fh, request)
done = False
while not done:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
os.chmod(filename, 0o600)  # note: 0o600 (octal), not 600

Step 3: You can download the data using Kaggle’s API commands, for instance:
!kaggle datasets download -d alexattia/the-simpsons-characters-dataset -f the-simpsons-characters-dataset.zip
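Then unzip the archive inside Colab, e.g. (assuming the zip landed in the current working directory under that name):

! unzip -q the-simpsons-characters-dataset.zip -d data/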

If you need additional details, you can look at my blog post where I go through the process.

5 Likes

Thank you,

I did it, and it worked.

But if I would like to download to my own computer, how should I do that? Or pick a file from my computer in Google Colab without uploading it to Drive?

I got the same error, so I updated the numpy version with the command below, and it worked:
!pip install "numpy>=1.15.0"

Note the quotes: without them, the shell treats >= as output redirection, and pip never sees the version constraint.

From what I could understand, when we use the ‘from folder’ option, fastai automatically picks up all .jpg files under the Path, including sub-folders.
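For reference, a minimal sketch of that ‘from folder’ pattern in fastai v1 (the dataset path, split, and size are placeholders):

from pathlib import Path
from fastai.vision import ImageDataBunch, get_transforms

path = Path('data/bears')  # placeholder dataset root

# Labels come from each image's parent folder name; image files are
# collected recursively under path.
data = ImageDataBunch.from_folder(path, valid_pct=0.2,
                                  ds_tfms=get_transforms(), size=224)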

Hi,

Very novice Colab user here; I am trying to work on my own dataset with this method.

I have the drive mounted, and I am simply trying to generate a new dataset from folders in my drive, something similar to the untar_data([link]) stage onwards. I have tried multiple things, and it feels like I am making this a lot more complicated than it needs to be. Any help is appreciated!

Edit 2: Yes, yes, I was making life extremely complicated unnecessarily. I can access the images directly in the tools; there is no need to load them into a file structure (too much C++ in my head), mounting is sufficient. It simply works, for example:
path = '/content/gdrive/My Drive/Colab Notebooks/[my dataset]'
path_img = path + '/Images'
fnames = get_image_files(path_img)
etc.

I will keep it here in case someone sees it.

Some further comments on what I have tried:
The link for untar_data is not a true URL; the actual link has to have a .tgz extension, and that extension must be omitted in the input string (https://github.com/fastai/fastai/issues/1130#issuecomment-438156385). But the link from Google Drive does not come in that format, so it gives me the error “not a gzip file”.
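(For comparison, the convention with fastai’s bundled datasets looks like this; URLs.PETS is one of the constants shipped with fastai v1:)

from fastai.datasets import untar_data, URLs

# URLs.PETS ends without .tgz; untar_data appends it when downloading,
# extracts under ~/.fastai/data, and returns the extracted path.
path = untar_data(URLs.PETS)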

When I try the path as 'http://content/gdrive/My Drive/Colab Notebooks/[my dataset]', I get a connection time-out error:
HTTPConnectionPool(host='content', port=80): Max retries exceeded with url: [url above] Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb31a0a4198>: Failed to establish a new connection: [Errno -2] Name or service not known',

I tried generating a file through the mounted drive using the file id (on Google Drive), with no success…

I finally tried manually uploading the data files through the UI, and I can see them in the session’s contents, but I am still unable to read them into a data frame. Honestly, I did not pursue this much, as it is bad practice (and clearly impractical) to upload the dataset manually each time I connect to my Colab session.

Edit 1: (updated the error message after fixing the initial typo I had)

1 Like

I can’t get the setup script working:
!curl -s https://course.fast.ai/setup/colab | bash
It just outputs: 'bash' is not recognized as an internal or external command, operable program or batch file.

There were a few questions with the same problem but none of the solutions worked for me. What am I missing?

Edit: I realized the error does not occur when running a hosted session, but it is present in local sessions. I’m running the Colab local session on a Windows 10 PC. Any ideas?
Edit: Oh wait, so it means I’ve not installed fastai properly on my PC. I can’t seem to get it working even after installing it.

Hi,

If you look inside the script, you will find that it is just a simple installation script for fastai.

#!/bin/bash
if [ ! -e /content/models ]; then
        mkdir -p /root/.torch/models
        mkdir -p /root/.fastai/data
        ln -s /root/.torch/models /content
        ln -s /root/.fastai/data /content
        rm -rf /content/sample_data/
fi

echo Updating fastai...
pip install fastai --upgrade > /dev/null
echo Done.

On Windows, there are no folders like /root/ and there is no bash. In my view, you have to install fastai on your local machine directly.
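For the install itself, the fastai v1 instructions boil down to something like this from an Anaconda prompt:

conda install -c pytorch -c fastai fastai

(or pip install fastai in a plain Python environment).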

1 Like

In lesson 0, I can’t open the image; it says it doesn’t exist. I included (base_dir + image path).
Also, I checked under content -> gdrive -> My Drive, and I can’t find the fastai-v3 directory anywhere inside. Everything above it worked fine, and I did add the code snippets and everything else.

Hi,
could you please show steps 2 and 3 with pictures?
I don’t understand how to move the folder to the data directory, and the worse problem is I don’t know where the data directory is at all.
Many thanks!

Hello,

Since the time of my solution, I have come up with something a bit simpler as you will see in the images below. Please note that this method assumes the file (containing the dataset) is zipped and stored on your local computer.

If you are new to Colab, navigate the different tabs (“Table of contents”, “Code snippets”, “Files”) in the GUI on the left. There, you will find the data folder I mentioned in my previous post, assuming you have run this line of code at the beginning of the notebook:
!curl https://course.fast.ai/setup/colab | bash
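(As the setup script quoted above shows, that data folder is /content/data, a symlink to /root/.fastai/data, which is where fastai downloads datasets by default.)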


2 Likes

Hi All,

I am trying lesson 6 in Colab.
I am running the command below to get the Rossmann data into path, but I am getting an error:
path = untar_data(rossmann.tgz);

Please help me.

Thanks and Regards,
Subho