How to load data from a local folder into Colab

Hi everyone

I am a complete beginner in this course.

For one of my projects I want to solve a binary classification problem. I have saved the data on my PC (Windows), organized into two folders: 1) Class 0 pictures and 2) Class 1 pictures.

I want to load this dataset into Colab, but when I run the code below, it does not work:

path = Path('D/cell_images')
path.ls()

This throws the error:

No such file or directory: 'D/cell_images'

How can I solve this issue?

I also tried the following code:

path = 'D/cell_images'
path = untar_data(path)

But this gives me the error:

Invalid URL 'D/cell_images.tgz': No schema supplied. Perhaps you meant http://D/cell_images.tgz?

I am really confused about how to get the dataset into Colab…

Another question: since my dataset is already separated into class 0 and class 1, should I combine the two classes and shuffle them before splitting into train and validation sets?

Hi,

You first need to mount your Google Drive (this asks you to authenticate with your account):

from google.colab import drive
drive.mount('/content/gdrive')

Then you need to define the path inside your Google Drive:

location = "/content/gdrive/My Drive"
path = datapath4file(location)

You’ll then be able to access the data through the path variable, provided the folder has been uploaded to your Google Drive first.
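Once the drive is mounted, the two class folders can be read like any other directory. The sketch below is only an illustration, not a fastai API: the helper names `list_class_folders` and `shuffled_split`, the folder location `/content/gdrive/My Drive/cell_images`, and the 20% validation share are all assumptions. It counts the files per class and, regarding the shuffling question, shows one common approach: gather the files from both folders with their labels, shuffle once, then split.

```python
import random
from pathlib import Path


def list_class_folders(root):
    """Return {folder_name: number_of_files} for each subfolder of root."""
    root = Path(root)
    return {
        sub.name: sum(1 for f in sub.iterdir() if f.is_file())
        for sub in sorted(root.iterdir())
        if sub.is_dir()
    }


def shuffled_split(root, valid_pct=0.2, seed=42):
    """Collect (file, label) pairs from all class folders, shuffle, then split.

    The label of each file is simply the name of the folder it sits in.
    Returns (train_items, valid_items).
    """
    root = Path(root)
    items = [
        (f, sub.name)
        for sub in sorted(root.iterdir()) if sub.is_dir()
        for f in sorted(sub.iterdir()) if f.is_file()
    ]
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n_valid = int(len(items) * valid_pct)
    return items[n_valid:], items[:n_valid]


# Assumed location; adjust to wherever the folder sits in your Drive.
data_root = Path("/content/gdrive/My Drive/cell_images")
if data_root.exists():
    print(list_class_folders(data_root))
    train, valid = shuffled_split(data_root)
    print(len(train), "training files,", len(valid), "validation files")
```

If the counts printed for the two folders look right, the mount worked and the data is visible to the notebook. Libraries that build a random validation split from a labelled folder do the shuffling for you, so a manual split like this is mainly useful for checking or for custom pipelines.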


Thanks Andrei, it worked… but since uploading the data was too slow and the original data was on Kaggle, I chose to work on Kaggle instead.


You can download a Kaggle dataset directly to Colab via the Kaggle API (https://www.kaggle.com/docs/api):

!pip install -U -q kaggle
!mkdir -p ~/.kaggle

Upload the kaggle.json file (your API token) from your computer

from google.colab import files
files.upload()

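Copy kaggle.json to the correct folder and restrict its permissions (otherwise the Kaggle client warns that the key is readable by other users)

!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json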
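The same step can be done in Python, which makes it easy to check the token before using it. This is a minimal sketch under assumptions: `install_kaggle_token` is a hypothetical helper (not part of the Kaggle package), and it relies on kaggle.json being a JSON object with `username` and `key` fields.

```python
import json
import os
import shutil
from pathlib import Path


def install_kaggle_token(src="kaggle.json", dest_dir="~/.kaggle"):
    """Validate an uploaded kaggle.json and copy it where the CLI looks for it.

    Hypothetical helper: checks for the 'username' and 'key' fields,
    copies the file, and makes it readable by the owner only.
    """
    src = Path(src)
    token = json.loads(src.read_text())
    missing = {"username", "key"} - set(token)
    if missing:
        raise ValueError(f"kaggle.json is missing fields: {sorted(missing)}")

    dest_dir = Path(dest_dir).expanduser()
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / "kaggle.json"
    shutil.copyfile(src, dest)
    os.chmod(dest, 0o600)  # owner read/write only
    return dest


# In the notebook, after files.upload() has saved kaggle.json:
# install_kaggle_token()
```

A ValueError here usually means the wrong file was uploaded, which is easier to diagnose than an authentication failure from the download command later on.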

Interact with the Kaggle-API (here I download the Santander-Dataset)

!kaggle competitions download -c santander-customer-transaction-prediction
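The download usually arrives as a zip archive that still has to be unpacked before the files can be read. A minimal sketch, assuming the archive is saved in the working directory as `santander-customer-transaction-prediction.zip` (the exact file name can differ between versions of the Kaggle client, so check with `!ls` first); `extract_archive` and the `data` target folder are illustrative names:

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Unpack zip_path into dest_dir and return the names found there."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return sorted(p.name for p in dest_dir.iterdir())


# Assumed archive name; verify it against what the download actually produced.
archive_path = Path("santander-customer-transaction-prediction.zip")
if archive_path.exists():
    print(extract_archive(archive_path, "data"))
```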

Thanks a lot! This really helped!