I am a complete beginner in this course.
As part of one of the projects, I want to solve a binary classification problem. I have saved the data on my PC (Windows), structured into two folders: 1) Class 0 pictures and 2) Class 1 pictures.
I want to load this dataset into Colab, but the code below is not working:
path = Path('D/cell_images')
This throws the error:
No such file or directory: 'D/cell_images'
How can I solve this issue?
I also tried the following code:
path = 'D/cell_images'
path = untar_data(path)
But this gives the error:
Invalid URL 'D/cell_images.tgz': No schema supplied. Perhaps you meant http://D/cell_images.tgz?
I am really confused about how to load the dataset into Colab…
Another question: since my dataset is already separated into class 0 and class 1, should I combine the two first and then shuffle before splitting into train and validation sets?
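Colab runs on a remote machine, so it cannot see the D drive of your PC. Upload the folder to Google Drive, then mount the drive in Colab (you will be asked to authenticate with your account):
from google.colab import drive
drive.mount('/content/gdrive')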
Then define the path inside your Google Drive:
location = "/content/gdrive/My Drive"
path = datapath4file(location)
You’ll then be able to read the data through the path variable.
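On the second question (combine and shuffle before splitting): yes, the usual approach is to pool the files from both class folders, shuffle them together, and only then hold out a validation share, so that both sets contain both classes. Below is a minimal sketch of that idea in plain Python. The helper names (mount_drive, collect_labelled_images, shuffled_split), the list of image extensions, and the 20% validation share are all assumptions made for illustration, not part of fastai or Colab.

```python
import random
from pathlib import Path

# Assumed set of image extensions; adjust to match your files.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}

def mount_drive(mount_point="/content/gdrive"):
    """Mount Google Drive when running inside Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True

def collect_labelled_images(root):
    """Return (path, label) pairs, using each sub-folder name as the label."""
    root = Path(root)
    items = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for f in sorted(class_dir.iterdir()):
            if f.suffix.lower() in IMAGE_EXTS:
                items.append((f, class_dir.name))
    return items

def shuffled_split(items, valid_pct=0.2, seed=42):
    """Shuffle all items together, then hold out valid_pct of them for validation."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_valid = int(len(items) * valid_pct)
    return items[n_valid:], items[:n_valid]  # (train, valid)
```

In Colab you would call mount_drive() first, then collect_labelled_images on the folder you uploaded (for example a cell_images folder under "/content/gdrive/My Drive", if that is where you put it), and pass the result to shuffled_split. If you are using fastai v1, ImageDataBunch.from_folder with a valid_pct argument should do a comparable random split for you, so this sketch is mainly useful for seeing what happens underneath.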
Thanks Andrei, it worked. However, uploading the data was too slow, and since the original data is on Kaggle, I chose to work on Kaggle instead.
You can download the Kaggle dataset directly to Colab via the Kaggle API (https://www.kaggle.com/docs/api):
!pip install -U -q kaggle
!mkdir -p ~/.kaggle
Upload kaggle.json (your API token) from your computer
from google.colab import files
files.upload()
Copy kaggle.json to the correct folder
!cp kaggle.json ~/.kaggle/
Interact with the Kaggle-API (here I download the Santander-Dataset)
!kaggle competitions download -c santander-customer-transaction-prediction
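The token steps above can also be done from Python. The Kaggle command-line tool looks for the token at ~/.kaggle/kaggle.json and prints a warning if that file is readable by other users, so the sketch below also restricts the permissions (the equivalent of chmod 600). The function name install_kaggle_token and its home parameter are made up for this example; they are not part of the Kaggle API.

```python
import shutil
import stat
from pathlib import Path

def install_kaggle_token(token_file="kaggle.json", home=None):
    """Copy the API token to <home>/.kaggle/kaggle.json and make it owner-only."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like: mkdir -p ~/.kaggle
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_file, target)            # like: cp kaggle.json ~/.kaggle/
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)      # like: chmod 600 on the token
    return target
```

After uploading kaggle.json, calling install_kaggle_token() with no arguments should leave the token where the kaggle command expects it. Note that the command above downloads competition data; for a regular dataset the command has the form kaggle datasets download -d <owner>/<dataset-name>, where the owner and dataset name come from the dataset's Kaggle URL.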
Thanks a lot! This really helped!!