Lesson 3 - Opening torrent files for Planets datasets in Colab

Hi everyone,
I have followed the steps laid out in this tutorial to set up my data on Colab. However, I am stuck at opening the data, which seems to be in torrent form. It seems that the 7zip files shown in the tutorial don’t exist anymore and the data is available in torrent form only. These are the files I have downloaded so far:

Kaggle-planet-test-tif.torrent	     test_v2_file_mapping.csv
Kaggle-planet-test-tif.torrent.zip   train_v2.csv
Kaggle-planet-train-tif.torrent.zip  train_v2.csv.zip
sample_submission_v2.csv.zip

I am not sure how to open/unpack the files that are in

Kaggle-planet-test-tif.torrent

and

Kaggle-planet-train-tif.torrent

which contain the training and test files. Any ideas on how to go about it?

1 Like

Hi Sanwal, hope you’re having lots of fun.


You may need a torrent client such as BitTorrent to download the data that the .torrent files point to.

cheers mrfabulous1 :smiley::smiley:

Downloading these files in Colab isn’t allowed. What you can still do is save the data to Google Drive, but it is 19.4 GB in size, so the rest is up to you.

I’m facing the same problem.
However, I’m using GCP, instead of Colab.
I need a way to download the torrent data directly to the cloud, without first downloading it to my local disk and then uploading it, because my daily internet data limit is 5 GB. There must be some command to do that. Can someone please help me out?

You can try aria2. I tried it once in Colab and it worked.
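A rough sketch of what that could look like from a Colab or GCP Python session, so the data goes straight to the cloud machine. It assumes `aria2c` is already installed there. The helper names and the `data` output directory are made up for illustration, and whether the torrent still finds peers is a separate question.

```python
import shutil
import subprocess
from pathlib import Path


def build_aria2_command(torrent_file, out_dir):
    """Build the aria2c command line for downloading a torrent's payload."""
    return [
        "aria2c",
        "--seed-time=0",        # exit once the download finishes instead of seeding
        f"--dir={out_dir}",     # where the payload is written
        str(torrent_file),
    ]


def download_torrent(torrent_file, out_dir):
    """Run aria2c on the remote machine; raises if aria2c is missing or fails."""
    if shutil.which("aria2c") is None:
        raise RuntimeError("aria2c was not found on PATH; install aria2 first")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_aria2_command(torrent_file, out_dir), check=True)
```

Usage would be something like `download_torrent("Kaggle-planet-train-tif.torrent", "data")`. In a notebook cell the same command can also be run directly with a leading `!`.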

1 Like

https://colab.research.google.com/drive/1lwODJuLEJAtlaJc_4lSAa9NzgFl-jjCB

This is what I did to get the data from Kaggle.

3 Likes

Was anyone able to formalize a way of getting the data out of these torrent files, or find another way to extract the data?

Is there a working version for Windows?

Has anyone figured out a solution to fetch data from .torrent files?

I have found that these .torrent files are no longer valid.
I tried qbittorrent and transmission-cli, two tools for downloading torrents on Linux, with no success. In the end I downloaded train-jpeg.tar separately and uploaded it to my GCP instance.
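Once a tar archive like that is on the instance, unpacking it can be done from Python as well. This is only a sketch: the function name and destination folder are placeholders, and the `filter="data"` argument is used only where the installed Python supports it.

```python
import tarfile
from pathlib import Path


def extract_tar(archive, dest):
    """Extract a tar archive into dest and return the number of files found there."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return sum(1 for p in dest.rglob("*") if p.is_file())
```

For example, `extract_tar("train-jpeg.tar", "data")` would unpack the images under `data/` and report how many files ended up there.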

1 Like

Yes. I tried this for 2 days, but eventually gave up… Thanks

Somebody rehosted them on Kaggle.

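# {path} is substituted by IPython/Colab from the Python variable `path` (the target directory)
!kaggle datasets download nikitarom/planets-dataset -p "{path}"
# -q: quiet, -n: never overwrite files that already exist
!unzip -q -n "{path}/planets-dataset.zip" -d "{path}"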
3 Likes
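If `unzip` is not available (on Windows, for example), the extraction step above can be approximated in plain Python. This is a sketch rather than a drop-in replacement: the function name is made up, and it mimics `unzip -n` by skipping any file that already exists in the destination.

```python
import zipfile
from pathlib import Path


def unzip_skip_existing(archive, dest):
    """Extract archive into dest, never overwriting existing files (like `unzip -n`)."""
    dest = Path(dest)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for member in zf.infolist():
            target = dest / member.filename
            if member.is_dir() or target.exists():
                continue  # skip directory entries and files that are already there
            zf.extract(member, dest)
            extracted.append(member.filename)
    return extracted
```

With the download from the post above this would be called as `unzip_skip_existing(f"{path}/planets-dataset.zip", path)`. Re-running it after an interrupted extraction only writes the files that are still missing.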