Lesson 3 - Can't Download Planet Data Images Tar Archive

I am unable to copy the link from the download button.
Can anyone please share it?

Thank you!

@Sunit, in your browser, open the inspect panel on the Kaggle page where the dataset resides (in Google Chrome, right-click and choose Inspect), then click on the Network tab. Then hit the download button of the dataset you want to download. You will see URLs under the Network tab; click on none that has the name starting with train-jpg.tar and on the right side you should see a tab for header. Click on it and copy the URL.
Remember to wrap the URL in double quotes when pasting it after wget --load-cookies cookies.txt.
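
To show why the quotes matter: the download URL contains & characters, and an unquoted & makes the shell cut the command short. Below is a minimal Python sketch that builds the wget command with every argument quoted. The URL is a made-up placeholder, build_wget_command is a hypothetical helper name, and the -O output filename is my own addition, not part of the original command.

```python
import shlex

# Hypothetical URL for illustration only; a real one comes from the Network tab.
url = "https://example.com/train-jpg.tar.7z?GoogleAccessId=abc&Expires=123&Signature=xyz"


def build_wget_command(url, cookies="cookies.txt", out="train-jpg.tar.7z"):
    """Return a wget command line with every argument safely quoted."""
    args = ["wget", "--load-cookies", cookies, "-O", out, url]
    # shlex.quote wraps any argument containing shell-special characters
    # (such as & or ?) in single quotes and leaves safe ones untouched.
    return " ".join(shlex.quote(a) for a in args)


command = build_wget_command(url)
print(command)
```

The printed line can be pasted into a terminal, or run from a notebook cell with a leading !.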

Hope that works for you.


Thanks!

[quote=“elie, post:44, topic:60309”]
click on none that has the name starting with train-jpg.tar and on the right side you should see a tab for header.
[/quote]
Do you mean “click on one that has the name”, or did you really mean “click on none”? Could someone also explain “right side you should see a tab for header”?

Thanks for a working solution.

A slight problem with this on Paperspace (free account): after downloading all the files, you cannot unzip them, because you run out of disk space while extracting the contents of the archive.
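
One way to catch this before extraction starts is to check the free space first. This is just a sketch using the Python standard library. The has_room_for name, the 2x headroom factor, and the 600 MB size are assumptions for illustration, not measured values for this dataset.

```python
import shutil


def has_room_for(archive_bytes, target_dir=".", headroom=2.0):
    """Return True if target_dir has at least headroom * archive_bytes free.

    headroom=2.0 is an assumed rule of thumb, not a measured compression ratio.
    """
    free = shutil.disk_usage(target_dir).free
    return free >= archive_bytes * headroom


# Example: check the current directory before extracting a hypothetical 600 MB archive.
print(has_room_for(600 * 1024**2))
```

If this prints False, deleting the downloaded archives you no longer need, or extracting to a larger volume, is probably the simplest way out.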

Very similar to a post above, I followed these steps to download the Kaggle planet data.

Checking the files available for download

! kaggle competitions files planet-understanding-the-amazon-from-space

Using Curl to download relevant files

  • Go to the competition page
  • Press Ctrl+Shift+i, go to the Network tab
  • Click on the folder train-jpg.tar and start downloading. Cancel the download once the download begins.
  • Under the Network tab you will notice an entry starting with train-jpg.tar.7z?. Right-click on it and choose Copy as cURL (bash) or (cmd), depending on your OS.
  • cd into the directory where you want to store the file and paste the cURL command.
  • At the end of the command, add “-o {desired filename with the extension}”, for example train-jpg.tar.7z or train_v2.csv.zip for this project.
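
For anyone working from a notebook rather than a terminal, here is a rough Python equivalent of running curl with -o. It is only a sketch: download is a made-up helper name, it uses only the standard library, and it sends none of the browser headers that the copied cURL command carries, so it assumes the copied URL alone is enough to authorize the download.

```python
import shutil
import urllib.request


def download(url, filename):
    """Stream url to filename without loading the whole file into memory."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        # copyfileobj reads and writes in chunks, so large archives are fine.
        shutil.copyfileobj(response, out)
    return filename


# Usage (paste the URL copied from the Network tab in place of the placeholder):
# download("<URL copied from the Network tab>", "train_v2.csv.zip")
```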

Curl command for reference

curl 'https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/6322/868312/upload/train_v2.csv.zip?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com&Expires=1595223509&Signature=Jo%2FyMZoXypD3IC5xbsr%2B8YgVdvYU%2FA1qhe2mKTi%2BFh%2FS3c4PbvxEf9lJBIJBeWiWm896gt654z4iKJ3jVtHt9Cgrt81vHo9RH5vLl0Bv4EB2E8dXq1LkpsT6vOVN8tnU55453MIeZqtqhd%2Fm1RKXHdbiJZt9jRtICJLTDnhAoBn8kpADGAV9rgNmLTA2CH6Nu5TI0429cxcEQ15nEp7NIySyqxpSd6%2B7FYoNKdJvY0SFjG0y8h0RNH%2B4BWtdTc1Tzz%2BjTSM0MpP%2FCGhKNN1VCTN9z8bZatyNIoa1xwKPnmb16zu0RJZNp%2FVZLdapBn8DnQKWd691G0xmch1PZ45MlA%3D%3D&response-content-disposition=attachment%3B+filename%3Dtrain_v2.csv.zip' \
  -H 'authority: storage.googleapis.com' \
  -H 'upgrade-insecure-requests: 1' \
  -H 'dnt: 1' \
  -H 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36 Edg/83.0.478.64' \
  -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'sec-fetch-site: cross-site' \
  -H 'sec-fetch-mode: navigate' \
  -H 'sec-fetch-user: ?1' \
  -H 'sec-fetch-dest: document' \
  -H 'referer: https://www.kaggle.com/' \
  -H 'accept-language: en-US,en;q=0.9' \
  --compressed \
  -o train_v2.csv.zip
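
Note that the URL in this command carries an Expires parameter, so a copied command will probably stop working after a while and you will need to copy a fresh one from the Network tab. Here is a small sketch for checking when a copied URL expires, assuming Expires is a Unix timestamp in seconds (which is what it looks like). signed_url_expiry is a hypothetical helper name, and the URL is a shortened form of the one in the command, with the Signature parameter left out.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def signed_url_expiry(url):
    """Return the Expires query parameter as a UTC datetime, or None if absent."""
    params = parse_qs(urlsplit(url).query)
    if "Expires" not in params:
        return None
    # Assumption: Expires is a Unix timestamp in seconds.
    return datetime.fromtimestamp(int(params["Expires"][0]), tz=timezone.utc)


# Shortened form of the URL from the curl command; Signature is omitted here.
url = (
    "https://storage.googleapis.com/kaggle-competitions-data/kaggle-v2/6322/868312/"
    "upload/train_v2.csv.zip"
    "?GoogleAccessId=web-data@kaggle-161607.iam.gserviceaccount.com"
    "&Expires=1595223509"
)

expiry = signed_url_expiry(url)
print("expires:", expiry.isoformat())
print("expired:", expiry < datetime.now(timezone.utc))
```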

Thanks for your solution, it works for me as well! I spent many hours before finding it, which was very frustrating. I do not know why the standard notebook does not work.

Thanks!

@cdaigneault
Glad I could help. Actually, the original Kaggle dataset was removed from the site, hence this issue!