Now, when I try to download my dataset, I am not sure how to do this part. I followed the fastai documentation (https://docs.fast.ai/data.external https://docs.fast.ai/tutorial.vision). However, none of those commands seem to work in my case. I always get an error saying that there is no directory with this name, or the number of files comes back as zero. Can you please tell me which command I should use to get my dataset once I have uploaded it to Paperspace?
Also, I am always confused about using the datasets that are available by default inside the fastai library. I know that we used one in the first lesson of the course, but the dataset names (listed here: https://docs.fast.ai/data.external) that I pass through the URLs class do not work all the time (error: there is no directory with this name). Can anyone tell me what my mistake is?
P.S. I have read the following topics on the forum; however, I think I need more explanation here.
I faced a problem downloading a dataset to the virtual machine. These are some of the ways to do it:
First of all, we can use the untar_data method of fastai, which both downloads and untars the data. It works well for the standard datasets used in the fastai course, which are stored in the cloud in gzip format. However, this method cannot be used to download a file from Google Drive. (At least I couldn't.)
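As a rough sketch of the untar_data step: the snippet below assumes fastai v2 is installed and uses URLs.PETS purely as an example dataset. The helper count_files is a hypothetical name, not part of fastai; it is only there to check whether the returned folder actually contains files, which is a quick way to tell a wrong path apart from an empty download.

```python
from pathlib import Path


def download_fastai_dataset():
    """Download and extract one of fastai's built-in datasets.

    fastai is imported lazily so the rest of this snippet can be used
    without it. URLs.PETS is only an example; any attribute of URLs
    listed in the fastai docs should work the same way.
    """
    from fastai.data.external import untar_data, URLs

    # untar_data downloads the archive (if not cached) and returns the
    # Path of the extracted folder. Use this returned path, not a guess.
    return untar_data(URLs.PETS)


def count_files(root):
    """Hypothetical helper: count regular files anywhere under root."""
    return sum(1 for p in Path(root).rglob("*") if p.is_file())


if __name__ == "__main__":
    path = download_fastai_dataset()
    print(path, count_files(path))
```

If count_files returns zero, the path handed to the data loaders is probably not the folder that untar_data returned.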
So, the problem boils down to downloading the datasets. We can use the wget command to download a dataset. It is very fast: I downloaded the BACH test dataset, which is 3 GB, in less than 5 minutes. https://zenodo.org/record/3632035/files/ICIAR2018_BACH_Challenge_TestDataset.zip
But this does not work directly for a Google Drive shared link.
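If you would rather stay inside a notebook cell, the same download-and-unzip step can be sketched with the Python standard library. This is an alternative to wget, not what I actually ran; download_and_extract is a hypothetical helper name, and it assumes the URL points straight at a zip file (like the Zenodo link above).

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, dest_dir, filename=None):
    """Download a zip archive from a direct URL and extract it.

    Assumes `url` is a direct link to a .zip file. The archive is kept
    in dest_dir and skipped on later calls if it already exists.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Default to the last path component of the URL as the file name.
    name = filename or url.rstrip("/").split("/")[-1]
    archive = dest_dir / name

    if not archive.exists():
        urllib.request.urlretrieve(url, archive)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
    return dest_dir
```

For example, download_and_extract(url, "data/bach") would leave both the zip and its extracted contents under data/bach.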
To download from Google Drive, we can use the following bash script, which uses the curl command:
(NB: Just select this code, copy it, and paste it into the terminal with Shift+Ctrl+V, or right-click and choose the paste option.)
You must replace the file id with the Google Drive id of the file, and the filename (in double quotes) is the name you want to give the downloaded file. Make sure the file is shared publicly (with edit permission, as far as I could tell); only then does it work. I have tested it and it works fine.
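To make the "file id" part concrete, here is a small sketch that pulls the id out of a shared link and builds a download URL from it. It assumes the two common link shapes (.../file/d/ID/view and ...?id=ID); both function names are hypothetical, and the ids in the usage note are made up.

```python
import re


def drive_file_id(shared_link):
    """Extract the file id from a Google Drive shared link.

    Assumes the link looks like .../file/d/<ID>/view... or ...?id=<ID>.
    """
    match = re.search(r"/d/([0-9A-Za-z_-]+)", shared_link) or re.search(
        r"[?&]id=([0-9A-Za-z_-]+)", shared_link
    )
    if match is None:
        raise ValueError(f"No file id found in: {shared_link}")
    return match.group(1)


def drive_download_url(file_id):
    """Build the uc?export=download URL for a given file id."""
    return f"https://drive.google.com/uc?export=download&id={file_id}"
```

For instance, drive_file_id("https://drive.google.com/file/d/ABC123/view?usp=sharing") gives "ABC123", which is the value to put in place of the file id.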
After downloading the zip file, you can unzip it with the tar command or the method in step 2.
The wget command can also be used to download the Google Drive file.
(Files larger than 100 MB count as large files.) Also change docs.google to drive.google.
For large files run the following command with necessary changes in FILEID and FILENAME:
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=FILEID" -O FILENAME && rm -rf /tmp/cookies.txt
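The sed part of that command only pulls a confirm token out of the page that Google returns for large files. A minimal Python sketch of the same extraction is below; extract_confirm_token is a hypothetical helper, it takes the first confirm= value it finds, and it assumes the response still contains confirm=TOKEN in that form, which may not always be the case.

```python
import re


def extract_confirm_token(page_text):
    """Return the value after 'confirm=' in the page text, or None.

    Uses the same character class as the sed expression in the wget
    command: letters, digits and underscore.
    """
    match = re.search(r"confirm=([0-9A-Za-z_]+)", page_text)
    return match.group(1) if match else None
```

If it returns None, the page did not contain a token and the download URL without confirm is the one to try.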