Can you install 7-Zip on the Crestle instance using
sudo apt-get install p7zip-rar
After that, please try
7za x <filename.tar.7z>
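Note that a .tar.7z file has two layers, so 7za x only gives you the .tar, which still has to be unpacked. Here is a minimal sketch of both steps. The helper names inner_tar_name and extract_tar_7z are made up for illustration, and the sketch assumes the tar inside is named after the archive minus the .7z suffix, which may not hold for every archive.

```shell
# Sketch only: unpack a .tar.7z archive in two steps (7z layer, then tar layer).

# Guess the name of the inner tar file: drop any directory part and the
# trailing ".7z". Assumption: the tar inside is named after the archive.
inner_tar_name() {
  basename "$1" .7z
}

# Extract ARCHIVE (e.g. train-jpg.tar.7z) into the current directory.
extract_tar_7z() {
  archive="$1"
  tarball="$(inner_tar_name "$archive")"
  7za x "$archive" || return 1     # step 1: writes the .tar into the current directory
  tar -xf "$tarball" || return 1   # step 2: unpacks the files from the tar
  rm -f "$tarball"                 # optional: remove the intermediate tar to save disk space
}
```

Usage would be something like extract_tar_7z train-jpg.tar.7z, run from the folder where you want the images to end up.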
@anurag I need your help here. It looks like on Crestle you cannot install kaggle-cli or extract tar.7z files. There is an error saying that the lxml<4.1,>=4.0.0 distribution is needed. Can you help @memetzgz with this, as she is using Crestle?
Thanks Vijay for your help – yes, this is the error I’m getting
I’ll look into the test data.
Crestle does have the test-jpg, test-jpg-additional, and test-tif-2 folders, with ~40k/20k/61k images respectively. Are you running into issues with using them?
Hi @anurag, the problem may be that I did not create the right symlinks. I will check when I next log on. I appreciate your looking into this on your end. Crestle is working very well otherwise!
It is also possible to download a specific file using kaggle-cli:
$ kg download -u <username> -p <password> -c <competition> -f train.zip
Thanks for the tip - I didn’t know that
for f in test-jpg-additional.tar.7z test-jpg.tar.7z test_v2_file_mapping.csv.zip train-jpg.tar.7z train_v2.csv.zip
do
kg download -f "$f"
done
I followed the steps from the first post, and kg download gives me this error:
'NoneType' object has no attribute 'find_all'
I installed the CLI using the --upgrade option.
@pnvijay maybe you can add this to the guide as well. I just started with fast.ai and kept running into disk space errors on Paperspace with this dataset until I checked the kaggle-cli GitHub page for info. This really saved me.
Hi @Priit, Will include.
@jeremy I am not able to edit my top post and include the fact that individual files can also be downloaded via kaggle cli. Can you please help?
Just wanted to mention that Kaggle has finally released the official CLI tool.
Although detailed instructions are available on GitHub, here is a brief usage cheat sheet. First, place your kaggle.json credentials file at
~/.kaggle/kaggle.json
on the target machine.
# Make sure you run this inside a conda environment
pip install kaggle
# Secure the credentials
chmod 600 ~/.kaggle/kaggle.json
# List all files for a competition
kaggle competitions files -c COMPETITION_NAME
# Download a single file to the current directory
kaggle competitions download -c COMPETITION_NAME -f DATASET_FILE -w
Thank you, @pnvijay. Point 3 mentions kg download, which downloads all the files, including the tif ones, which are huge. To avoid that, I used kg download -f <filename> to download specific files.
Hi Abhirammv, thanks for the feedback. I want to incorporate the changes into my original post, but I am currently unable to edit it. It looks like there is an edit count limit: I had already edited the post around 3 times, so it is not allowing me to edit again.
Speaking of the new API, one can use the following commands to download the required data (it seems that the official Kaggle API doesn’t support passing several file names in a single command for now):
COMPETITION=planet-understanding-the-amazon-from-space
DATA=/home/user/data/kaggle/planet # your path to data
kaggle competitions download -c $COMPETITION -f train-jpg.tar.7z -p $DATA
kaggle competitions download -c $COMPETITION -f test-jpg.tar.7z -p $DATA
kaggle competitions download -c $COMPETITION -f test-jpg-additional.tar.7z -p $DATA
kaggle competitions download -c $COMPETITION -f train_v2.csv.zip -p $DATA
kaggle competitions download -c $COMPETITION -f test_v2_file_mapping.csv.zip -p $DATA
kaggle competitions download -c $COMPETITION -f sample_submission_v2.csv.zip -p $DATA
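If you prefer not to repeat the command six times, the same downloads can be written as a loop. This is just a sketch using the flags from the commands above. The function name download_planet_files and the FILES variable are my own, and the default DATA path is only the example path from above, so adjust it to your setup.

```shell
# Sketch only: download the same six files in a loop with the official CLI.
COMPETITION=planet-understanding-the-amazon-from-space
DATA="${DATA:-/home/user/data/kaggle/planet}"   # your path to data (example default)

# File names taken from the individual commands above.
FILES="train-jpg.tar.7z test-jpg.tar.7z test-jpg-additional.tar.7z train_v2.csv.zip test_v2_file_mapping.csv.zip sample_submission_v2.csv.zip"

# Hypothetical helper: fetch each file in turn, stopping at the first failure.
download_planet_files() {
  mkdir -p "$DATA" || return 1
  for f in $FILES; do
    kaggle competitions download -c "$COMPETITION" -f "$f" -p "$DATA" || return 1
  done
}
```

Calling download_planet_files then runs one download per file. It stops early if any download fails, so a missing file or a bad token does not go unnoticed.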
Thank you very much for the info!
How do I get the kaggle.json file into ~/.kaggle/kaggle.json?
With “ls -la” I don’t even see the “.kaggle” folder.
Best regards
Michael
mkdir -p ~/.kaggle
mv path-to-the-downloaded-file ~/.kaggle/kaggle.json
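The two commands above, plus the chmod 600 step from the cheat sheet, can be wrapped into one small helper. This is only a sketch, and install_kaggle_token is a made-up name, not part of the Kaggle CLI.

```shell
# Sketch only: move a downloaded kaggle.json into place and lock down its permissions.
# install_kaggle_token is a hypothetical helper name.
install_kaggle_token() {
  src="$1"
  dest_dir="$HOME/.kaggle"
  # Refuse to continue if the downloaded file is not where we expect it.
  [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1                  # creates ~/.kaggle if it is missing
  mv "$src" "$dest_dir/kaggle.json" || return 1
  chmod 600 "$dest_dir/kaggle.json"                 # readable and writable by you only
}
```

For example, install_kaggle_token ~/Downloads/kaggle.json, assuming that is where your browser saved the file.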