[SOLVED] Lesson3-planet

This is the notebook I’m referencing: https://nbviewer.jupyter.org/github/fastai/course-v3/blob/master/nbs/dl1/lesson3-planet.ipynb

Having issues with one line of code in this notebook.

7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path.as_posix()}

I’m getting the following error when I try to run this in my console:

-bash: syntax error near unexpected token `('

Am I meant to run that code as-is?

Are you running it in the terminal or in the notebook? {path.as_posix()} is Python code and won’t run in the terminal.

I should clarify that I’m running macOS and using Google Cloud. I am able to run Python code in the terminal, but I tried this in both the terminal and the notebook and got errors in both.

Is your {path} variable set correctly?
Maybe you can replace {path} with the actual directory.

Interesting you say that! I will back up a couple steps. I ran this code:

path = Config.data_path()/'planet'
path.mkdir(parents=True, exist_ok=True)

And I got the right output, so it’s working.

But I got errors with the following lines of code:

kaggle competitions download -c planet-understanding-the-amazon-from-space -f train-jpg.tar.7z -p {path}  
kaggle competitions download -c planet-understanding-the-amazon-from-space -f train_v2.csv -p {path}  
unzip -q -n {path}/train_v2.csv.zip -d {path}

So I got them working by doing what you suggested, replacing {path} with /home/jupyter/.fastai/data/planet

I was trying to use the same fix with:
7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path.as_posix()}

But I’m not sure how to rewrite the line of code? How do I specifically rewrite {path.as_posix()} to get it to work?

Of course, I also want to figure out why {path} isn’t working for me.

It looks like {path} is just a placeholder and I need to paste my path in wherever I see it. If that’s the case, it should at least be mentioned in the notebook for those less experienced with Python and coding in general. Also, if {path.as_posix()} is just a placeholder, that’s even more mystifying.

Personally, I solved this issue by navigating to the directory where the .tar.7z file lives and then running these two commands:

7za x myfile.tar.7z
tar -xvf myfile.tar

I am using a VM on Google Cloud. Running everything in Jupyter notebooks didn’t help. Note that you will need to update to the latest versions of the specified libraries.

So, as Jeremy suggested, go to the “restarting your work” page, check how to update everything it asks you to update, and do a git pull, just in case.

So here is what I ran in Jupyter hub:

%reload_ext autoreload
%autoreload 2
%matplotlib inline
from fastai.vision import *

! {sys.executable} -m pip install kaggle --upgrade
#(after getting and uploading json file to notebooks directory)

! mkdir -p ~/.kaggle/
! mv kaggle.json ~/.kaggle/

path = Config.data_path()/'planet'
path.mkdir(parents=True, exist_ok=True)

(path returned below)


! conda install --yes --prefix {sys.prefix} -c haasad eidl7zip

#Then I made sure all files downloaded into specified directory


#Line above returned:
#['__MACOSX', 'train_v2.csv', 'train-jpg.tar.7z', 'train_v2.csv.zip']

#Then running this line in Notebook

! 7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path.as_posix()}

That didn’t help; it gave me this error:

#/bin/sh: 1: 7za: not found
#tar: This does not look like a tar archive
#tar: Exiting with failure status due to previous errors
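The “7za: not found” message means the shell started by the notebook could not find a 7-Zip binary on its PATH, so tar received no input and reported that it was not a tar archive. A minimal sketch for checking this from a notebook cell before running the extraction; the helper name find_7zip and the list of candidate binary names are assumptions for illustration, not part of the course notebook:

```python
import shutil

def find_7zip(candidates=("7za", "7z", "7zr")):
    """Return the full path of the first candidate binary found on PATH, or None."""
    for name in candidates:
        location = shutil.which(name)
        if location is not None:
            return location
    return None

binary = find_7zip()
if binary is None:
    print("No 7-Zip binary found on PATH; install one before extracting.")
else:
    print(f"Found 7-Zip at {binary}")
```

If nothing is found, installing a 7-Zip package (as done over SSH later in this post) and re-running the check should show whether the notebook can now see it.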

So I decided to SSH into my instance and see if I could do it directly in Linux. I went to the Google Cloud Platform page >> Compute Engine >> VM Instances tab.

My instance was already running, so there I clicked “SSH” in the “Connect” column.

In the resulting SSH window I typed “cd ..”, which took me up one level to the /home directory.
Then I installed p7zip-full by running “sudo apt-get install p7zip-full”.

Typed “ls”, gave me a list of directories.
One of them was jupyter.
So navigated to jupyter by typing “cd jupyter”.

Then navigated to where my files were by typing “cd .fastai/data/planet”.
Above address came from:

path = Config.data_path()/'planet'
path.mkdir(parents=True, exist_ok=True)

Typed “ls”, and saw a list of my files.
['__MACOSX', 'train_v2.csv', 'train-jpg.tar.7z', 'train_v2.csv.zip']

There I first un-7zipped:

“sudo 7z e train-jpg.tar.7z”

Checked what it looked like by typing “ls”.
Found = train-jpg.tar. Great!

Last step. Untar it!

“sudo tar -xvf train-jpg.tar”

Now all the files are extracted into the train-jpg folder inside this directory.
To see them, cd into “train-jpg” and run “ls”.
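The untar step above can also be done from a notebook cell with Python’s standard tarfile module, which avoids needing sudo for that part. This is only a sketch: the helper name extract_tar is hypothetical, and the 7z step still requires a 7-Zip binary to produce train-jpg.tar first.

```python
import tarfile
from pathlib import Path

def extract_tar(tar_path, dest_dir):
    """Extract every member of an uncompressed .tar archive into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:" opens the archive for reading without any compression.
    with tarfile.open(tar_path, mode="r:") as archive:
        archive.extractall(dest)
    return dest

# Example call, using the directory reported earlier in this thread:
# extract_tar("/home/jupyter/.fastai/data/planet/train-jpg.tar",
#             "/home/jupyter/.fastai/data/planet")
```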

{path} and {path.as_posix()} work in Python, as @baz pointed out. These commands are meant to be run from a Jupyter notebook and won’t work in a terminal. In a Jupyter notebook, if you put ‘!’ at the beginning of a line, it is run as a terminal command.
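To see what the notebook does with the curly braces, here is a rough plain-Python approximation using an f-string. The path value is the directory reported earlier in this thread and is hard-coded as an assumption; in the notebook it comes from Config.data_path()/'planet'. PurePosixPath is used here only so the sketch behaves the same on any operating system.

```python
from pathlib import PurePosixPath

# Assumed value: the directory reported earlier in this thread.
path = PurePosixPath("/home/jupyter/.fastai/data/planet")

# Roughly what the notebook produces before handing the line to the shell:
# each {expression} is evaluated in Python and replaced by its string form.
command = f"7za -bd -y -so x {path}/train-jpg.tar.7z | tar xf - -C {path.as_posix()}"
print(command)
```

The printed line has both placeholders replaced by /home/jupyter/.fastai/data/planet, which is the form that can be pasted into a terminal.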

Kaggle has changed the file and folder names for this dataset. To download it, use this command:

! kaggle datasets download nikitarom/planets-dataset

Thanks a lot, that’s very helpful.