Setup problems: Running the Lesson 1 Notebook

Hi everybody, I’m having an issue logging into the notebook.

Everything was working fine up until the end of lesson 1. I was able to get everything set up and running, and was at the point where we were unzipping the cats and dogs files in the terminal. However, somewhere around this point, the tmux window in the terminal froze and I had to force close the window. I lost my session and wasn’t sure what to do, so I decided to stop the instance from the AWS web console (Instances => Instance State => Stop) instead of using the terminal. Now, when I try to log into the notebook (source aws-alias.sh => aws-start), I get the following error:

~ $ aws-start
An error occurred (InvalidParameterCombination) when calling the StartInstances operation: No instances specified

~ $ aws-ssh
ssh: Could not resolve hostname None: nodename nor servname provided, or not known

Any suggestions on how to get the notebook back up and running?

Hi @mitch15, perhaps the $instanceId is missing so aws-start couldn’t work. Try running
~ $ aws-get-p2

OR

~ $ aws-get-t2 (if you’re running t2)

and then try again with

~ $ aws-start

Thank you so much @davidtan36, it’s up and running again. Any idea why the $instanceId would go missing like that?

If you look at the aliases in aws-alias.sh, e.g. alias aws-get-p2='export instanceId=`aws...., you can see that $instanceId is only defined when aws-get-p2 (or aws-get-t2) is run, and only for the current terminal session.

To make it persist, after you run aws-get-p2, run echo $instanceId and take the output. E.g., if the output is ‘i-9aa9c282’, open ~/.bashrc, add the line ‘export instanceId=i-9aa9c282’, and save it.

Future terminal sessions will have instanceId defined.
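If you would rather script that edit than do it by hand, here is a minimal Python sketch. The helper name persist_instance_id is made up for illustration, and the demo writes to a throwaway file rather than your real ~/.bashrc; it only adds the export line if it is not already there.

```python
import tempfile
from pathlib import Path


def persist_instance_id(instance_id, rc_path):
    """Append 'export instanceId=...' to rc_path unless that line already exists.

    Hypothetical helper for illustration; returns True if the file was changed.
    """
    rc_path = Path(rc_path)
    line = "export instanceId={}".format(instance_id)
    text = rc_path.read_text() if rc_path.exists() else ""
    if line in text.splitlines():
        return False  # already present, nothing to do
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with rc_path.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True


# Demo against a temporary file instead of the real ~/.bashrc.
with tempfile.TemporaryDirectory() as tmp:
    demo_rc = Path(tmp) / "bashrc"
    added_first = persist_instance_id("i-9aa9c282", demo_rc)
    added_again = persist_instance_id("i-9aa9c282", demo_rc)
    demo_contents = demo_rc.read_text()
```

To use it for real you would pass your own instance ID and Path.home() / ".bashrc", then open a new terminal (or source ~/.bashrc) so the variable is picked up.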

@davidtan36 I was having a similar problem importing utils. I’m not sure about your particular error, but it was mentioned in another thread that if you comment out “import cv2” and “from keras.callbacks import ReduceLROnPlateau” from the utils.py file it’ll work no problem. They were left in the code from an earlier version.

@mitch15 I tried but it doesn’t seem to work. It seems to be an issue specific to bcolz. Thanks anyway!

Thanks

SUB: NEED HELP TO SETUP THEANO/KERAS TO USE ALL CORES OF CPU.

Hi,

I intend to use my local Windows machine for working on small sample datasets and use the AWS GPU server for final runs in order to keep my AWS costs low.

Finally, I have successfully run the Lesson 1 notebook on both Windows and AWS after fixing the initial few errors. On a dataset of 1000+ images, my Windows laptop executes the VGG.fit() statement in 3 hours, vs. the AWS GPU doing it on all 22000 images in about 10 minutes!

My problem is that although the Windows machine has a multicore CPU, python/theano/keras uses only one core. I did set the environment variable OMP_NUM_THREADS=7, hoping that it would make all 7 cores get utilized.

Is there any way to decrease the time taken on a CPU by making it run in parallel on all cores? I understand it will still be 10X the time taken on a GPU, but it should be useful for trying code on a sample dataset.

Any suggestions would be appreciated.

UPDATE

I finally managed to get it to run in parallel using all available cores. Had to add the following line to the python notebook.

theano.config.openmp = True
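An alternative to setting the option inside the notebook is to put it in the environment before Theano is imported. The sketch below only builds the environment variables; the helper name theano_openmp_env is made up for illustration, and it assumes that OMP_NUM_THREADS and the openmp=True entry in THEANO_FLAGS are all you need. How much speedup you actually get depends on which operations Theano can parallelise and on your BLAS setup.

```python
import os


def theano_openmp_env(num_threads, base_env=None):
    """Return a copy of the environment with OpenMP turned on for Theano.

    Hypothetical helper: it sets OMP_NUM_THREADS and makes sure THEANO_FLAGS
    contains openmp=True, keeping any other flags that were already set.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    # Keep existing flags, dropping any previous openmp=... entry.
    flags = [f for f in env.get("THEANO_FLAGS", "").split(",")
             if f and not f.startswith("openmp=")]
    flags.append("openmp=True")
    env["THEANO_FLAGS"] = ",".join(flags)
    return env


# Demo with an explicit starting environment so the result is predictable.
demo_env = theano_openmp_env(7, base_env={"THEANO_FLAGS": "floatX=float32"})
```

In a notebook you would call os.environ.update(theano_openmp_env(7)) in the first cell, before any import of theano or keras, since the flags are read when Theano is first imported.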

Hi,

I’m struggling with a home setup. When I try to run vgg.fit() (from the unmodified notebook) I get an error that boils down to:

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2696)()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/work/h5py/_objects.c:2654)()

h5py/h5f.pyx in h5py.h5f.open (/home/ilan/minonda/conda-bld/work/h5py/h5f.c:1942)()

IOError: Unable to open file (Truncated file: eof = 105693184, sblock->base_addr = 0, stored_eoa = 553482496)

It seems it’s trying to load files from a hardcoded path on the original developer’s machine… I suppose there’s a workaround, but does anybody know what it is?

Hi, I ran the lesson1 notebook successfully; however, when training the model, I don’t see the validation results. Any ideas why this happens?

#train model
vgg.fit(batches, val_batches, nb_epoch=1,verbose=1)

Epoch 1/1
20936/21000 [============================>.] - ETA: 1s - loss: 0.0960 - acc: 0.9759

Hi iodbh,

Did you download the Vgg16.py version in the .zip file in the Lecture 1 notes? It seems to be an outdated version - try using the one in the GitHub repository instead. It fixed the same problem for me.

Btw, Vgg16 is loading from a hardcoded path on www.platform.ai/models/, and the error message is just a confusing one from the h5py library (I think).

Hope this helps!

When I first set up everything on AWS, Jupyter Notebook was working okay. I hadn’t run any of the Lesson 1 notebook but I knew Jupyter was working fine because I ran basic code like 1+1 and importing theano in cells and everything was okay. Now I can’t connect to Jupyter Notebook. I’m getting this message every time I open a notebook

Thanks a lot Mikkel, your post made me realize that the error message was misleading. Turns out the download of the models was interrupted the first time I ran the notebook, which led to the file being corrupted.

If somebody else runs into the same problem, the solution is to clear the keras cache (rm ~/.keras/models/*) then re-run the code. It will download the file again.
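For anyone who prefers to do the same cleanup from Python (for example on Windows, where rm is not available), here is a rough sketch. The helper name clear_model_cache and the file name vgg16.h5 in the demo are made up for illustration; the demo runs against a temporary folder, not the real Keras cache.

```python
import tempfile
from pathlib import Path


def clear_model_cache(cache_dir):
    """Delete the regular, non-hidden files directly inside cache_dir.

    Hypothetical helper that roughly mirrors `rm cache_dir/*`:
    subdirectories and dotfiles are left alone. Returns the removed names.
    """
    cache_dir = Path(cache_dir).expanduser()
    removed = []
    if not cache_dir.is_dir():
        return removed  # nothing cached yet
    for path in sorted(cache_dir.iterdir()):
        if path.is_file() and not path.name.startswith("."):
            path.unlink()
            removed.append(path.name)
    return removed


# Demo on a temporary directory holding a fake, truncated weights file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "vgg16.h5").write_bytes(b"truncated")
    removed_demo = clear_model_cache(tmp)
    left_demo = list(Path(tmp).iterdir())
```

Pointing it at "~/.keras/models" should have the same effect as the rm command above, after which re-running the notebook downloads the weights again.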

I wanted to share a few small changes I had to make to get Lesson 1 to work with the directory structure shown at the top of dogs_cats_redux.ipynb

utils/
    vgg16.py
    utils.py
lesson1/
    redux.ipynb

Initially when running the import code I received the following error:

#Allow relative imports to directories above lesson1/
sys.path.insert(1, os.path.join(sys.path[0], '..'))

#import modules
from utils import *
from vgg16 import Vgg16

ImportError                               Traceback (most recent call last)
<ipython-input-2-2ec9e6c6812a> in <module>()
  1 sys.path.insert(1, os.path.join(sys.path[0], '..'))
  2 
----> 3 from utils import *
  4 from vgg16 import Vgg16
  5 

ImportError: No module named utils

To fix this and some other import errors I did the following.

  • Added an empty __init__.py file to the utils directory

  • Added vgg16bn.py to the utils directory

  • Tweaked the import code above adding utils. prefixes as follows.

    from utils.utils import *
    from utils.vgg16 import Vgg16

As you can probably tell I don’t write much python in my day job. I suspect these steps are obvious to experienced python programmers but hope they will help less experienced python programmers like myself.
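For the curious, here is a small self-contained sketch of why the empty __init__.py matters. It builds a throwaway package in a temporary folder and imports from it. The package name demo_utils and the stub Vgg16 class are stand-ins I made up so the example does not clash with the real utils directory. On Python 2, which the traceback format above suggests, a directory without __init__.py is not importable as a package at all.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    pkg = Path(root) / "demo_utils"       # stands in for the utils/ directory
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")  # empty file that marks the folder as a package
    (pkg / "vgg16.py").write_text("class Vgg16(object):\n    pass\n")  # stub module

    sys.path.insert(1, root)              # same idea as adding '..' to sys.path
    try:
        importlib.invalidate_caches()
        # Equivalent to: from demo_utils.vgg16 import Vgg16
        module = importlib.import_module("demo_utils.vgg16")
        imported_class_name = module.Vgg16.__name__
    finally:
        sys.path.remove(root)             # leave sys.path as we found it
```

The package-qualified import is what makes the utils. prefix necessary: once utils is a package, utils.py inside it becomes the module utils.utils.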

Thanks for sharing your solution, @telarson. I had the same problem and this was helpful. But, did you mean __init__.py instead of input.py?

@agulati, thanks for noticing! I just corrected this.

Hey all! I have managed to debug the image ordering error myself, but now I am faced with the following error. Does anyone have suggestions on how it could be fixed?

ValueError: Dimension 0 in both shapes must be equal, but are 3 and 64 for ‘Assign_9’ (op: ‘Assign’) with input shapes: [3,3,224,64], [64,3,3,3].

For the ReduceLROnPlateau error, I first had to upgrade pip:

pip install --upgrade pip

then upgrade keras:

pip install keras --upgrade

and then restart my Jupyter notebook.

Rachel,

I intend to use the NVIDIA GPU on my laptop. I have successfully installed theano, keras and other dependencies. I am able to run the Lesson 1 notebook (up to where it was covered in Lesson 1) without any hitch. However, the GPU is not being used. As covered in the video, I could find a theanorc file in “C:…\toolkits\keras-1.1.0\docker\theanorc” with the following content:

[global]
floatX = float32
optimizer=None
device = gpu

I also tried setting the THEANO_FLAGS=THEANO_FLAGS_GPU.
The GPU is not being used. Please help me to resolve this.
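One thing that may be worth checking: as far as I know, Theano reads its settings from a .theanorc file in your home directory (or from the file named in the THEANORC environment variable), so a sample file inside the Keras docker folder is probably not being picked up. The sketch below only parses theanorc-style text with the standard library to show which device it asks for; the helper name configured_device is made up, and it does not import Theano or touch the GPU.

```python
import configparser

# The file content quoted in the post above.
SAMPLE_THEANORC = """\
[global]
floatX = float32
optimizer=None
device = gpu
"""


def configured_device(theanorc_text):
    """Return the device named in the [global] section of theanorc-style text.

    Hypothetical helper; falls back to "cpu", which is Theano's default
    when no device is configured.
    """
    parser = configparser.ConfigParser()
    parser.read_string(theanorc_text)
    return parser.get("global", "device", fallback="cpu")


device = configured_device(SAMPLE_THEANORC)
```

If the same content sits in the .theanorc in your home directory and the GPU is still not used, printing theano.config.device in the notebook should show which setting Theano actually loaded.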