Lesson 1 discussion

Thanks

I saw a few mentions of trying t2.large or other non-GPU instances for the exercises. What is the general recommendation on using a non-GPU instance for the exercises? I am currently using the Kaggle Docker image for Python.

Okay I found the setup_t2.sh file. Happy Finding :slight_smile:

I have the same message. Have you sorted it out?

Hi Elle, I am also using Python 3, which has replaced cPickle with _pickle.

So replace that line with this one, and it will work:

import _pickle as pickle
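If the same file has to run under both Python 2 and Python 3, one possible variant is a guarded import. This is only a sketch and not part of the course code; it relies on the fact that the plain `pickle` module in Python 3 already uses the C implementation when it is available.

```python
# Try the Python 2 module first, then fall back to the Python 3 standard module.
try:
    import cPickle as pickle  # Python 2: fast C implementation
except ImportError:
    import pickle  # Python 3: uses the C accelerator (_pickle) automatically

# Round-trip a small object to confirm the import works.
data = {"lesson": 1, "model": "vgg16"}
restored = pickle.loads(pickle.dumps(data))
print(restored == data)
```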

Hope this helps,
Christina

2 Likes

Is anyone else having trouble achieving the validation accuracies shown in the lesson 1 notebook? The best I am able to get is ~0.92, not the >97% it says we should get out of the box. That is using the VGG model and weights.

I am running on a Windows 7 machine with a GTX960 GPU with 4 GB RAM and a Tensorflow back end.

I am limited to a batch size of 40 because of the 4 GB memory limitation on the GPU.

What am I doing wrong?

Thanks, Christina

How is it possible that you are running on TensorFlow instead of Theano?

Keras supports both tf and th. A GPU-enabled TensorFlow was recently released for Windows.

I do know that, but I tried to run the course files on the TensorFlow backend in an Ubuntu environment and that did not work out, so I was curious how she managed it.

Well, at least for the lesson 1 notebook it works with minor modifications (mainly the utils import part). I haven’t tested the other lessons on Windows with TensorFlow yet.

Trying to get my Vgg up and running. I trained it with the cats and dogs pictures in my train folder. They are properly separated. Then I run a sample on a batch of 4 (turns out they are all dogs). I then run vgg.predict and it is outputting goldfish (see below). Why is it doing this?

vgg.predict(imgs, True)
Out[24]:
(array([ 0.9999,  1.    ,  1.    ,  1.    ], dtype=float32),
 array([1, 1, 1, 1]),
 [u'goldfish', u'goldfish', u'goldfish', u'goldfish'])

The category indexes are based on the ordering of categories used in the VGG model - e.g. here are the first four:

In [18]:
vgg.classes[:4]
Out[18]:
[u'tench', u'goldfish', u'great_white_shark', u'tiger_shark']

(Note that, other than creating the Vgg16 object, none of these steps are necessary to build a model; they are just showing how to use the class to view imagenet predictions.)

Can you share your modifications, then? It sounds interesting. I did not try that, and TensorFlow is in my opinion much more effective.

@altmbr It sounds like you haven’t run finetune yet and you’re currently outputting the direct results of vgg. Are you certain you’ve called vgg.finetune(batches) and vgg.fit(batches, val_batches, nb_epoch=1) ?
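To illustrate why index 1 can come back as "goldfish": the sketch below assumes, as the output above suggests, that the predicted label is found by taking the argmax index of each output row and looking it up in a class list. The `class_indices` dictionary and the `labels_for` helper are hypothetical stand-ins for illustration, not the course's vgg16.py code.

```python
import numpy as np

# First four ImageNet categories, as shown by vgg.classes[:4] above.
imagenet_classes = ['tench', 'goldfish', 'great_white_shark', 'tiger_shark']

# Hypothetical folder-to-index mapping for a cats/dogs training directory.
class_indices = {'cats': 0, 'dogs': 1}

def labels_for(probs, classes):
    """Look up a label for each row by its highest-probability index."""
    idxs = np.argmax(probs, axis=1)
    return [classes[i] for i in idxs]

# Two-column output for four images that the model is confident are index 1.
probs = np.array([[0.0001, 0.9999],
                  [0.0, 1.0],
                  [0.0, 1.0],
                  [0.0, 1.0]], dtype=np.float32)

# Class list rebuilt from the training folders, ordered by index.
finetuned_classes = sorted(class_indices, key=class_indices.get)

print(labels_for(probs, imagenet_classes))   # ['goldfish', 'goldfish', 'goldfish', 'goldfish']
print(labels_for(probs, finetuned_classes))  # ['dogs', 'dogs', 'dogs', 'dogs']
```

The indexes are the same in both cases; only the list they are looked up in differs.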

Hi Maciej,
It is really easy to run tensorflow rather than theano under the hood. After I set up tensorflow-gpu on my Windows 7 machine, the only modification I had to make was to change the keras.json file to look like this:

{
"image_dim_ordering": "th",
"epsilon": 1e-07,
"floatx": "float32",
"backend": "tensorflow"
}

This is needed because Theano and TensorFlow order image dimensions differently: Theano puts the channel axis first, while TensorFlow puts it last by default. The other way to do it would be to go through the code and reorder the dimensions wherever image data is passed, but this was easier because I could just run the course notebooks and code out of the box.
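For anyone who does want to convert data by hand instead, here is a small NumPy sketch of the two layouts. The array is a made-up 224x224 RGB image used only for illustration; nothing here comes from the course code.

```python
import numpy as np

# A made-up RGB image in Theano ("th") ordering: (channels, rows, cols).
img_th = np.arange(3 * 224 * 224, dtype=np.float32).reshape(3, 224, 224)

# The same image in TensorFlow ("tf") ordering: (rows, cols, channels).
img_tf = np.transpose(img_th, (1, 2, 0))

print(img_th.shape)  # (3, 224, 224)
print(img_tf.shape)  # (224, 224, 3)

# Moving the channel axis back to the front recovers the original array.
round_trip = np.transpose(img_tf, (2, 0, 1))
print(np.array_equal(round_trip, img_th))  # True
```

Setting "image_dim_ordering" to "th" in keras.json avoids having to do this kind of transposition throughout the notebooks.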

Tensorflow-gpu only supports Python 3.5 on windows, so I am also using that instead of 2.7 – I had to make a few changes to the provided course material for that, but nothing major.

I also got Jupyter notebooks and even Spyder to run on Windows 7.

If anyone is interested, let me know and I’ll post the setup steps here.

1 Like

I did the vgg.finetune and vgg.fit functions but still don’t get the 97% the lesson 1 notebook says we should be getting.

I am going to set up an AWS P2 instance today with Theano and run the same notebook there. I want to compare the results of AWS/Theano with what I am getting on Windows 7 and Tensorflow.

I want to move onto lesson 2, but I want to get this nailed down first. I completed everything else in lesson 1.

Thanks Tom. Will try that next.

I feel like I’m missing something about how to work through this course. I watched the video on youtube and worked through the lesson1 notebook. I was then trying to figure out how to run the classification on images in the test1 folder. But really couldn’t figure it out at all.

After a few forum searches on “how to predict” I came across some vgg.test() examples, but I can’t find where that is explained. Is there any documentation of the methods for vgg? I would have thought vgg.predict() was the thing I needed, but evidently not. Is this an example of the “learn it on your own” approach where you just randomly try code?

After a bit more poking around I found that there is some code in the dogs_cats_redux notebook that seems to walk through that. Is this the “learn it on your own” approach where I just need to randomly open and read stuff until I figure it out, or is there a page or a video that I’m missing which kinda lays this out?

Between the youtube video, wiki, forum, and different updated versions of the resource I’m having a really hard time following all of this. Thanks for any clarity that you can provide.

2 Likes

Rothrock,
I started watching lesson 2, and some of the stuff you’re asking is covered in there. He mentions in that video that not everything is explained up front for a reason. He wants you to dive right in head first, then figure out what questions to ask. Yes, I felt the same way as you do, but I understand the reasoning now. I am learning a LOT so far!

Christina

1 Like

Okay – I got my AWS instance up and running with Theano today, and ran the lesson 1 notebook side by side on both machines: Win7/Tensorflow/GTX960 and AWS/Ubuntu/Theano/K80!

The results were amazing. Same notebook, same batch size of 40, all else equal except for the GPU and the back end!

Win7/Tensorflow/GTX960: 1256s - loss: 0.4515 - acc: 0.8717 - val_loss: 0.2228 - val_acc: 0.9215

AWS/Ubuntu/Theano/K80: 645s - loss: 0.1261 - acc: 0.9685 - val_loss: 0.0721 - val_acc: 0.9825

By the way, according to the NVidia website, the GTX960 has a compute capability of 5.2, while the K80 has a compute capability of 3.7. Does that make sense???

Why would Tensorflow perform that much worse than Theano? Now I will have to install Theano on my Win7 machine, and Tensorflow on the AWS instance! And see what’s really going on… :confused:

Christina

I have the same issue, but upgrading Theano (to v0.9.0b1) doesn’t seem to have helped me.
Using TensorFlow instead of Theano has (different) errors too.

I suspect this is because I’m running everything inside a docker container on OSX though. If I ever fix it, I’ll post again.