Lesson 1 discussion

Here is something annoying I was struggling with for the last hour and have now solved; I'm posting it here in case someone has the same issue:

In notebook1 we are doing this transformation in the first layer:

vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3,1,1))
x = (x - vgg_mean)
return x[:, ::-1]    # reverse axis rgb->bgr

Notice that vgg_mean is created with the default dtype (float64). In my .theanorc the default dtype is float32:

[global]
device=gpu0
floatX=float32

So unless there's a cast somewhere, you will run into this error when you add the Dense layer (for some reason the convolutional layers don't complain about the dtype):

ValueError: Input 0 is incompatible with layer dense_8: expected dtype=float32, found dtype=float64

In our case the casting is done by the ZeroPadding layer. When I tried to code the model without it (using border_mode='same' instead), it didn't work.

Solution:
vgg_mean needs to be created with an explicit dtype:

vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((3,1,1))
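
A slightly more general variant is to ask Keras for its configured float type instead of hard-coding it; a sketch, using the standard keras.backend module:

import numpy as np
from keras import backend as K

# Build the mean with whatever float type the backend is configured for
# (floatX in .theanorc, or "floatx" in ~/.keras/keras.json), so the
# subtraction in vgg_preprocess never silently upcasts to float64.
vgg_mean = np.array([123.68, 116.779, 103.939],
                    dtype=K.floatx()).reshape((3, 1, 1))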

I've finally set the whole thing up to run on my local machine, and it seems to be working OK.

The question I have might be silly, and I believe it’s not important at all, but I can’t figure it out. In the following lines of code:

batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)

Why is the batch_size multiplied by 2 for the validation samples? I tried replicating the code and not multiplying by 2 and it works fine, but I can’t think of a reason to do that right now.

Also, about the second part of the lesson, on building the model from scratch with Keras: are we supposed to understand it all right now? I understood the first part using the vgg wrapper quite well, but was totally lost on the second part with Keras.

Anyway, the course looks great, let’s do it!

I am getting the following error while trying to run the 7-line custom model.
I have installed all the required libraries and configured Theano and Keras properly based on the given instructions. I am running this on a local Ubuntu 14.04 machine (NVIDIA 759Ti, 16GB, i7).
Please help me with this, as I am unable to proceed because of this roadblock.

NotImplementedError                       Traceback (most recent call last)
in <module>()
----> 1 vgg = Vgg16()
      2 # Grab a few images at a time for training and validation.
      3 # NB: They must be in subdirectories named based on their category
      4 print(path+'train')
      5 batches = vgg.get_batches(path+'train', batch_size=batch_size)

/home/prsahu/Downloads/courses-master/deeplearning1/nbs/vgg16.pyc in __init__(self)
     30     def __init__(self):
     31         self.FILE_PATH = 'http://www.platform.ai/models/'
---> 32         self.create()
     33         self.get_classes()
     34

/home/prsahu/Downloads/courses-master/deeplearning1/nbs/vgg16.pyc in create(self)
     65     def create(self):
     66         model = self.model = Sequential()
---> 67         model.add(Lambda(vgg_preprocess, input_shape=(3,224,224), output_shape=(3,224,224)))
     68
     69         self.ConvBlock(2, 64)

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/models.pyc in add(self, layer)
    278             else:
    279                 input_dtype = None
--> 280             layer.create_input_layer(batch_input_shape, input_dtype)
    281
    282         if len(layer.inbound_nodes) != 1:

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/engine/topology.pyc in create_input_layer(self, batch_input_shape, input_dtype, name)
    368         # and create the node connecting the current layer
    369         # to the input layer we just created.
--> 370         self(x)
    371
    372     def assert_input_compatibility(self, input):

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/engine/topology.pyc in __call__(self, x, mask)
    512             if inbound_layers:
    513                 # this will call layer.build() if necessary
--> 514                 self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
    515                 input_added = True
    516

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/engine/topology.pyc in add_inbound_node(self, inbound_layers, node_indices, tensor_indices)
    570         # creating the node automatically updates self.inbound_nodes
    571         # as well as outbound_nodes on inbound layers.
--> 572         Node.create_node(self, inbound_layers, node_indices, tensor_indices)
    573
    574     def get_output_shape_for(self, input_shape):

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/engine/topology.pyc in create_node(cls, outbound_layer, inbound_layers, node_indices, tensor_indices)
    147
    148         if len(input_tensors) == 1:
--> 149             output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
    150             output_masks = to_list(outbound_layer.compute_mask(input_tensors[0], input_masks[0]))
    151             # TODO: try to auto-infer shape if exception is raised by get_output_shape_for

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/keras/layers/core.pyc in call(self, x, mask)
    554         if 'mask' in arg_spec.args:
    555             arguments['mask'] = mask
--> 556         return self.function(x, **arguments)
    557
    558     def get_config(self):

/home/prsahu/Downloads/courses-master/deeplearning1/nbs/vgg16.pyc in vgg_preprocess(x)
     21 def vgg_preprocess(x):
     22     x = x - vgg_mean
---> 23     return x[:, ::-1]    # reverse axis rgb->bgr
     24
     25

/home/prsahu/anaconda2/envs/fastAI/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.pyc in _SliceHelper(tensor, slice_spec)
    305       if s.step not in (None, 1):
    306         raise NotImplementedError(
--> 307             "Steps other than 1 are not currently supported")
    308       start = s.start if s.start is not None else 0
    309       if start < 0:

NotImplementedError: Steps other than 1 are not currently supported
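
Worth noting: the bottom frames of this traceback come from tensorflow/python/ops/array_ops.pyc, which suggests Keras is running on the TensorFlow backend here rather than Theano, and TensorFlow's slice helper at that time did not support negative steps like the ::-1 in vgg_preprocess. A sketch of the backend setting to check, assuming the standard ~/.keras/keras.json config file:

{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}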

Why is the batch_size multiplied by 2 for the validation samples? I tried replicating the code and not multiplying by 2 and it works fine, but I can’t think of a reason to do that right now.

"Because it doesn't need backprop, so needs less memory." - Jeremy

Increasing the batch size during training requires a lot of memory, because activations have to be kept around for backprop; during validation there is no backward pass, so we can validate more images with the same memory.

No, how to use Keras directly is explained in the second week video.

This is a bug in Keras: it prints invalid characters for the progress bar. @jeremy

https://github.com/fchollet/keras/issues/5906

To fix this temporarily, change the verbosity level of the fit_generator call in vgg16.py so that it just logs the accuracy data once per epoch (verbose=2).

Change

def fit(self, batches, val_batches, nb_epoch=1):
    self.model.fit_generator(batches, samples_per_epoch=batches.nb_sample, nb_epoch=nb_epoch,
            validation_data=val_batches, nb_val_samples=val_batches.nb_sample)

To

def fit(self, batches, val_batches, nb_epoch=1):
    self.model.fit_generator(batches, samples_per_epoch=batches.nb_sample, nb_epoch=nb_epoch,
            validation_data=val_batches, nb_val_samples=val_batches.nb_sample, verbose=2)
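
(For reference, verbose=0 is silent, verbose=1 shows the live progress bar, and verbose=2 prints one log line per epoch.)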

Thank you very much for your answer!!

A nice thing about this VGG model for classification is that it can be applied to any kind of classification problem, as long as you can turn the data into images. I applied it to some business data, converting aggregated table data (each row) to an image. It worked perfectly for classification and outperformed standard ML models.


Whoa!!! But how?

Isn't the vgg model trained to recognize shapes in an image?

@jeremy, can the VGG model be used for transfer learning on non-image data?

You should be able to find some local relationships in your data; these models are sensitive to local positions/relationships. A simple approach: put table columns with close relationships (meaning, date) next to each other… I did some pre-processing to change the aggregated table data into images, which should be done wisely…


So I have to engineer the feature columns to have spatial relations matching their actual relationships. Sounds intriguing.

Is it really better than other approaches though? And is there any paper relevant to this type of transfer learning?

And finally, how did you encode your data to an image?

I am trying a couple of other methods to compare my results; then I can give you a clearer answer.
Regarding the data that I have, I have not found a paper yet, and I may submit my final results to a conference if I get good comparison results between methods. @jeremy, have you seen anyone apply this approach?
The last question is a good one. How you order your table columns depends very much on your data… You can convert a table row to a matrix and scale the numbers (which are usually already normalized to be between 0 and 1) to be between 0 and 255, then convert that matrix to an image. The conversion is very straightforward in Python.
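
For instance, a minimal sketch of that kind of conversion in Python (the 64-feature row, the 8x8 layout, and the file name are all made up for illustration):

import numpy as np
from PIL import Image

# Hypothetical example: one table row of 64 features, already normalized to [0, 1].
row = np.random.rand(64)

# Reshape the row into a square matrix and rescale [0, 1] -> [0, 255] pixel values.
pixels = (row.reshape(8, 8) * 255).astype(np.uint8)

# Convert the matrix to a grayscale image; upscale if the model expects 224x224 input.
Image.fromarray(pixels, mode='L').resize((224, 224)).save('row_as_image.png')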


No! Sounds intriguing. So you color-coded the cells of the table based on the magnitude of the data? I'd love to see a thorough comparison. We'll be looking at structured data in a couple of weeks, so maybe you'll have some more ideas to compare to then.


I'm having trouble at a very early step in the video.

I've downloaded Cygwin and ran 'pip install awscli'. Everything is fine up to this point. Then I type in 'aws' and it says:

C:\Users\Jtpet\Anaconda2\python.exe: can't open file '/cygdrive/c/Users/Jtpet/Anaconda2/Scripts/aws': [Errno 2] No such file or directory

I found this on Stack Overflow and tried the first answer. But when I type 'pip uninstall awscli' it just sort of freezes; I never get a prompt ($) again and have to force-cancel it. I'm not sure what to do at this point. I uninstalled and reinstalled Cygwin, but still no luck. I've edited my environment variables and added this:

C:\Users\home\Jtpet\Anaconda2\Scripts

but still no luck. Python.exe is found here:

C:\Users\Jtpet\Anaconda2

and AWS is here:

C:\Users\Jtpet\Anaconda2\Scripts

What am I missing?

@JTpet, you shouldn't be using pip from Anaconda. Install pip in Cygwin, then install awscli.

Thanks for the response @Manoj. I haven't done anything in Anaconda yet; this is all in Cygwin. I followed your advice and found a relevant answer on Stack Overflow. Following the advice there, I tried to figure out what version of Python I have (it is my understanding that Python is installed when you download Cygwin?). I typed 'python' into Cygwin, and nothing happened. I cancelled that, but am still at a loss as to what to do. How do I find out whether I have Python 3 or Python 2, and which specific version? I think I have to know before I download the appropriate pip.

edit:

Jtpet@DESKTOP-DQS100P ~
$ python3 -m ensurepip
-bash: python3: command not found

Still no luck.

edit2: For posterity: you have to type "python aws".

Thanks Carlos for your help; I was able to sort of get it to work.
But I was very exhausted by the whole business and decided to give up on this course for a while and watch some other online courses.
Unfortunately, this didn't help at all, and when I came back and re-read the lesson 1 notebook today I was even more confused by various points he made, which seemed to contradict or differ from what I understood from other sources. I tried watching and reading lesson 2 but got stuck there too on some basic things which everyone seems to understand except me. What exactly is a validation set needed for if we don't actually seem to use it, and if it's used internally, how exactly is it used? What exactly do we do with the sample set: how do we know the result from the sample is any good, what exactly is the criterion, and why isn't it used in lesson 2 after he creates it there? Why does the dataset from Kaggle already contain training and test sets (I expected it to contain only labelled data), and even if it does, why doesn't it contain a validation set? This is all very confusing, and there seems to be no clear-cut algorithm which explains the inputs and outputs of each stage and the complete rationale behind each one. I must be missing something very basic here; I think I might give up on this course…

Since my dataset doesn't look anything like any of the classes in ImageNet, I wanted to train the network from scratch on it.
I changed layer.trainable=False to True. Is that all I have to do?
Because the accuracy gets stuck at 0.50; it's like the model isn't learning anything. What could be wrong?
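
For reference, here is a minimal sketch of fully unfreezing a Keras model of this vintage, assuming model is the underlying Sequential model; note that changes to layer.trainable only take effect after the model is compiled again:

for layer in model.layers:
    layer.trainable = True

# trainable flags are only picked up at compile time, so re-compile after changing them.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])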

Hi there! I can't submit to the dogs-vs-cats competition; the kg CLI fails with 'NoneType' object has no attribute 'find'. I'm pretty sure I configured it correctly, because I'm able to download the data through kg download. There's no active 'Submit' button on the Kaggle website either. Any advice and/or confirmation?

I already reported this as a bug. In the meantime, there is a submit button on the Kaggle website if you are logged in, so you just have to download the data to your laptop and submit from there.

Thanks for your reply! But I've just checked, and there's no submit button for the dogs-vs-cats competition. So I skipped that one and instead did dogs-vs-cats-redux-kernels-edition, which is similar; I guess only the evaluation is different (accuracy vs. log loss). For that one there's a button on the website, and the kg CLI worked for me as well.