Lesson 2 discussion

complancoder · January 2, 2017, 9:55pm

@jeremy Thanks for the pointers.

I made a notebook that shows the visualizations after each convolution step. If anyone wants to see it, please check out the notebook in my GitHub repo. The images become too small at around the 20th step, and it becomes difficult to figure out what exactly the network is doing.

It would be nice if somebody could help me figure out which images maximally activate a given neuron.

aha · January 3, 2017, 1:10am

That’s really cool!

jeremy · January 4, 2017, 2:44am

@complancoder very nice

You all may be interested in following this tutorial: https://github.com/Lasagne/Recipes/blob/master/examples/Saliency%20Maps%20and%20Guided%20Backpropagation.ipynb

gmedasani · January 5, 2017, 4:38am

Hi, as I was watching the Lesson 2 videos, @jeremy mentioned that precomputing the features from the convolutional layers prevents us from doing data augmentation. A similar note appears in the Keras blog post.

https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

Can someone explain why this is the case?

Best Regards,
Guru Medasani

jeremy · January 5, 2017, 4:39am

With data augmentation, every image is randomly changed in every batch. So the features are different in every batch. So you can’t usefully precompute something that always changes!

gmedasani · January 5, 2017, 4:49am

Thanks Jeremy. So is this limitation specific to the data augmentation done by ImageDataGenerator in Keras? When I think of data augmentation, it can sometimes also be done prior to training, right?

If I were given 1000 images, I could apply some image pre-processing such as shifting, rotating, rescaling, and zooming to create, say, an additional 4000 images. This would bring my training set to 5000 images, which I could then use to precompute the convolutional layers. If I have some process that can do this, will it cause any problems with precomputing the convolutional layers?

jeremy · January 5, 2017, 6:01pm

Yup you can do that - and indeed in the course I show how to do that with keras. However it’s not as good as having truly random augmentation, since you’ve limited your dataset size.
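To make the trade-off concrete, here is a toy sketch in plain Python (not Keras). The `augment` and `conv_features` functions are hypothetical stand-ins for a real augmentation step and for the frozen convolutional layers. It only illustrates that features cached from a fixed augmented set cover a limited number of variants, while random augmentation can produce a new input, and therefore new features, on every pass.

```python
import random

def augment(image, rng):
    """Toy augmentation (assumption): circularly shift a 1-D 'image'."""
    shift = rng.randrange(len(image))
    return image[shift:] + image[:shift]

def conv_features(image):
    """Hypothetical stand-in for the frozen convolutional layers."""
    return tuple(a + b for a, b in zip(image, image[1:]))

rng = random.Random(0)
images = [[1, 2, 3, 4, 5, 6, 7, 8], [8, 6, 4, 2, 1, 3, 5, 7]]

# Offline augmentation: generate 4 fixed variants per image once,
# then precompute and reuse their features for every epoch.
offline = [augment(img, rng) for img in images for _ in range(4)]
cached = [conv_features(x) for x in offline]

# Online augmentation: a fresh random variant per image per epoch,
# so features must be recomputed and cannot be cached in advance.
distinct_online = set()
for epoch in range(50):
    for img in images:
        distinct_online.add(conv_features(augment(img, rng)))

print(len(set(cached)), "distinct cached feature vectors")
print(len(distinct_online), "distinct feature vectors seen online")
```

The cached set can never hold more than the 8 variants generated up front, whereas the online loop keeps sampling from every possible variant.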

gmedasani · January 5, 2017, 8:07pm

Got it. Thanks for clarifying.

VishnuSubramanian · January 6, 2017, 12:11am

Hi, I was going through the notes for Lesson 2. I am trying to understand how the Python implementation of log_loss works at http://wiki.fast.ai/index.php/Log_Loss. I tested the second line of code, p = max(min(predicted, eps), eps), and it always returns the value of eps. The log loss returned by this function also stays constant for a given actual value (1 or 0), regardless of the prediction. Am I missing something?

rachel · January 6, 2017, 3:48pm

@VishnuSubramanian Good catch, that is a typo. The formula should be p = max(min(predicted, 1 - eps), eps). This line is to clip the probability (since log-loss is undefined for p=0,1)

In practice we use the numpy method np.clip, so I’ve updated the wiki to use p = np.clip(predicted, eps, 1 - eps) instead
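As a minimal sketch of the difference, the snippet below puts the typo and the corrected clipping side by side in plain Python. The function names and the default `eps=1e-15` are assumptions for illustration and are not copied from the wiki. For scalars, the corrected line does the same thing as `np.clip(predicted, eps, 1 - eps)`.

```python
import math

def log_loss_buggy(actual, predicted, eps=1e-15):
    # Typo version: min(predicted, eps) can never exceed eps,
    # so max(..., eps) always returns eps and the prediction is ignored.
    p = max(min(predicted, eps), eps)
    return -(actual * math.log(p) + (1 - actual) * math.log(1 - p))

def log_loss(actual, predicted, eps=1e-15):
    # Corrected version: clip the prediction to [eps, 1 - eps]
    # so that log(0) is never evaluated.
    p = max(min(predicted, 1 - eps), eps)
    return -(actual * math.log(p) + (1 - actual) * math.log(1 - p))

# The buggy version returns the same loss for any prediction.
print(log_loss_buggy(1, 0.9) == log_loss_buggy(1, 0.1))

# The corrected version rewards the better prediction with a lower loss.
print(log_loss(1, 0.9), log_loss(1, 0.1))

# Extreme predictions stay finite thanks to the clipping.
print(log_loss(1, 0.0), log_loss(0, 1.0))
```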

VishnuSubramanian · January 7, 2017, 5:34am

Thank you @rachel

Jonas · January 7, 2017, 7:08pm

I don’t know if it was already mentioned elsewhere, but there are YouTube videos corresponding to the CS231n lectures (this week’s reading) for those who, like me, prefer listening:
CS231n Winter 2016 Lecture 1 Introduction and Historical Context
CS231n Winter 2016 Lecture 3 Linear Classification 2, Optimization
CS231n Winter 2016 Lecture 4 Backpropagation, Neural Networks

Update: and here are the syllabus and slides:
CS231n: Convolutional Neural Networks for Visual Recognition

kelin-christi · January 8, 2017, 6:30pm

Hi @jeremy, I don’t know what is causing this error, but lately when I am trying to run fit_generator, I am getting a weirdly formatted summary. This is what it looks like.

This is me running lm.fit_generator() on the MNIST dataset for the linear model.

I’d appreciate the help!

Jonas · January 8, 2017, 8:48pm

Hi @kelin-christi, I am not an expert, but judging from your screenshot, it looks to me like an issue with your Jupyter notebook configuration. I googled “jupyter notebook line wrap”, and it seems you need to change your configuration file “custom.js”, but the posts don’t make clear where this file is located; I found several versions on my system. Does this only happen with the “fit_generator” method, or is it a general problem with the output? Regards, Jonas

kelin-christi · January 8, 2017, 9:05pm

Jonas, thanks a lot for your reply. I figured it was a Jupyter issue, so I just started a new p2 instance for the time being, and everything is running smoothly. I will, however, try to figure out what the issue with my previous instance was by tinkering with some Jupyter settings and updating files. Thanks again.

chaseos · January 8, 2017, 10:03pm

I’ve been working through the lecture and the dogs_cats_redux notebook from the git repo, and I am having trouble getting the vgg16 model to improve on subsequent epochs. It appears to me that it is “starting over” on each epoch instead of building upon the weights learned in the previous epoch. I have tried reducing the learning rate as suggested by Jeremy. Any thoughts?

The image shows my second and third epochs. The training loss and accuracy were similar for the first epoch (ft0.h5) as well. I’m using an NC6 instance on Azure.

jeremy · January 9, 2017, 9:34pm

Try a lower learning rate.

chaseos · January 10, 2017, 5:43am

0.0001 did the trick! Thanks!
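A toy illustration of why the learning rate matters so much: the sketch below runs plain gradient descent on f(w) = w², not on VGG16, and the two rates are chosen only to make the effect visible. With too large a step the weight overshoots the minimum and gets worse on every update; with a smaller step it settles steadily.

```python
def gradient_descent(lr, w=1.0, steps=20):
    """Minimise f(w) = w**2 with a fixed learning rate (toy example)."""
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w = w - lr * grad   # each step multiplies w by (1 - 2 * lr)
    return w

w_high = gradient_descent(lr=1.1)  # factor -1.2 per step: |w| keeps growing
w_low = gradient_descent(lr=0.1)   # factor 0.8 per step: w shrinks toward 0

print("lr=1.1 ->", w_high)
print("lr=0.1 ->", w_low)
```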

Jonas · January 10, 2017, 1:01pm

Here is another tip for the Lesson 2 readings. Backpropagation is not so easy to grasp. I find Nielsen’s explanation good, but in my opinion Andrew Ng explains it better in his Machine Learning class. So I just want to recommend the following videos:
Ng: Machine learning - Backpropagation Algorithm
Ng: Machine learning - Backpropagation Intuition

cmeff1 · January 10, 2017, 3:00pm

I’ve been working through the notebook for Lesson 2. I ran into memory errors when running the lines that call get_data. As suggested above, I switched to get_batches. But when I then run trn_data.shape, I get the following error message: AttributeError: 'DirectoryIterator' object has no attribute 'shape'. So I checked the get_data function: it calls get_batches, and its return statement appears to concatenate all of the batches. That’s the line I believe is causing my memory issues. Below is the code I used to experiment and isolate what I believe is the issue. Has anybody else had that error, or does anyone have suggestions for how to overcome it? Thank you.

PS: I’m running this on my own machine, with an NVIDIA GTX 1080 (8 GB).

val_data = get_batches(path+'valid', shuffle=False, batch_size=1, class_mode=None, target_size=(224,224)) #this runs fine.

val_data = np.concatenate([val_data_a.next() for i in range(val_data_a.nb_sample)]) #this runs fine small amount of images to process

Found 2000 images belonging to 2 classes.

trn_data = get_batches(path+'train', shuffle=False, batch_size=1, class_mode=None, target_size=(224,224)) # this runs fine

#trn_data = np.concatenate([trn_data_a.next() for i in range(trn_data_a.nb_sample)]) # this blows out my memory

Found 23000 images belonging to 2 classes.

trn_data.shape

AttributeError Traceback (most recent call last)
in ()
----> 1 trn_data.shape

AttributeError: 'DirectoryIterator' object has no attribute 'shape'
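
A rough back-of-the-envelope estimate may help explain both symptoms. The object returned by get_batches is an iterator that yields batches, not an array, so it has no shape until the batches are concatenated. Concatenating all of them builds one array holding every image at once. The sketch below estimates that array's size, assuming 3 colour channels and 4 bytes per value (float32). Both are assumptions, since the post does not state the dtype or channel count.

```python
def array_bytes(n_images, height=224, width=224, channels=3, bytes_per_value=4):
    """Size in bytes of one dense array holding all images (assumed float32 RGB)."""
    return n_images * height * width * channels * bytes_per_value

train_bytes = array_bytes(23000)  # training set size reported in the post
valid_bytes = array_bytes(2000)   # validation set size reported in the post

print("train: %.1f GiB" % (train_bytes / 2**30))
print("valid: %.1f GiB" % (valid_bytes / 2**30))
```

Under these assumptions the training array needs nearly 13 GiB in a single allocation, against roughly 1 GiB for the validation set, which is consistent with the validation concatenation succeeding and the training one running out of memory.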