Lesson 1 discussion

In lesson 1 there was a quick mention of stochastic gradient descent, and a link to a Stanford page talking about it. I’ve done my best to understand and summarise the key points on the wiki here - feel free to add to, update, or correct the notes. I tried to keep the language pretty plain though - go to the source for more detail!

Thanks @tom - that’s a great idea. I think you can make your page even better by adding some simpler introductory information early on. Rather than talk about how it’s different to other optimization techniques up front, since a lot of folks won’t be familiar with those other techniques, how about trying to jot down a simple explanation first of how SGD works? That would be a great test of your understanding.

For instance, you could refer to, and borrow from, the SGD intro notebook that I showed in class. If you can explain what this notebook is doing, and why, then you’ll have a nice clear explanation of SGD, I think. How does that sound? Let me know if I can help.

Also, note that this notebook was used in lesson 2, so the lesson 2 video may be helpful here.

I’m working on predictions for the Kaggle competition, and I’m having a problem running the test images through the prediction step.

At first I was using the new test() method from vgg16.py, but I was getting a ZeroDivision error. I recognized that the test() function basically wraps get_batches() and predict_generator() together, and when I tested get_batches() alone, it didn’t see any of the images and didn’t generate filenames. I’m pretty sure I’m pointing the function to the directory properly, and I can ls into “unknown” and see the files.

The steps I took to try to run predictions and troubleshoot get_batches() were reading the vgg16.py source code and the Keras documentation for .flow_from_directory() and predict_generator().

I’m pretty sure I’m pointing get_batches() to the correct directory, but is the problem that the test files don’t have a cat/dog flag in their file names? And that I didn’t set up test() with classes or categories to correctly accept these unflagged files?

Any help with troubleshooting would be appreciated! Thanks in advance!

I had a very similar problem and solved it by following along with the redux notebook and seeing how it structures the test directory…

I believe that the directory structure should be like this:
test->unknown->*.jpg ...
and vgg.get_batches('test',....)
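
In case it helps, here is a minimal sketch of getting to that layout, assuming the test images start out sitting directly inside the test folder. The helper name move_into_unknown is made up for illustration (it is not part of vgg16.py), and it only uses the standard library:

```python
import os
import shutil

def move_into_unknown(test_dir, subdir="unknown"):
    """Move every .jpg sitting directly in test_dir into test_dir/subdir.

    flow_from_directory expects one subdirectory per class, so unlabeled
    test images need a single placeholder folder such as 'unknown'.
    Returns the number of files moved.
    """
    target = os.path.join(test_dir, subdir)
    if not os.path.isdir(target):
        os.makedirs(target)
    moved = 0
    for name in sorted(os.listdir(test_dir)):
        src = os.path.join(test_dir, name)
        # Only move image files; leave folders and other files alone.
        if os.path.isfile(src) and name.lower().endswith(".jpg"):
            shutil.move(src, os.path.join(target, name))
            moved += 1
    return moved
```

Calling move_into_unknown('test') once should leave you with test/unknown/*.jpg, which is the structure described above. If your images are already in a subfolder, the function simply moves nothing.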

Thanks @vshets & @jeff, I’ll try that out! Was that info in the Keras docs?

Not sure … most likely slack channels or this thread …

Thanks for the code snippet! Finally got the cell to run :slight_smile:

@melissa.fabros for future reference, the docs on how batches are produced are at https://keras.io/preprocessing/image/ - specifically:

flow_from_directory(directory): Takes the path to a directory, and generates batches of augmented/normalized data. Yields batches indefinitely, in an infinite loop.
Arguments:
directory: path to the target directory. It should contain one subdirectory per class, and the subdirectories should contain PNG or JPG images.

It links to this script to show an example: https://gist.github.com/fchollet/0830affa1f7f19fd47b06d4cf89ed44d

Finally getting around to working with the notebooks. AWS setup was great and I can run Jupyter notebooks well. All files downloaded and I started working on the lesson 1 notebook. I need help debugging a path issue. When I run the code below from the lesson1 notebook, I get a “No such file or directory” error.

# As large as you can, but no larger than 64 is recommended. 
# If you have an older or cheaper GPU, you'll run out of memory, so will have to decrease this.
batch_size=64
# Import our class, and instantiate
from vgg16 import Vgg16
vgg = Vgg16()
# Grab a few images at a time for training and validation.
# NB: They must be in subdirectories named based on their category
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)

The error I get is as below

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-7-2b6861506a11> in <module>()
      2 # Grab a few images at a time for training and validation.
      3 # NB: They must be in subdirectories named based on their category
----> 4 batches = vgg.get_batches(path+'train', batch_size=batch_size)
      5 val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
      6 vgg.finetune(batches)

/home/ubuntu/nbs/vgg16.pyc in get_batches(self, path, gen, shuffle, batch_size, class_mode)
     83     def get_batches(self, path, gen=image.ImageDataGenerator(), shuffle=True, batch_size=8, class_mode='categorical'):
     84         return gen.flow_from_directory(path, target_size=(224,224),
---> 85                 class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)
     86 
     87 

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/preprocessing/image.pyc in flow_from_directory(self, directory, target_size, color_mode, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format)
    288             dim_ordering=self.dim_ordering,
    289             batch_size=batch_size, shuffle=shuffle, seed=seed,
--> 290             save_to_dir=save_to_dir, save_prefix=save_prefix, save_format=save_format)
    291 
    292     def standardize(self, x):

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/preprocessing/image.pyc in __init__(self, directory, image_data_generator, target_size, color_mode, dim_ordering, classes, class_mode, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format)
    553         if not classes:
    554             classes = []
--> 555             for subdir in sorted(os.listdir(directory)):
    556                 if os.path.isdir(os.path.join(directory, subdir)):
    557                     classes.append(subdir)

OSError: [Errno 2] No such file or directory: 'data/dogscats/sampletrain'

I set up the directories exactly as @jeremy did in the lesson 1 video, and all cells before this code chunk worked well. I need some help to debug where this code is failing.

Vijay

You’re missing a trailing ‘/’ on your sample path. I remember having to fix this in class.

Gosh! Never mind. My original path was set to data/dogscats/sample instead of data/dogscats/, and yes, the trailing slash is missing. Sorry, I should have noticed it earlier.
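
For anyone who hits the same thing: the notebook builds paths with plain string concatenation (path+'train'), so a missing trailing slash glues the two names together, which is exactly the 'data/dogscats/sampletrain' in the error above. A small sketch of one way around it, using os.path.join from the standard library (the helper name subset_path is just for illustration):

```python
import os

def subset_path(base, name):
    """Join base and name with a separator, whether or not base
    already ends in a slash."""
    return os.path.join(base, name)

# Plain concatenation silently glues the names together:
broken = "data/dogscats/sample" + "train"

# os.path.join points at the same folder with or without the trailing slash:
with_slash = subset_path("data/dogscats/sample/", "train")
without_slash = subset_path("data/dogscats/sample", "train")
```

This is only a suggestion; the lesson notebook itself keeps using path+'train', so adding the trailing slash to path works just as well there.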

@jeremy I revisited the first lesson video to understand the usage and purpose of sample, but I am still not sure.

What I understood is that we can run the training on a sample of a few images first to see how accurate our model is before we move on to the large set of images.

However, it doesn’t look like a necessary step, and I don’t understand how doing it would make my model better.
I also don’t see where in our lesson1 notebook we are actually running it over the sample data set.

Just a note to help anyone else that gets stuck on the same issue as me: I’m trying to set up Keras/Theano on my local machine (MacBook Air) to do local development, before setting up a remote machine to crunch the numbers. I was getting an error when the model was being created (full stack is below):

Exception: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

I’d installed Keras (pip install keras), and told it to use Theano by changing the “backend” line in ~/.keras/keras.json to “theano” instead of “tensorflow”. I didn’t realise you also have to change the “image_dim_ordering” line (from “tf” to “th”):

{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}
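
If you would rather not edit the file by hand, here is a rough sketch that sets both keys at once and leaves the other settings alone. The key names are the ones from my keras.json above; the function name use_theano is just something I made up, and newer Keras versions may use different key names, so check your own file first:

```python
import json
import os

def use_theano(config_path):
    """Set both Theano-related keys in a keras.json like the one above,
    leaving every other setting untouched. Returns the updated dict."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    # Both keys have to change together, otherwise the model shapes
    # no longer match the backend.
    config["backend"] = "theano"
    config["image_dim_ordering"] = "th"
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config

# Example (path taken from the post above):
# use_theano(os.path.expanduser("~/.keras/keras.json"))
```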

Full error & stack:

(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)
(Subtensor{int64}.0, Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0, Subtensor{int64}.0)


Exception Traceback (most recent call last)
in ()
----> 1 vgg = Vgg16()
      2 # Grab a few images at a time for training and validation.
      3 # NB: They must be in subdirectories named based on their category
      4 batches = vgg.get_batches(path+'train', batch_size=batch_size)
      5 val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)

/Users/tom/Documents/fast.ai/lesson1/vgg16.py in __init__(self)
     49 print("Using " + str(K.backend()) + " backend")
     50 # check_processor()
---> 51 self.create()
     52 self.get_classes()
     53

/Users/tom/Documents/fast.ai/lesson1/vgg16.py in create(self)
     97 self.ConvBlock(3, 512)
     98
---> 99 model.add(Flatten())
    100 self.FCBlock()
    101 self.FCBlock()

/Users/tom/.virtualenvs/default/lib/python2.7/site-packages/keras/models.pyc in add(self, layer)
    310 output_shapes=[self.outputs[0]._keras_shape])
    311 else:
--> 312 output_tensor = layer(self.outputs[0])
    313 if type(output_tensor) is list:
    314 raise Exception('All layers in a Sequential model '

/Users/tom/.virtualenvs/default/lib/python2.7/site-packages/keras/engine/topology.pyc in __call__(self, x, mask)
    512 if inbound_layers:
    513 # this will call layer.build() if necessary
--> 514 self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
    515 input_added = True
    516

/Users/tom/.virtualenvs/default/lib/python2.7/site-packages/keras/engine/topology.pyc in add_inbound_node(self, inbound_layers, node_indices, tensor_indices)
    570 # creating the node automatically updates self.inbound_nodes
    571 # as well as outbound_nodes on inbound layers.
--> 572 Node.create_node(self, inbound_layers, node_indices, tensor_indices)
    573
    574 def get_output_shape_for(self, input_shape):

/Users/tom/.virtualenvs/default/lib/python2.7/site-packages/keras/engine/topology.pyc in create_node(cls, outbound_layer, inbound_layers, node_indices, tensor_indices)
    150 output_masks = to_list(outbound_layer.compute_mask(input_tensors[0], input_masks[0]))
    151 # TODO: try to auto-infer shape if exception is raised by get_output_shape_for
--> 152 output_shapes = to_list(outbound_layer.get_output_shape_for(input_shapes[0]))
    153 else:
    154 output_tensors = to_list(outbound_layer.call(input_tensors, mask=input_masks))

/Users/tom/.virtualenvs/default/lib/python2.7/site-packages/keras/layers/core.pyc in get_output_shape_for(self, input_shape)
    400 raise Exception('The shape of the input to "Flatten" '
    401 'is not fully defined '
--> 402 '(got ' + str(input_shape[1:]) + '. '
    403 'Make sure to pass a complete "input_shape" '
    404 'or "batch_input_shape" argument to the first '

Exception: The shape of the input to "Flatten" is not fully defined (got (0, 7, 512). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

With some very large datasets, including this one from Lesson 1, it takes a very long time to train and fine-tune a model on the full data set. In our case the 16-layer model is the main reason for the long processing times.
One strategy is to take a subset of the original data set (say 1% - 10%) and build and fine-tune models on it. Once you are satisfied with the metric results on the sample data, you can take the final model architecture (with its customized hyperparameters) and the pipeline used to preprocess the sample data, and apply them to the larger data set.

Yes, the step is optional, but it is generally good data science practice to work on sample data sets first while iterating through hyperparameter selection and data preprocessing.

As shown below, you have the option to switch to the sample data or not.

path = "data/dogscats/"
#path = "data/dogscats/sample/"
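
If you don't have a sample folder yet, here is a rough sketch of one way to build it, assuming the usual layout of one subfolder per class. The function name make_sample, the 5% default and the fixed seed are my own choices for illustration, not something taken from the lesson notebook:

```python
import os
import random
import shutil

def make_sample(src_dir, dst_dir, fraction=0.05, seed=0):
    """Copy a random fraction of the files in each class subfolder of
    src_dir into the matching subfolder of dst_dir.
    Returns a dict mapping class name -> number of files copied."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    copied = {}
    for cls in sorted(os.listdir(src_dir)):
        cls_src = os.path.join(src_dir, cls)
        if not os.path.isdir(cls_src):
            continue
        files = sorted(f for f in os.listdir(cls_src)
                       if os.path.isfile(os.path.join(cls_src, f)))
        # Keep at least one file per non-empty class.
        k = max(1, int(len(files) * fraction)) if files else 0
        cls_dst = os.path.join(dst_dir, cls)
        if not os.path.isdir(cls_dst):
            os.makedirs(cls_dst)
        for name in rng.sample(files, k):
            shutil.copyfile(os.path.join(cls_src, name),
                            os.path.join(cls_dst, name))
        copied[cls] = k
    return copied
```

For example, make_sample('data/dogscats/train', 'data/dogscats/sample/train') followed by the same call for valid would give you a small copy to iterate on, and switching the path line above then selects it.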

Awesome … can you please write a wiki page on the installation for MacBooks, as that is the only one missing so far from the list :slight_smile: http://wiki.fast.ai/index.php/Installation

understood… thank you!

Hey guys,
I’m still on Redux, and I seem to be getting the following error when I train:

Found 25000 images belonging to 2 classes.
Found 1998 images belonging to 3 classes.
Epoch 1/1
24960/25000 [============================>.] - ETA: 0s - loss: 2.6017 - acc: 0.8367
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-13-85f922dc948c> in <module>()
      8 val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
      9 vgg.finetune(batches)
---> 10 vgg.fit(batches, val_batches, nb_epoch=1)

/home/ubuntu/nbs/vgg16.pyc in fit(self, batches, val_batches, nb_epoch)
     97     def fit(self, batches, val_batches, nb_epoch=1):
     98         self.model.fit_generator(batches, samples_per_epoch=batches.nb_sample, nb_epoch=nb_epoch,
---> 99                 validation_data=val_batches, nb_val_samples=val_batches.nb_sample)
    100 

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/models.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, nb_worker, pickle_safe, **kwargs)
    872                                         max_q_size=max_q_size,
    873                                         nb_worker=nb_worker,
--> 874                                         pickle_safe=pickle_safe)
    875 
    876     def evaluate_generator(self, generator, val_samples, max_q_size=10, nb_worker=1, pickle_safe=False, **kwargs):

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in fit_generator(self, generator, samples_per_epoch, nb_epoch, verbose, callbacks, validation_data, nb_val_samples, class_weight, max_q_size, nb_worker, pickle_safe)
   1469                         val_outs = self.evaluate_generator(validation_data,
   1470                                                            nb_val_samples,
-> 1471                                                            max_q_size=max_q_size)
   1472                     else:
   1473                         # no need for try/except because

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in evaluate_generator(self, generator, val_samples, max_q_size, nb_worker, pickle_safe)
   1552                                 'or (x, y). Found: ' + str(generator_output))
   1553             try:
-> 1554                 outs = self.test_on_batch(x, y, sample_weight=sample_weight)
   1555             except:
   1556                 _stop.set()

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in test_on_batch(self, x, y, sample_weight)
   1251         x, y, sample_weights = self._standardize_user_data(x, y,
   1252                                                            sample_weight=sample_weight,
-> 1253                                                            check_batch_dim=True)
   1254         if self.uses_learning_phase and type(K.learning_phase()) is not int:
   1255             ins = x + y + sample_weights + [0.]

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in _standardize_user_data(self, x, y, sample_weight, class_weight, check_batch_dim, batch_size)
    963                                    output_shapes,
    964                                    check_batch_dim=False,
--> 965                                    exception_prefix='model target')
    966         sample_weights = standardize_sample_weights(sample_weight,
    967                                                     self.output_names)

/home/ubuntu/anaconda2/lib/python2.7/site-packages/keras/engine/training.pyc in standardize_input_data(data, names, shapes, check_batch_dim, exception_prefix)
    106                                         ' to have shape ' + str(shapes[i]) +
    107                                         ' but got array with shape ' +
--> 108                                         str(array.shape))
    109     return arrays
    110 

Exception: Error when checking model target: expected dense_20 to have shape (None, 2) but got array with shape (128, 3)

The lines that produce this error are as follows:
batch_size=64

from vgg16 import Vgg16
vgg = Vgg16()

batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)

Personally, I have no idea what’s wrong, so I haven’t done anything to fix it. My data looks good. Under the “data/redux/” path, I have my valid folder with separate cats/dogs folders, each having 1000 images of the corresponding animal. Under my “Sample” folder, I have valid and train folders, each having a separate folder for cats/dogs. Animal folders under train have 51 photos of each, and animal folders under valid have 200 photos of each. Hopefully you can visualize that :laughing:

I am on Mac. I have a p2 instance.
I made my own Python notebook and wrote some code with the info from Lesson 1’s notebook. I compared it with Jeremy’s and it looked pretty good. NOTE: I never did the “Create validation set and sample” and “Move to separate dirs for each set” steps in the notebook. Instead, I did it all through the terminal, which was quite a pain.

I have copied the code from “Run a few more epochs” (Jeremy’s code) and put it under my own Finetune and train, which seems to be the same as his.

Another thing to note is that it says “Found 25000 images belonging to 2 classes.
Found 1998 images belonging to 3 classes.” I have no idea why there are 3 classes. In my “data/redux/valid/” folder, there are 2 folders: cats and dogs, each with 1000 photos of the corresponding animal.

I have not yet implemented any code that deals with submitting.
Also, I can copy-and-paste my full code in if needed. But really, the stuff above is the main bulk of it.

Thanks!
Ethan

Hmm … the 3 classes in the valid folder might be the problem. Can you run ls -l on the valid folder and paste the output here?
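
If it's easier to check from the notebook, here is a small sketch that lists what would be counted as classes. It mirrors the listing loop visible in the Keras traceback earlier in this thread (every subdirectory, sorted, becomes a class), so a stray or hidden folder would show up as a third entry. The function name class_dirs is only for illustration:

```python
import os

def class_dirs(directory):
    """Return (subdirectory, number of entries) pairs for every
    subdirectory of `directory`, sorted by name. Hidden folders are
    included, because os.listdir returns them too."""
    result = []
    for subdir in sorted(os.listdir(directory)):
        full = os.path.join(directory, subdir)
        if os.path.isdir(full):
            result.append((subdir, len(os.listdir(full))))
    return result

# Example, assuming the path variable from the notebook:
# print(class_dirs(path + 'valid'))
```

If that prints anything besides cats and dogs, the extra folder is the likely source of the third class.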