Can you post the link, please? Note that I am not asking about the Lessons (which I know are also on YouTube). I am asking about the Deep Learning I and II lectures on the usfa web site. Thanks.
The fit_model code looks like this on GitHub:
def fit_model(model, batches, val_batches, nb_epoch=1):
    model.fit_generator(batches, samples_per_epoch=batches.N, nb_epoch=nb_epoch,
What is batches.N? It does not appear to be defined anywhere. I saw some of you mention batches.nb_sample in this forum, but I did not understand that reference either. Is this a typo, or am I missing something? Should that value be the total number of samples in train and validation?
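For what it's worth, the attribute name seems to depend on the Keras version: as far as I recall, the Keras 1.x iterators expose the sample count as N (directory iterators also as nb_sample), while Keras 2 uses n and samples. Either way, it counts only the images behind that one generator, so for the training batches it is the training-set size, not train plus validation. Below is a minimal sketch of a helper that tries each name in turn. The names sample_count and FakeBatches are made up for illustration and are not part of Keras or the course code.

```python
def sample_count(batches):
    """Return the number of samples behind a Keras-style batch iterator.

    Tries the attribute names used by different Keras versions
    (an assumption based on memory, so check your installed version).
    """
    for name in ("samples", "n", "nb_sample", "N"):
        value = getattr(batches, name, None)
        if isinstance(value, int):
            return value
    raise AttributeError("no known sample-count attribute on %r" % (batches,))


class FakeBatches(object):
    """Hypothetical stand-in for a Keras 1.x iterator, for demonstration only."""

    def __init__(self, n_samples):
        self.N = n_samples


print(sample_count(FakeBatches(23000)))
```

With a real generator, you would pass sample_count(batches) as samples_per_epoch and sample_count(val_batches) for the validation count.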
Also, I am getting strange results when I run the fit_model(model, batches, val_batches, nb_epoch=3) line. Sometimes the validation accuracy becomes 1, and sometimes it stays fixed at 0.5. I tried changing batch_size and the learning rate, but I do not get a stable model in any case.
While writing the VGG model from scratch in Keras, I also had a go at writing all the other utility functions myself (such as loading image files). I used Keras’ ImageDataGenerator, which is what is used in the course as well. The method takes a lot of inputs, one of which is target_size. Since I have not worked with images much before, can anyone give me a few rules of thumb on what would be a good target size?
In the VGG model utility class, we use a target_size of 224x224. Is this just an arbitrary number, or is there a reason behind it? In other words, why not pick 256x256?
I don’t think the numbers are arbitrary. The size is chosen to fit the layers, so that you do not have to adjust the stride and padding sizes, and of course it also reflects performance and memory constraints.
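One concrete way to see this, as a rough sketch: VGG16 has five 2x2 max-pooling layers, each halving the spatial size, and its last convolutional block has 512 channels. With a 224x224 input, the flattened feature vector has 7 * 7 * 512 = 25088 values, which is what the pretrained first dense layer expects. The helper name vgg_flatten_size is made up for illustration.

```python
def vgg_flatten_size(side, n_pools=5, channels=512):
    """Size of the flattened feature vector after n_pools 2x2 max-pools."""
    for _ in range(n_pools):
        side //= 2  # each 2x2 max-pool halves the width and height
    return side * side * channels


print(vgg_flatten_size(224))  # 25088
print(vgg_flatten_size(256))  # 32768
```

A 256x256 input would produce 32768 values instead, so the pretrained dense weights would no longer fit unless those layers were replaced and retrained.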
Running sgd-intro.ipynb on my Mac returned an error when generating the animation. I found that this can be fixed by installing ffmpeg with brew (apt-get is not used on the Mac), as follows:
$ brew install ffmpeg
Then restart Jupyter Notebook.
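A quick way to check from inside the notebook whether ffmpeg is visible after the restart, sketched with the standard library only and assuming Python 3. The helper name ffmpeg_available is hypothetical.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


print("ffmpeg found:", ffmpeg_available())
```

If this prints False inside the notebook but ffmpeg works in a terminal, the notebook server was probably started before the install or with a different PATH.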
I have noticed that a package can exist in two flavours in the same environment, depending on whether it was installed with pip or with conda (e.g. conda install -c conda-forge tensorflow). Issues then arise when importing and running it. So I removed all the duplicated packages and reinstalled them using just one installer.
I ran the notebook sequentially and it died on the load_array call, with the Python process using about 40GB of memory. I wonder if these notebooks could be improved a bit by deleting some variables along the way.
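As a rough sketch of what deleting variables along the way could look like in a notebook cell: keep the small result you need, drop the reference to the large object, and optionally ask the garbage collector to run. The variable names here are placeholders, and the list is only a stand-in for a large array.

```python
import gc

features = [0.0] * 1000000   # stand-in for a large array loaded earlier
n_features = len(features)   # keep only the small result that later cells need

del features                 # drop the reference so the memory can be reclaimed
gc.collect()                 # optionally trigger a collection right away

print(n_features)
```

This only helps if no other variable still refers to the same object, including notebook output history such as Out[...] entries.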
I am working with a small variation of the Dogs Cats Redux notebook. I am trying to print out the log loss to the screen at the end of each execution but I am getting the following:
I would like to be able to output a log loss value for each execution of the code. Any suggestions would be great.
I’ve been attempting to do the cats vs dogs redux from scratch, but keep getting the error:
Error when checking model input: expected dense_input_1 to have shape (None, 3, 224, 224) but got array with shape (64, 224, 224, 3)
Here’s a link to my script, which is a very minimal network that’s not utilizing VGG (at the moment). Even when I try changing the input_shape parameter to (224, 224, 3) or (64, 224, 224, 3), it still doesn’t work (in the first case, it says that the expected shape is (None, 224, 224, 3), and in the second case, it says that the expected input has 5 dimensions but only 4 were received).
Note: I realize that the output layer should probably have only two neurons, but I haven’t gotten that far yet
A full text of my logs on the latest run (including error) is also provided at the link above.
I’m using floydhub with the options
floyd run --gpu --env theano --data SyccinddLDdS7p3vzcwGQ2 'python demo.py'
/input directory contains the unzipped train and test data.
To clarify, I am running multiple iterations at once from the command line. I would like to be able to store the log loss from each iteration as a variable so I can review it after all of the iterations finish running. If anyone has a suggestion, I would really appreciate the guidance.
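One possible approach, sketched under the assumption that all iterations run inside a single Python process: compute the log loss yourself and append it to a list after each run. The predictions below are made up purely for illustration, and log_loss here is a hand-written helper, not a library function.

```python
import math


def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1]                     # made-up true labels
runs = ([0.9, 0.1, 0.8],               # made-up predictions from run 0
        [0.6, 0.4, 0.7])               # made-up predictions from run 1

losses = []                            # one entry per run
for preds in runs:
    losses.append(log_loss(labels, preds))

for i, loss in enumerate(losses):
    print("run %d: log loss %.4f" % (i, loss))
```

If each iteration is a separate invocation from the command line, the list will not survive between runs, so appending each value to a text file instead would be the equivalent.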
My issue is resolved. An answer I received on another forum:
When using fully connected layers, you typically flatten multidimensional arrays into vectors, because by using an FC layer you’re acknowledging that spatial structure doesn’t matter. Keras is probably expecting a 2-d input [num_examples, example_size]. Since spatial structure probably does matter, you might want to use a convolutional layer instead.
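To illustrate what flattening means for the batch in the error message, here is a small NumPy sketch (it assumes NumPy is installed). It reshapes a batch of 64 images of shape (224, 224, 3) into the 2-d [num_examples, example_size] layout described above. In Keras, a Flatten layer placed before the first dense layer does the same thing inside the model.

```python
import numpy as np

# Dummy batch with the same shape as the array in the error message.
batch = np.zeros((64, 224, 224, 3), dtype=np.float32)

# Keep the batch dimension and collapse everything else into one vector.
flat = batch.reshape(batch.shape[0], -1)

print(flat.shape)  # (64, 150528)
```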
I had the same error: ImportError: No module named tensorflow.examples.tutorials.mnist.
This worked for me (Ubuntu 16.04):
install TensorFlow for Anaconda as per Rachel’s instructions
switch to TensorFlow command line:
ezchx@ezchx-DX4300:~/fastai$ source activate tensorflow
install Jupyter Notebook for TensorFlow
(tensorflow) ezchx@ezchx-DX4300:~/fastai$ pip install jupyter
close all running versions of Jupyter Notebook, start a new command line, switch to TensorFlow command line, and open jupyter notebook from there:
(tensorflow) ezchx@ezchx-DX4300:~/fastai$ jupyter notebook
Please note that if you open Jupyter notebook from a standard / non-TensorFlow command line, TensorFlow will not work.
I also had to install matplotlib and scipy into the TensorFlow environment to run convolution-intro-richie.ipynb.
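To confirm which environment a notebook is actually running in, something like the following can be run in a cell. It is a minimal sketch that assumes Python 3 and uses only the standard library. The helper name module_available is made up for illustration.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


# The interpreter path shows which environment the notebook kernel uses.
print("interpreter:", sys.executable)
for name in ("tensorflow", "matplotlib", "scipy"):
    print(name, "available:", module_available(name))
```

If the interpreter path does not point inside the tensorflow environment, the notebook was most likely started from a different command line.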
These links also helped me:
I’m reading through the suggested chapters of Michael Nielsen’s Neural Networks and Deep Learning book, and I’m having trouble understanding the Quadratic Cost function. At a high level I understand that we’re finding the difference between the expected output and the actual output for each given input, then squaring it to accentuate outliers. But I have a couple of questions:
Is the Quadratic Cost function computed against the output of the entire neural network, or is it computed for each layer or node?
Why do we divide by one half?
You don’t see your expected output until the last layer, so the cost function is computed for the output of the entire network. It might be helpful to work through a really simple example initially, like in this picture
For the second question, I think you mean “Why do we divide by 2?” This is just a convention because when you take the derivative, the 2 cancels out.
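A tiny numerical sketch of that cancellation, for a single output a and target y: with cost C = 0.5 * (a - y)^2, the derivative with respect to a is simply (a - y), with no stray factor of 2. The function names are made up for illustration, and the check uses a central finite difference.

```python
def quadratic_cost(a, y):
    """Quadratic cost for one output a and one target y."""
    return 0.5 * (a - y) ** 2


def quadratic_cost_grad(a, y):
    """Analytic derivative of the cost with respect to a."""
    return a - y


a, y, h = 0.8, 1.0, 1e-6
# Central finite difference as an independent check of the derivative.
numeric = (quadratic_cost(a + h, y) - quadratic_cost(a - h, y)) / (2 * h)

print(quadratic_cost_grad(a, y), numeric)
```

Without the 0.5, the derivative would be 2 * (a - y), which works just as well but carries an extra constant through every gradient formula.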
Hi guys (first post here).
So along with this lesson, to try something a bit different, I tried to build this off of Keras’s built-in VGG16 model.
I essentially took the keras built-in VGG16 model, and did finetuning:
Doing it this way, accuracy was actually noticeably worse (between ~.958 and ~.965).
Does anybody know why this method yields worse results than the one we’ve built?
One thing I noticed, looking at the source, is that they don’t have any Dropout layers. Could that be the reason for such a difference? Or is there something else I didn’t notice or did incorrectly?
For what it’s worth, I also tried a version with Dropout built on top of the above (starting from the ‘flatten’ layer) and the results were just miserable. I suspect I had another issue attempting that.
Oh man, thanks so much for the link! That actually talks about both of the things I had hit when trying the alternative. Perfect! (Though the particular reasons for the differences don’t seem to be clear, it’s good to have some validation!)