Lesson 2 discussion

I also have the same problem. I am running the following

model.fit_generator(batches, nb_epoch=1, samples_per_epoch=batches.nb_sample, validation_data=val_batches, nb_val_samples=val_batches.nb_sample)
and I get:
Epoch 1/1
23000/23000 [==============================] - 631s - loss: 8.0595 - acc: 0.4999 - val_loss: 8.0590 - val_acc: 0.5000

It seems the problem arises when using fit_generator. When I used fit, I got better results:

model.fit(trn_data,trn_classes,nb_epoch=2,batch_size=100,validation_data=(val_data,val_classes))

Train on 23000 samples, validate on 2000 samples
Epoch 1/2
23000/23000 [==============================] - 272s - loss: 0.7368 - acc: 0.9508 - val_loss: 0.3512 - val_acc: 0.9765

However, even with this method, I still get a validation loss which is far from the one Jeremy gets when using vgg.finetune and vgg.fit (in Lesson 1):

vgg.fit(batches, val_batches, nb_epoch=2)
Epoch 1/2
23000/23000 [==============================] - 309s - loss: 0.1892 - acc: 0.9695 - val_loss: 0.1737 - val_acc: 0.9760

Not sure what is different between the two cases, since they are supposed to do the same thing. However, the validation loss goes from 0.3512 to 0.1737 when using vgg.fit, and indeed I got a better score on Kaggle when using this approach. It would be great if someone has any idea.

That is very interesting, Luca, thanks for the advice. However, it didn't work in my notebook, and as an added bonus the original code in the lesson 2 notebook also stopped working. So maybe there's something wrong with my installation.

Okay, so I think I'm getting a little confused about something here. Hopefully this isn't too silly of a question. The general idea here is: we build a model using our training data to train it and our validation data to verify that our training isn't way out in "left field", so to speak. Once this is handled we still have the "test data", which in the case of the cats and dogs example is simply pictures that have numbers for names instead of cat.number.jpg. The idea being that we will then pull in this test data, run it through our model, and generate predictions on that data (is it a cat or a dog?). If any of that is wrong please correct me.

The problem I'm sort of having is that I don't really see a simple explanation of how to pull this data in. Lines such as vgg.test(parms), which really break down into self.model.predict or predict_generator, seem to be the point at which you call in your test data so the model can predict on it. However, in lessons two and three, where we are modifying our model to some degree to work on the different layers, in the end I'm still a bit confused as to the structure of all of this. I've spent some time playing with it all (mostly working with the sample sets on my local machine) and I'm not really getting results that make much sense to me.

I would assume that after we make our modifications to our layers and rerun to train and validate, the last step would be to call vgg.test, or maybe more specifically model.predict(test_data_pathā€¦). @Even already sort of alluded to this, but it seems strange that in the notebooks for lessons 2 and 3 it never really becomes clear to me that anything gets executed on the test data itself. So maybe I'm misunderstanding the point of those two lessons. If somebody could perhaps explain this more I would appreciate it. I have poked through the forum here and found some limited information, but I'm still feeling confused about it. Thank you again ahead of time for help on this.

-Chris

train data -> examples of real data we show to our algorithm so that it can learn to do what we ask it to do

validation data -> something we use to gauge what progress we are making, this gives us some sense of how what our algorithm learned on the train data will generalize to examples it has not seen

test data -> in the case of a Kaggle competition, data without labels. The training wheels are off and it's test time: our algorithm is doing the job we taught it to do, and we have no way of telling how well it does, since we do not know the labels for the data.

In other applications, you might put some labeled data aside to be your test data. The idea is to use the validation set to tune the model's design choices (such as the number of layers, type of layers, number of nodes in a layer, etc.), and to touch the test set only once, after you have finished working on your model. The results on the test set will in general give you a much better indication of how your algorithm will generalize than the results on the validation and train sets (since your model has been tuned to perform well on those, while the test set is something it has not seen before).
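If it helps to make this concrete, here is a rough sketch of carving a validation set out of a labelled training directory. The paths, class names and the 10% fraction are just placeholders for illustration, not anything prescribed by the lessons:

import glob
import os
import shutil

from numpy.random import permutation

# Hypothetical layout: train/cats/*.jpg and train/dogs/*.jpg.
# Move a random 10% of each class into valid/ to serve as validation data.
for cls in ['cats', 'dogs']:
    if not os.path.exists('valid/' + cls):
        os.makedirs('valid/' + cls)
    files = glob.glob('train/' + cls + '/*.jpg')
    n_valid = int(0.1 * len(files))
    for f in permutation(files)[:n_valid]:
        shutil.move(f, 'valid/' + cls + '/' + os.path.basename(f))

You could do the same thing a second time to set aside a labelled test set that you only look at once, at the very end.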

I finally found the problem, or better said, problems. I found that a batch_size of 1 combined with no shuffling is really bad. I also found that I had set the Trainable property of the layers to False instead of the trainable property (lowercase t). If I take care of those two issues I get my normal results again (0.975 accuracy).
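In code, the two fixes look roughly like this (the path, batch size and optimizer are just placeholders, and it assumes the course's vgg16.py is importable):

from keras.preprocessing.image import ImageDataGenerator
from vgg16 import Vgg16   # the course's vgg16.py

# Shuffled training batches with a batch size larger than 1.
gen = ImageDataGenerator()
batches = gen.flow_from_directory('data/train', target_size=(224, 224),
                                  class_mode='categorical',
                                  batch_size=64, shuffle=True)

vgg = Vgg16()
vgg.finetune(batches)
model = vgg.model

# If freezing layers by hand, use the lowercase `trainable` attribute;
# assigning to `Trainable` just creates an unused attribute and freezes nothing.
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])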

I think what you're missing is that the lessons are simply that: lessons, not complete workflows. The process of evaluating the model and saving the predictions in the Kaggle submission format is found in:
dogs_cats_redux.ipynb
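For reference, the prediction-and-submission part boils down to something like the sketch below. The test/unknown folder layout, the clip values and the finetuned `model` variable are assumptions on my part; dogs_cats_redux.ipynb is the authoritative version:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# Test images live in a single extra subfolder (e.g. test/unknown/) so that
# flow_from_directory can find them; class_mode=None means "no labels".
gen = ImageDataGenerator()
test_batches = gen.flow_from_directory('data/test', target_size=(224, 224),
                                       class_mode=None, shuffle=False,
                                       batch_size=64)
preds = model.predict_generator(test_batches, test_batches.nb_sample)

# Take the "dog" column (check batches.class_indices for the column order)
# and clip it so confident mistakes don't blow up the log loss.
is_dog = np.clip(preds[:, 1], 0.05, 0.95)
ids = [f.split('/')[-1].split('.')[0] for f in test_batches.filenames]

with open('submission.csv', 'w') as out:
    out.write('id,label\n')
    for i, p in zip(ids, is_dog):
        out.write('%s,%.5f\n' % (i, p))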

Got a nice surprise submitting my most recent version of cats vs dogs, just squeaking ahead of @jeremy on the leaderboard! Super excited to have made it into the top 10%!

I used the methods from lesson 3 (no dropout, augmented training images) along with a 5-fold cross-validation-style averaged model setup to get this score.

Looking forward to evaluating my 9-fold that I ran overnight.
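For anyone curious, the ensembling step itself is conceptually simple: train one model per fold and average their test-set predictions. A rough sketch, where `train_files` and `train_and_predict` are hypothetical stand-ins for your own file list and training routine, not my exact code:

import numpy as np
from sklearn.cross_validation import KFold  # sklearn.model_selection.KFold in newer versions

# `train_files` is the list of all labelled training images; `train_and_predict`
# trains one model on the given split and returns its predictions on the fixed test set.
fold_preds = []
for train_idx, valid_idx in KFold(len(train_files), n_folds=5, shuffle=True):
    fold_preds.append(train_and_predict(train_idx, valid_idx))

# The ensemble prediction is just the mean of the per-fold predictions.
ensemble_preds = np.mean(fold_preds, axis=0)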


I'd recommend going through the lesson 2 notebook again.

The output you're getting is because you're running vgg.test(), which, when you haven't modified the last few layers, is going to output predictions for all 1000 ImageNet categories.

I'm on lesson 2 - lesson 3 is just around the corner :slight_smile:

@Even, are you finding that k-fold validation is helping? Have you had a chance to compare the same model with and without k-fold validation?

Setting aside 10% of the data for validation, with a not-fully-trained model and some clipping, I got to around position 250.

I'm really loving what @rachel & @jeremy put together for us - the course is super fun :smiley:


It's a little hard to compare directly because the accuracy of the model affects the best clipping values, which dramatically changes your log loss, but yes, the 5-fold ensemble took me from position 135 to 78.

My best single model had an average log loss of 0.07329 with a clipping of 0.015, and my 0.06071 score submission had a clipping of 0.075. I've fooled around with it a fair bit, but @jeremy's initial supposition that the clipping should be pretty close to 1 - validation accuracy seems to hold.

To give you an idea, at a clipping of 0.015 on my 5-fold ensemble my average log loss was 0.06382, which would have dropped me 10 ranks.
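For concreteness, the clipping itself is a single numpy call; a tiny sketch with made-up probabilities:

import numpy as np

preds = np.array([0.999, 0.001, 0.62])   # raw "is dog" probabilities (made up)
clip = 0.015                              # roughly 1 - validation accuracy
clipped = np.clip(preds, clip, 1 - clip)  # -> [0.985, 0.015, 0.62]

# If the 0.999 prediction is actually a cat, the unclipped log loss term is
# -log(1 - 0.999) ~= 6.9, while the clipped one is -log(1 - 0.985) ~= 4.2,
# which is why clipping helps so much on the few confident mistakes.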

Fully agree with you about the course. It's such an amazing offering. The learning is really intuitive, and the projects are a lot of fun.


Thanks rashudo, the batch_size of 1 was indeed the problem for me as well! I also wonder whether there is an optimal value for it, or whether 8 is already an optimal choice for this problem.


I wouldn't know, but I think batch_size matters more for speed, with higher being better, as long as it still fits in GPU memory. In my tests only a batch size of 1 has negative effects, and a batch size of 2 is already enough to fix this problem.


Ensure that your layer architecture is set up properly: every layer's output serves as the input to the next layer, so the dimensions must align.
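For example, in a Keras Sequential model only the first layer needs an explicit input dimension; each later layer infers its input size from the previous layer's output. A minimal sketch, not from the lesson notebooks:

from keras.models import Sequential
from keras.layers import Dense

# Only the first layer needs an explicit input dimension; the rest is inferred.
model = Sequential([
    Dense(100, activation='relu', input_dim=784),  # input: 784, output: 100
    Dense(50, activation='relu'),                  # input inferred as 100
    Dense(2, activation='softmax'),                # input inferred as 50
])
model.compile(optimizer='adam', loss='categorical_crossentropy')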

Your batch size defines how many samples the model learns from at a time. If you use a CPU, a maximum of 4 should do, but on a GPU you should take full advantage of the parallelism and use 32 or 64. NOTE: I tend to see that an increase in batch size can make models not learn as expected, so you can increase the number of epochs to 3.

It's not because you used fit_generator, it's because you increased your number of epochs to 2 (nb_epoch=2). So it had more time for training and retraining, but make sure it's not just cramming (overfitting). Feed it new data to test this.

Hi Everyone,
I am doing the assignment of writing vgg16.py from scratch. I am not able to understand the purpose of the following code in the finetune() method, starting from line 3. Why are we getting the classes here? I ran this code and it seems to sort the classes in alphabetical order.

def finetune(self, batches):
    self.ft(batches.nb_class)
    classes = list(iter(batches.class_indices))
    for c in batches.class_indices:
        classes[batches.class_indices[c]] = c
    self.classes = classes

Hey - that's because when classes is initialized in the __init__ method, it holds the classes of the ImageNet dataset. After calling ft() we need to set the right classes according to what is in our data (i.e. in our batches).
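Concretely, batches.class_indices maps class names to column indices, and the loop inverts that mapping into a list ordered by index. Since flow_from_directory assigns indices to the subdirectory names in sorted order, the result looks alphabetical. A small illustration with the values you would get for the cats/dogs folders:

class_indices = {'cats': 0, 'dogs': 1}   # what batches.class_indices gives you

# Invert the mapping so position i holds the name of the class with index i.
classes = list(iter(class_indices))      # just the keys, in arbitrary order
for c in class_indices:
    classes[class_indices[c]] = c

print(classes)   # ['cats', 'dogs']: prediction column 0 is cats, column 1 is dogs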

I was trying to view the video @ https://www.usfca.edu/data-institute/certificates/deep-learning-part-one - but could not connect to the server. Has it been pulled off the site?

Hi,
The number of classes is already being set in the ft() method, so why is there a need to set the classes this way?

def ft(self, num):
    model = self.model
    model.pop()
    for layer in model.layers: layer.trainable = False
    model.add(Dense(num, activation='softmax'))
    self.compile()

It's also on YouTube, try looking there.