Difference between the following

What is the difference between,
val_features = conv_model.predict_generator(val_batches, val_batches.nb_sample) and

trn = get_data(path+'train'). I know this one fetches all the data and loads it into a single array in memory.
What does the first one do?

predict_generator is used to make predictions from a batch generator (for example one created with ImageDataGenerator). You can read the details at: https://keras.io/models/sequential/
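To make that concrete, here is a minimal sketch (assuming the Keras 1.x API used in the course, and that conv_model and path are already defined):

    from keras.preprocessing.image import ImageDataGenerator

    gen = ImageDataGenerator()
    # val_batches streams images from disk in small batches rather than loading them all at once
    val_batches = gen.flow_from_directory(path + 'valid', target_size=(224, 224),
                                          class_mode='categorical', shuffle=False, batch_size=64)

    # predict_generator pulls val_batches.nb_sample images from the generator, batch by batch,
    # and returns the model's output for each one
    val_features = conv_model.predict_generator(val_batches, val_batches.nb_sample)

So get_data loads every image into memory at once, while predict_generator works through the generator's batches.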


Then what is the difference between this prediction,
ll_val_feat = model.predict_generator(val_batches, val_batches.nb_sample) and
the final prediction that we are ensembling?

The prediction returned by predict_generator gives the probabilities of the 1000 classes the model was initially trained on with ImageNet. We then take these probabilities, find the maximum, and ensemble them to form our probability of dog (with probability of cat = 1 - probability of dog). We save the result and submit it to Kaggle.

How does it make sense to predict from 1000 classes? Shouldn't we be predicting strictly dogs vs cats?

@sakiran you are right. Each model in the ensemble predicts the probabilities of dog vs cat (provided you correctly modified the VGG model by popping its last layer and adding a dense layer with two categories); the ensemble takes the average of all of them and makes a submission. The benefit of an ensemble is like asking different people (models) for their opinion (probabilities) and taking the average. A sketch of that averaging step is below.
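A rough sketch of the averaging (assuming models is a list of the finetuned 2-class models, test_batches is a non-shuffled generator over the test images, and the dog column index and clipping values are illustrative):

    import numpy as np

    # Collect each model's 2-class predictions: shape (n_models, n_images, 2)
    all_preds = np.stack([m.predict_generator(test_batches, test_batches.nb_sample)
                          for m in models])
    # Average the "opinions" of the different models
    avg_preds = all_preds.mean(axis=0)
    # Take the dog column (which column is dog depends on your batches' class ordering)
    is_dog = avg_preds[:, 1]
    # Clip extreme probabilities before submitting, since Kaggle scores with log loss
    is_dog = np.clip(is_dog, 0.05, 0.95)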

Then why are we predicting for 1000 classes?
ll_val_feat = model.predict_generator(val_batches, val_batches.nb_sample)

As the notebook says, that line of code carries out the step "Finally, we can precompute the output of all but the last dropout and dense layers, for creating the first stage of the model". In the previous lines there are two model.pop() calls that removed the last two layers, so model no longer predicts 1000 categories. You use that line to compute, for every image, the activations just before the last dense layer; then you create a dense layer with two categories to go from those activations to cat or dog.
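As a rough sketch of that two-stage idea (assuming batches/val_batches are the training/validation generators, trn_labels/val_labels are their one-hot labels, and that the popped model outputs 4096 activations, as VGG's fully connected layers do):

    from keras.models import Sequential
    from keras.layers import Dense

    # Stage 1: the popped model's output is the penultimate activations,
    # not 1000-class probabilities, so precompute them for every image
    ll_feat = model.predict_generator(batches, batches.nb_sample)
    ll_val_feat = model.predict_generator(val_batches, val_batches.nb_sample)

    # Stage 2: a tiny model holding only the final dense layer maps those
    # precomputed activations to the two categories, cat and dog
    ll_model = Sequential([Dense(2, activation='softmax', input_shape=(4096,))])
    ll_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
    ll_model.fit(ll_feat, trn_labels, nb_epoch=3, batch_size=64,
                 validation_data=(ll_val_feat, val_labels))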

Just to add to Angel's answer, below is the finetune function:

    def finetune(self, batches):
        model = self.model
        # Remove the last (1000-way softmax) layer
        model.pop()
        # Make all remaining layers non-trainable, i.e. keep their ImageNet weights fixed
        for layer in model.layers: layer.trainable = False
        # Add a new Dense layer with one output per class in the batches (2 for cats vs dogs)
        model.add(Dense(batches.nb_class, activation='softmax'))
        self.compile()
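For context, in the lesson notebooks this is typically called on the Vgg16 wrapper roughly like this (a sketch, assuming vgg is a Vgg16 instance and the usual train/valid directory layout):

    vgg = Vgg16()
    batches = vgg.get_batches(path + 'train', batch_size=64)
    val_batches = vgg.get_batches(path + 'valid', batch_size=64)
    vgg.finetune(batches)                      # swaps the 1000-way layer for a 2-way one
    vgg.fit(batches, val_batches, nb_epoch=1)  # trains only the new dense layer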