Not necessarily. Each model will be specialized at detecting certain features, so when you average them you are trying to use the expertise of all the models. But there may be some models that are bad and drag down the average. Averaging the predictions is just a naive technique in ensembling; there are better methods like model stacking, rank averaging, etc.
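To make the difference concrete, here is a minimal numpy sketch with made-up probabilities (not real competition outputs) comparing plain averaging with rank averaging:

```python
import numpy as np

# Hypothetical predicted probabilities from three models for five samples.
preds = np.array([
    [0.90, 0.80, 0.30, 0.20, 0.60],   # model A
    [0.85, 0.70, 0.40, 0.10, 0.55],   # model B
    [0.20, 0.95, 0.35, 0.25, 0.50],   # model C (possibly a "bad" model)
])

# Plain averaging: every model gets equal weight, good or bad.
avg = preds.mean(axis=0)

# Rank averaging: convert each model's scores to ranks first, which makes
# the blend robust to models whose scores sit on different scales.
ranks = preds.argsort(axis=1).argsort(axis=1).astype(float)
rank_avg = ranks.mean(axis=0) / (preds.shape[1] - 1)  # rescale to [0, 1]

print(avg)
print(rank_avg)
```

Note how model C's oddly scaled score on the first sample pulls the plain average down much more than the rank average.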
VGG and ResNet combined seem to give better performance than either one individually.
Hi @shushi2000, I tried GoogLeNet; it does not seem to give as good performance as the others. We can try other stuff though: FCN, ResNet-style FCN, Inception-style FCN, etc. Someone in the Kaggle forums posted that clustering the train and test sets gave them better scores. I need to look into that as well.
I think the clustering was especially meant for using the additional data, because in there you have multiple photos of the same person, so you can make sure that a photo of the same person does not end up in both train and valid and/or test. Are you using the additional data?
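The person-aware split described above can be sketched with scikit-learn's `GroupShuffleSplit` (the person ids here are made up for illustration):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical setup: one "person id" per photo, so that all photos of
# the same person land on the same side of the split.
photos = np.arange(12)                       # stand-ins for image paths
person_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5])

splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, valid_idx = next(splitter.split(photos, groups=person_ids))

# No person appears in both train and valid.
assert not set(person_ids[train_idx]) & set(person_ids[valid_idx])
```

The same idea extends to a third test fold by splitting the remaining groups again.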
I used fc_model and trained it with slightly different dropout rates, and scored 0.969. However, I found something really odd: the best score came from an obviously not-well-trained model after 8 epochs; for example, on my validation:
Hi @rashudo, there is a significant leak between the test set and additional_train. For me, using 20% of additional_train as validation seems to correctly represent the LB.
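For reference, a minimal sketch of that 20% holdout (the `additional_train` file list and names here are illustrative, not from the competition):

```python
import numpy as np

# Illustrative: hold out a random 20% of additional_train as validation.
additional_train = ["img_%d.jpg" % i for i in range(1000)]   # stand-in paths
rng = np.random.RandomState(7)
idx = rng.permutation(len(additional_train))
n_valid = int(0.2 * len(additional_train))
valid_files = [additional_train[i] for i in idx[:n_valid]]
train_files = [additional_train[i] for i in idx[n_valid:]]
print(len(train_files), len(valid_files))   # 800 200
```

If the same person appears many times in additional_train, a group-aware split (by person) is safer than a purely random one.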
Well, this is embarrassing, but yes, the val images are included in the train set. I am fixing this now to see if I can get a better score. Thank you, @rashudo.
Hi @rteja1113, have you tried submitting those 5 sets of predictions separately? I am just wondering whether any one of them by itself can get you a better LB score than the average does.
Hi @rteja1113, for now the VGG16 with customized FC layers gave me my best score of 0.80, and I want to try ResNet next.
Could you please let me know how you recreate the ResNet model, as Jeremy did with VGG16 in the lecture?
Here’s what I have done and how I got stuck:
%matplotlib inline
from __future__ import division, print_function
import utils; reload(utils)
from utils import *
#from resnet50 import *
from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.layers import Flatten, Dense, Dropout
from keras.layers.normalization import BatchNormalization
resnet_model=ResNet50()
layers = resnet_model.layers
last_conv_idx = [index for index, layer in enumerate(layers)
if type(layer) is Convolution2D][-1]
last_conv_idx #the output is 177
conv_layers=layers[:last_conv_idx+1]
conv_model = Sequential(conv_layers) #Stuck here...
Error messages like:
TypeError Traceback (most recent call last)
<ipython-input-9-34b747f79c6f> in <module>()
21 conv_layers=layers[:last_conv_idx+1]
22
---> 23 conv_model = Sequential(conv_layers)
/home/shi/anaconda2/lib/python2.7/site-packages/keras/models.pyc in __init__(self, layers, name)
271 if layers:
272 for layer in layers:
--> 273 self.add(layer)
274
275 def add(self, layer):
/home/shi/anaconda2/lib/python2.7/site-packages/keras/models.pyc in add(self, layer)
330 output_shapes=[self.outputs[0]._keras_shape])
331 else:
--> 332 output_tensor = layer(self.outputs[0])
333 if isinstance(output_tensor, list):
334 raise TypeError('All layers in a Sequential model '
/home/shi/anaconda2/lib/python2.7/site-packages/keras/engine/topology.pyc in __call__(self, inputs, mask)
1430 if not isinstance(inputs, list):
1431 raise TypeError('Merge can only be called on a list of tensors, '
-> 1432 'not a single tensor. Received: ' + str(inputs))
1433 if self.built:
1434 raise RuntimeError('A Merge layer cannot be used more than once, '
TypeError: Merge can only be called on a list of tensors, not a single tensor. Received: if{}.0
Hi @shushi2000, I did
size=(448, 448)
res448 = Resnet50(size=size, include_top=False).model
I don’t think you can use Sequential() for ResNet because the architecture is not sequential, unlike VGG16. There are branches in ResNet, as explained in Lesson 7.
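The thread's code is Keras 1, but the same idea in a modern tensorflow.keras install looks like this toy sketch (a single hand-made residual block, not the real ResNet graph): build the branched part with the functional API, then truncate by constructing a new Model from the input to the layer you want, instead of slicing layers into Sequential():

```python
import numpy as np
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Add, Conv2D

# Toy residual block: an Add layer merging the input with a conv branch.
inp = Input(shape=(8, 8, 3))
branch = Conv2D(3, 3, padding="same", name="conv_a")(inp)
merged = Add(name="residual_add")([inp, branch])   # the branch Sequential chokes on
full = Model(inp, merged)

# To truncate at a given layer, build a new Model from the original input
# to that layer's output; no layer slicing needed.
truncated = Model(inp, full.get_layer("conv_a").output)
feats = truncated.predict(np.zeros((1, 8, 8, 3)), verbose=0)
print(feats.shape)   # (1, 8, 8, 3)
```

Applied to the real thing, that would be Model(resnet_model.input, resnet_model.layers[last_conv_idx].output) rather than Sequential(conv_layers).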
Hi @EricPB, I just did Vgg16BN(size=(448,448), include_top=False)
No particular reason why I picked 448. Someone in the forums mentioned they were using 448. Surprisingly, I was getting comparable results even when I used 128.
Unfortunately my structure might be different from yours; I use the one from @Jeremy's Statefarm Full notebook.
When I run your input, such as:
# Import our class
import vgg16bn_p3; reload(vgg16bn_p3)
from vgg16bn_p3 import Vgg16BN
# Grab VGG16 and find the last convolutional layer
vgg = Vgg16BN(size=(512,512), include_top=False)
model = vgg.model
last_conv_idx = [i for i,l in enumerate(model.layers) if type(l) is Convolution2D][-1]
conv_layers = model.layers[:last_conv_idx+1]
# Build a new model that includes everything up to that last convolutional layer
conv_model = Sequential(conv_layers)
# Predict the outputs of that model by calculating the activations of that last convolutional layer
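For that last step, here is a toy tensorflow.keras sketch (tiny made-up layer sizes, not the real VGG) of slicing a purely sequential model at its last conv layer and precomputing the activations with predict; this slicing works precisely because a VGG-style model has no branches:

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input

# Toy sequential backbone standing in for VGG.
model = Sequential([
    Input(shape=(8, 8, 3)),
    Conv2D(4, 3, padding="same", name="conv1"),
    Conv2D(4, 3, padding="same", name="conv2"),
    Flatten(),
    Dense(2),
])

# Same pattern as the thread: find the last conv layer and slice up to it.
last_conv_idx = [i for i, l in enumerate(model.layers)
                 if isinstance(l, Conv2D)][-1]
conv_model = Sequential(model.layers[:last_conv_idx + 1])

# Precompute the conv-layer activations for a batch of (fake) images.
feats = conv_model.predict(np.zeros((2, 8, 8, 3)), verbose=0)
print(feats.shape)
```

In the notebooks these precomputed features are then saved and used to train the fully connected head on its own, which is much faster than running the conv layers every epoch.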