The model now has a LOT of over-fitting, what should I do next?

I am working on the cervical cancer competition and built a Sequential model in Keras, modelled on statefarm-sample.ipynb. After playing with the architecture and some parameter settings, the model now over-fits heavily.

The model looks like this:

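# Keras 1.x API with channels-first (Theano) image ordering
from keras.models import Sequential
from keras.layers import BatchNormalization, Convolution2D, MaxPooling2D, Flatten, Dense

mdl = Sequential([
    BatchNormalization(axis=1, input_shape=(3, 224, 224)),  # normalize the raw pixels per channel
    Convolution2D(32, 3, 3, activation='relu'),  # 32 filters, 3x3 kernels
    MaxPooling2D((3, 3)),
    BatchNormalization(axis=1),
    Convolution2D(64, 3, 3, activation='relu'),  # 64 filters, 3x3 kernels
    MaxPooling2D((3, 3)),
    BatchNormalization(axis=1),
    Flatten(),
    Dense(256, activation='relu'),
    BatchNormalization(),
    Dense(128, activation='relu'),
    BatchNormalization(),
    Dense(3, activation='softmax'),  # one probability per class
])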




Epoch 1/10
1185/1185 [==============================] - 404s - loss: 1.4565 - acc: 0.4667 - val_loss: 2.8802 - val_acc: 0.5534
Epoch 2/10
1185/1185 [==============================] - 310s - loss: 0.8629 - acc: 0.6160 - val_loss: 1.2412 - val_acc: 0.5221
Epoch 3/10
1185/1185 [==============================] - 310s - loss: 0.6719 - acc: 0.7418 - val_loss: 0.9542 - val_acc: 0.5187
Epoch 4/10
1185/1185 [==============================] - 308s - loss: 0.4703 - acc: 0.8532 - val_loss: 0.9490 - val_acc: 0.5003
Epoch 5/10
1185/1185 [==============================] - 311s - loss: 0.3258 - acc: 0.9063 - val_loss: 0.8843 - val_acc: 0.5874
Epoch 6/10
1185/1185 [==============================] - 311s - loss: 0.2230 - acc: 0.9485 - val_loss: 0.9492 - val_acc: 0.5133
Epoch 7/10
1185/1185 [==============================] - 311s - loss: 0.1555 - acc: 0.9586 - val_loss: 0.9892 - val_acc: 0.4568
Epoch 8/10
1185/1185 [==============================] - 312s - loss: 0.1106 - acc: 0.9772 - val_loss: 1.0158 - val_acc: 0.5126
Epoch 9/10
1185/1185 [==============================] - 310s - loss: 0.0565 - acc: 0.9932 - val_loss: 0.8905 - val_acc: 0.5513
Epoch 10/10
1185/1185 [==============================] - 313s - loss: 0.0568 - acc: 0.9873 - val_loss: 0.8879 - val_acc: 0.5894
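To put a number on the over-fitting, here is a small sketch that uses only the loss values from the log above. The variable names are my own, and "gap" is just validation loss minus training loss, not an official metric.

```python
# Loss values copied from the 10-epoch log above.
train_loss = [1.4565, 0.8629, 0.6719, 0.4703, 0.3258, 0.2230, 0.1555, 0.1106, 0.0565, 0.0568]
val_loss = [2.8802, 1.2412, 0.9542, 0.9490, 0.8843, 0.9492, 0.9892, 1.0158, 0.8905, 0.8879]

# Validation loss minus training loss per epoch; a large gap means the
# model fits the training set much better than the validation set.
gaps = [v - t for t, v in zip(train_loss, val_loss)]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1

for epoch, (t, v, g) in enumerate(zip(train_loss, val_loss, gaps), start=1):
    print("epoch %2d  train %.4f  val %.4f  gap %.4f" % (epoch, t, v, g))
print("lowest val_loss at epoch", best_epoch)
```

The validation loss is lowest at epoch 5 (0.8843), while the training loss keeps dropping all the way to epoch 9.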

I think this is not bad, since Jeremy said we should always start with over-fitting. However, I am not sure what to do next: should I start reducing the over-fitting now? I can add regularization to see if it helps. But if I do that, will the best validation accuracy stay around 0.6?

Or should I play with the model more to increase the val_acc and deal with over-fitting later? If yes, what tricks can I try to increase val_acc? I guess data augmentation is worth trying, but is there anything more than that?

Thank you!

Also, I am confused about how to make predictions. The model has these related methods:

'predict',
'predict_classes',
'predict_generator',
'predict_on_batch',
'predict_proba',

But I had a really difficult time figuring out which one to use and how to use it correctly. The Keras documentation is not very clear (to me…) so I think I should ask here.

I have all the test images in the path “data/test/unknown”. In the dogs-cats-redux.ipynb, this line is used to make predictions:

batches, preds = vgg.test(test_path, batch_size = batch_size*2)

But my Sequential model does not have this ‘test’ method…

http://wiki.fast.ai/index.php/Over-fitting
I think the over-fitting is due to there being only a few hundred examples of each kind, unlike state-farm.
Of the 6 things mentioned here you could try:

  1. More data: use the auxiliary dataset provided with the competition.
  2. Data augmentation works well: shift the pictures provided here around, change their lighting, angle, etc.
  3. BatchNorm is already in place.
  4. Use architectures that generalise well… I think MaxPooling is taking care of that.
    You might not want to change the architecture too much right now, because you found one complex enough to model the data.
  5. Regularise: maybe add dropout.
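A rough sketch of what idea 2 means in practice, written in plain NumPy so it is easy to read. In Keras you would more likely use ImageDataGenerator (for example its width_shift_range, height_shift_range and horizontal_flip arguments). The function name, the shift size and the brightness range below are assumptions of mine, not tuned values.

```python
import numpy as np

def augment(img, rng, max_shift=20, max_brightness=0.2):
    """Return a randomly shifted, flipped and re-lit copy of a
    (channels, height, width) image with values in [0, 1]."""
    dy, dx = (int(s) for s in rng.randint(-max_shift, max_shift + 1, size=2))
    # np.roll wraps pixels around the border; a real pipeline would
    # usually pad or fill the exposed edge instead.
    out = np.roll(img, (dy, dx), axis=(1, 2))
    if rng.rand() < 0.5:
        out = out[:, :, ::-1]  # mirror left-right
    # Scale brightness by a random factor and keep values in range.
    factor = 1.0 + rng.uniform(-max_brightness, max_brightness)
    return np.clip(out * factor, 0.0, 1.0)

rng = np.random.RandomState(0)  # fixed seed so the sketch is repeatable
img = rng.rand(3, 224, 224).astype("float32")  # stand-in for one training image
augmented = [augment(img, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

Each call gives the network a slightly different view of the same picture, which is why it acts a bit like having more data.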

Hope that helped!


You can run ??vgg.test in a cell in your Jupyter notebook, read the definition, and mimic it appropriately for your needs. As far as which ‘predict’ to use, I’d use trial and error and plenty of ?? for reading code to learn as you go. Good luck to you!

Thanks @mattobrien415. I looked into vgg.test and found that model.predict_generator is the one to use. Just created my very first submission! I appreciate it!
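For anyone who finds this later, here is roughly how I understand the output. With a softmax model, predict_generator gives one row of class probabilities per image, and predict_classes is just the index of the largest value in each row. The sketch below fakes the prediction array so it runs without Keras; the numbers, file names and CSV header are all made up, so check the competition page for the real submission format.

```python
import csv
import io

# Hypothetical output of predict_generator for three test images:
# one row of softmax probabilities per image.
preds = [
    [0.10, 0.70, 0.20],
    [0.55, 0.30, 0.15],
    [0.05, 0.15, 0.80],
]
# Hypothetical file names, in the order the generator yielded them
# (this only lines up if the generator does not shuffle).
filenames = ["unknown/0.jpg", "unknown/1.jpg", "unknown/2.jpg"]

# What predict_classes would report: the argmax of each row.
classes = [row.index(max(row)) for row in preds]

# One CSV row per image; the header names are placeholders.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["image_name", "class_0", "class_1", "class_2"])
for name, row in zip(filenames, preds):
    writer.writerow([name.split("/")[-1]] + ["%.4f" % p for p in row])

print(classes)
print(buf.getvalue())
```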