I’m currently working on the statefarm problem as part of the lesson 3 assignment. I don’t want to look at the kernels on Kaggle, as they would give too much away.
I started with the standard VGG model, removed the last dense layer, and replaced it with a new softmax layer.
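For reference, this is roughly what that change looks like. It’s a minimal sketch using `keras.applications.VGG16` rather than the course’s Vgg16 wrapper, so the exact layer indices and API details are assumptions (and vary a bit between Keras versions), but the idea is the same: take the penultimate layer’s output, attach a fresh 10-way softmax, and freeze everything else.

```python
from keras.applications import VGG16
from keras.models import Model
from keras.layers import Dense
from keras.optimizers import Adam

# Full VGG16 with its ImageNet classifier head.
base = VGG16(weights='imagenet', include_top=True)

# Drop the final 1000-way dense layer and attach a new 10-way softmax
# (one class per statefarm driver-distraction category).
x = base.layers[-2].output
preds = Dense(10, activation='softmax', name='statefarm_softmax')(x)
model = Model(inputs=base.input, outputs=preds)

# Train only the new layer to begin with.
for layer in model.layers[:-1]:
    layer.trainable = False

model.compile(optimizer=Adam(lr=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Training this gave: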
`20224/20224 - 1007s - loss: 11.9468 - acc: 0.2283 - val_loss: 11.2789 - val_acc: 0.2923`
My diagnosis [1]:
- We are under-fitting, since the val_acc is higher than the training acc
- The model probably doesn’t have enough features to learn the much more complicated statefarm images
Working on my assumption of under-fitting, I tried adding multiple additional dense layers to see if the validation accuracy would improve, but this didn’t help. I then also tried setting all of the dense layers to trainable (rather than just the last one) with a learning rate of 0.01.
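In code, the two changes were roughly along these lines. Again this is a sketch with assumed names: `base` is the VGG model from the earlier snippet, and the 4096-unit size of the extra layer is just a placeholder.

```python
from keras.layers import Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam

# Extra fully-connected layers spliced in before the new softmax.
x = base.layers[-2].output
x = Dense(4096, activation='relu')(x)   # placeholder size for the extra layer
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)
model = Model(inputs=base.input, outputs=preds)

# Make every dense layer trainable (conv layers are explicitly frozen),
# and use a more aggressive learning rate.
for layer in model.layers:
    layer.trainable = isinstance(layer, Dense)

model.compile(optimizer=Adam(lr=0.01),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```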
This did not improve things either:
`20224/20224 - loss: 11.6032 - acc: 0.2726 - val_loss: 11.3431 - val_acc: 0.2932`
In addition:
- I had a hypothesis that maybe the image variation between different drivers / setups was too high, so I tried both sample-wise and feature-wise normalisation (via Keras; see the sketch after this list), which made results even worse.
- I tried various learning rates, in case the changes were not dramatic enough, but very high learning rates seemed to make the problem worse.
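For the normalisation experiments I used Keras’s `ImageDataGenerator`, roughly as sketched below (the directory name and the `train_sample` array are placeholders). Sample-wise normalisation works per image, while feature-wise normalisation needs statistics computed from the training data, so that generator has to be fitted first.

```python
from keras.preprocessing.image import ImageDataGenerator

# Sample-wise: each image is centred and scaled by its own mean / std.
samplewise_gen = ImageDataGenerator(samplewise_center=True,
                                    samplewise_std_normalization=True)

# Feature-wise: mean / std are computed over the training set, so the
# generator must be fitted on an array of training images first.
featurewise_gen = ImageDataGenerator(featurewise_center=True,
                                     featurewise_std_normalization=True)
featurewise_gen.fit(train_sample)  # train_sample: numpy array of images (placeholder)

batches = samplewise_gen.flow_from_directory('train',  # placeholder path
                                              target_size=(224, 224),
                                              batch_size=64,
                                              class_mode='categorical')
```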
My questions, then, are:
a) Is my intuition about what is going wrong [1] broadly right?
b) I don’t have enough experience to have a feeling for where I should be looking to debug something like this, and I’ve sort of become stuck in a rut. I’ve attempted to make dramatic changes to the network without much success. Should I start again with a simple linear classifier to set expectations? Look at image pre-processing? Try to make the model more complex, or simpler?