My submission score (currently 1.15881) was also much worse than my val_loss of 0.1609. So far, I've only fine-tuned Keras' VGG16 model with the Nadam optimizer and a sufficiently low learning rate. I haven't yet applied dropout, ensembling, or pseudo-labels. Nor have I handled the class imbalance issue. Data augmentation, even with tiny increments, did not improve val_loss for me. A few other things I noticed:
- I tried Keras' ResNet50 again on this Kaggle competition but could not tame it enough to converge to a respectable validation score, despite trying lower and lower learning rates and other optimizers. VGG16 seems to give respectable results relatively quickly. I haven't tried other architectures like Inception, however.
- Despite training with various low learning rates on pre-calculated inputs into VGG16's fully connected layers, I actually got better results training the entire model (while still freezing the base convolutional layers). Does anyone know why that could happen?
- Note: My software configuration includes CUDA 8.0 with CuDNN 5.1; I created my training / validation split with Scikit Learn's
Since this competition has images that are significantly higher resolution than the 224x224 inputs expected by the pre-trained ImageNet architectures we're familiar with (e.g., VGG16, ResNet), I can't help but wonder whether we should prepend a convolutional layer to our model that accepts a large image (e.g., 2048x2048) and outputs 224x224 images to VGG16. Is that a worthwhile approach? I can't find a definitive answer on how pre-trained ImageNet architectures can be used with higher-resolution images.
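
To make the question a bit more concrete, here is a rough sketch of the arithmetic only, not a claim that the approach works. It lists which kernel size / stride pairs would let a single strided convolution (no padding) turn a 2048x2048 image into exactly 224x224. The helper names `conv_output_size` and `find_downsampling_configs` are made up for illustration, and the 64-pixel kernel cap is an arbitrary assumption.

```python
def conv_output_size(input_size, kernel_size, stride, padding="valid"):
    """Output length along one spatial axis of a strided convolution."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Zero padding so that output = ceil(input / stride).
        return -(-input_size // stride)
    raise ValueError("padding must be 'valid' or 'same'")


def find_downsampling_configs(input_size, target_size, max_kernel=64):
    """Hypothetical helper: (kernel_size, stride) pairs mapping input_size
    to exactly target_size with 'valid' padding.

    Only kernels at least as large as the stride are considered, so that
    no input pixels are skipped between neighbouring windows.
    """
    configs = []
    for stride in range(1, input_size // target_size + 2):
        for kernel in range(stride, max_kernel + 1):
            if conv_output_size(input_size, kernel, stride) == target_size:
                configs.append((kernel, stride))
    return configs


if __name__ == "__main__":
    for kernel, stride in find_downsampling_configs(2048, 224):
        print(f"kernel={kernel}, stride={stride} -> 224x224")
```

In Keras terms, any one of these pairs would correspond to something like a `Conv2D` layer with `filters=3`, that `kernel_size` and `strides`, and `padding='valid'` placed in front of VGG16, so the pre-trained network still receives a 224x224x3 tensor. Whether a learned downsampling layer like this beats simply resizing the images is exactly what I don't know.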