Recommendations for dealing with underfitting in high-resolution image classification

Hi All,

I greatly enjoyed the course, but when I tried to apply the methods to my own data, I simply couldn’t ‘break the glass ceiling’. My problem is binary classification on a balanced collection of 17,000 images (8,500 images in each class). The images are large (1024x1024 pixels).

I decided to try the ResNet152 architecture and started with small (rescaled) images of 256 and then 512 pixels. In both cases I achieved nearly 98% accuracy. However, when I switched to the 1024 resolution, I couldn’t get above 92%, regardless of what I tried.

First, I trained the head only. Then I tried both gradual unfreezing and complete unfreezing, but I still couldn’t get above 93%.
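To make clear what "gradual unfreezing" means here, below is a minimal framework-free sketch. The layer-group names are hypothetical placeholders, not the actual groups of my ResNet152 learner. The idea is only that the head is trained first and earlier groups are made trainable one stage at a time.

```python
def gradual_unfreeze_stages(layer_groups):
    """Return, for each training stage, the list of trainable layer groups.

    Stage 0 trains only the last group (the head); each later stage
    additionally unfreezes the next group closer to the input.
    """
    n = len(layer_groups)
    return [layer_groups[n - k:] for k in range(1, n + 1)]


# Hypothetical layer groups, ordered from input to output.
groups = ["body_early", "body_late", "head"]
stages = gradual_unfreeze_stages(groups)

for i, trainable in enumerate(stages):
    frozen = [g for g in groups if g not in trainable]
    print(f"stage {i}: train {trainable}, frozen {frozen}")
```

Complete unfreezing corresponds to skipping straight to the last stage, where every group is trainable.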

Training and validation losses are of the same order, and I can’t even overfit. I tried up to 100 epochs with various (discriminative) learning rates, but nothing helps.
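By discriminative learning rates I mean giving earlier layer groups smaller learning rates than the head. A rough sketch of how such rates can be spread across groups follows. The geometric spacing, the function name, and the example values (1e-5 to 1e-3 over three groups) are assumptions for illustration, not the exact settings I used.

```python
def discriminative_lrs(lr_low, lr_high, n_groups):
    """Spread learning rates geometrically across layer groups.

    The earliest group gets lr_low and the last group (the head) gets lr_high.
    """
    if n_groups == 1:
        return [lr_high]
    ratio = (lr_high / lr_low) ** (1 / (n_groups - 1))
    return [lr_low * ratio ** i for i in range(n_groups)]


# Illustrative values only: three layer groups, rates from 1e-5 up to 1e-3.
lrs = discriminative_lrs(1e-5, 1e-3, 3)
for i, lr in enumerate(lrs):
    print(f"layer group {i}: lr = {lr:.1e}")
```

With these example values the middle group ends up at roughly 1e-4, i.e. each group trains about ten times faster than the one before it.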

Here is an example of trying to overfit the final classification layers:

A fully unfrozen network trained for 50 epochs yields very similar results: train/valid losses stay within the range 0.37-0.55. I also tried removing data augmentation and setting dropout to zero, with no significant change.
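To put those loss values in context, here is a small calculation. It assumes the reported loss is the mean cross-entropy with natural logarithm (I believe that is the default, but I have not verified it for my setup). Under that assumption, exp(-loss) is the geometric-mean probability the model assigns to the correct class, and a balanced binary problem has a chance-level loss of ln 2.

```python
import math


def mean_true_class_prob(ce_loss):
    """Geometric-mean probability assigned to the correct class,
    given a mean cross-entropy loss computed with natural log."""
    return math.exp(-ce_loss)


# Loss of a model that always predicts 0.5 on a balanced binary problem.
chance_loss = math.log(2)

for loss in (0.55, 0.37):
    p = mean_true_class_prob(loss)
    print(f"loss {loss:.2f} -> geometric-mean p(correct) = {p:.2f}")
print(f"chance-level loss = {chance_loss:.3f}")
```

If the assumption holds, losses of 0.55 and 0.37 correspond to roughly 0.58 and 0.69, so the network is still far from confident even on the training set.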

Any ideas for a beginner like me? Thanks!