Accuracy and input image size

Hi, while practicing a lesson I trained a cnn_learner to predict whether a car was one of (sports, suv, minivan), and I got very odd variation in error_rate with input size:

  • 20% with an input image size of 128 (I even tried unfreezing and using slice(1e-7, 1e-4) for fit()),
  • 6% with an input image size of 224 (only fine-tuned), and
  • 15% with an input image size of 512 (after unfreeze and fit(5, slice(1e-7, 1e-4))).

Can anyone explain why that is? I assumed it might have to do with the original training input dimensions of the resnet18 whose pretrained parameters I was using, but I've found posts stating that it does not matter what dimensions it was trained on!
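For context on the fit(5, slice(1e-7, 1e-4)) call above: after unfreezing, the slice gives each layer group its own learning rate, from the start value for the earliest layers to the end value for the head. A small sketch of that spread, assuming the spacing is geometric ("multiplicatively even", as fastai v1's even_mults does, if I recall correctly; check Learner.lr_range for the real behavior):

```python
import numpy as np

def discriminative_lrs(lr_slice, n_groups=3):
    # Spread the slice geometrically from the earliest layer group (start)
    # to the head (stop). This mirrors my understanding of fastai v1's
    # even_mults; it is a sketch, not fastai's actual code.
    start, stop = lr_slice
    return np.geomspace(start, stop, n_groups)

print(discriminative_lrs((1e-7, 1e-4)))
# earliest layers get ~1e-7, the head gets 1e-4, the middle group sits between
```

So with slice(1e-7, 1e-4) the body of the resnet barely moves while the head trains much faster.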

Hi Atif,
I don’t think we can definitively explain it from this data alone. One hypothesis is that you go from underfitting (size 128), through a reasonable fit (224), to overfitting (size 512), but many other factors could be at play. You should look at both training and validation error and loss to verify this. It will probably also help to look at confusion matrices, most_confused images, etc., to understand where your models are making errors.
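One way to make that hypothesis concrete: record the final training and validation losses for each input size and compare them. A toy heuristic sketch (all numbers below are illustrative, not your actual losses; in fastai v1 you can read the real ones off learn.recorder):

```python
# Illustrative final losses per input size (made-up numbers for the sketch).
losses = {
    128: {"train": 0.85, "valid": 0.90},  # both high, small gap
    224: {"train": 0.20, "valid": 0.25},  # both low, small gap
    512: {"train": 0.05, "valid": 0.55},  # low train loss, big gap
}

def diagnose(train_loss, valid_loss, high=0.5, gap=0.2):
    """Crude rule of thumb: a big train/valid gap suggests overfitting;
    a high train loss with a small gap suggests underfitting."""
    if valid_loss - train_loss > gap:
        return "overfitting"
    if train_loss > high:
        return "underfitting"
    return "reasonable fit"

for size, l in sorted(losses.items()):
    print(size, diagnose(l["train"], l["valid"]))
```

For the confusion-matrix side, fastai v1's ClassificationInterpretation.from_learner(learn) gives you plot_confusion_matrix() and most_confused().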
Good luck! Darek